AlisaGU closed this issue 8 months ago
Hi,
Thank you for your interest in the software.
I'm not completely sure what you're hoping to do. Are you hoping to quantify libraries at the locus rather than subfamily level? Or are you hoping to aggregate by class_id instead?
If you don't want to do a case-control comparison with TEtranscripts, you can just quantify each library independently with TEcount (part of TEtranscripts), and then combine/compare developmental stages as you wish. This is the default mode in TElocal (i.e. it does not do differential analysis, it just quantifies).
TEtranscripts aggregates using the gene_id name, but TElocal does not aggregate and uses the transcript_id as the distinguishing annotation. Thus, each transcript_id name needs to be unique (at least in the current version). So merging different entries into the same gene_id is possible, but it would not have any effect in TElocal (and will break it, since the transcript_id is no longer unique).
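To check whether a GTF still satisfies the uniqueness requirement described above, something like the following minimal sketch could be used. It assumes the usual GTF attribute syntax (`transcript_id "..."`); the function name and the example lines are hypothetical and are not part of TElocal.

```python
import re
from collections import Counter

# Matches transcript_id "value" in the GTF attribute column (assumed syntax).
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def duplicated_transcript_ids(gtf_lines):
    """Return a sorted list of transcript_id values that occur more than once."""
    counts = Counter()
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        match = TRANSCRIPT_ID.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(tid for tid, n in counts.items() if n > 1)


# Hypothetical entries, only to illustrate the check.
example = [
    'chr1\tdenovo\texon\t100\t200\t.\t+\t.\tgene_id "TE_A"; transcript_id "TE_A_1"; family_id "famA"; class_id "LTR";',
    'chr1\tdenovo\texon\t300\t400\t.\t+\t.\tgene_id "TE_A"; transcript_id "TE_A_1"; family_id "famA"; class_id "LTR";',
    'chr2\tdenovo\texon\t500\t600\t.\t-\t.\tgene_id "TE_B"; transcript_id "TE_B_1"; family_id "famB"; class_id "DNA";',
]
print(duplicated_transcript_ids(example))
```

An empty list means every transcript_id is unique; any name that is printed would need to be made unique before running TElocal.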
I'd be happy to discuss further to know exactly what you want, but it's not clear if your proposed GTF modification would generate your desired effect.
Thanks.
Sorry for the confusing description. I want to quantify TE expression at both the family level and the individual locus level. The TElocal run (locus level) is under preparation, and following your suggestion TEcount seems to be my next step.
Is it OK to take the sum of the expression counts of all TE loci belonging to one family as the total expression count of that family? If so, there is no need to run TEcount.
Ah, I see what you mean now.
To address your latest question: yes, theoretically you can sum up the total expression of a family (though this information appears to be in the class_id section) and get the same result (perhaps with slight variation due to EM) as if you ran TEcount aggregating at the class_id level.
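The summation described above could be sketched roughly as follows. This assumes each TE identifier in the TElocal count table is colon-separated as `transcript_id:gene_id:family_id:class_id`; please check that against your own output, since that layout is an assumption here. The function name and the example identifiers are hypothetical.

```python
from collections import defaultdict


def sum_by_level(rows, level_index):
    """Sum per-locus counts by one field of the TE identifier.

    rows: iterable of (identifier, count) pairs.
    level_index: position of the grouping field after splitting on ':'
                 (assumed layout: 0 locus, 1 gene_id, 2 family_id, 3 class_id).
    """
    totals = defaultdict(float)
    for identifier, count in rows:
        fields = identifier.split(":")
        # Assumption: only TE rows have exactly four colon-separated fields;
        # anything else (e.g. gene rows) is skipped.
        if len(fields) != 4:
            continue
        totals[fields[level_index]] += count
    return dict(totals)


# Hypothetical per-locus counts, only to illustrate the aggregation.
rows = [
    ("TE_A_dup1:TE_A:famA:LTR", 10),
    ("TE_A_dup2:TE_A:famA:LTR", 5),
    ("TE_B_dup1:TE_B:famB:DNA", 3),
    ("GeneX", 42),
]
class_totals = sum_by_level(rows, level_index=3)
print(class_totals)
```

Because multi-mapped reads are distributed by EM, totals obtained this way may differ slightly from what TEcount reports.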
If you actually want to modify your GTF, you would need to transfer the class_id value to the gene_id value, while keeping everything else the same. However, as you pointed out, you probably don't need to do so if you're already running TElocal.
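If the GTF modification is still wanted, copying the class_id value into gene_id could look something like this minimal sketch. It assumes the attribute column uses the `key "value";` syntax with both gene_id and class_id present; the function name and the example line are hypothetical.

```python
import re

# Assumed GTF attribute syntax: key "value";
CLASS_ID = re.compile(r'class_id "([^"]*)"')
GENE_ID = re.compile(r'gene_id "[^"]*"')


def class_to_gene_id(line):
    """Return the GTF line with its gene_id value replaced by its class_id value.

    Everything else, including transcript_id, is left untouched.
    Comment lines and lines without a class_id are returned unchanged.
    """
    match = CLASS_ID.search(line)
    if line.startswith("#") or match is None:
        return line
    new_attribute = 'gene_id "{}"'.format(match.group(1))
    # A function is used as the replacement so the value is inserted literally.
    return GENE_ID.sub(lambda _m: new_attribute, line, count=1)


# Hypothetical entry, only to illustrate the rewrite.
original = (
    'chr1\tdenovo\texon\t100\t200\t.\t+\t.\t'
    'gene_id "TE_A"; transcript_id "TE_A_dup1"; family_id "famA"; class_id "LTR";'
)
modified = class_to_gene_id(original)
print(modified)
```

Since transcript_id is not changed, the uniqueness that TElocal relies on is preserved.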
Thanks.
Thanks!
Hi, TEtranscripts can quantify TE expression at the family level using case and control groups. However, my samples are divided into several embryonic stages, so they are not suitable for TEtranscripts.
Can I modify my TE GTF file (changing the TE annotation from the locus level to the class level) to make this work? For example, the original GTF looks like this:
The modified version looks like this (merging the 3rd and 4th entries into one gene_id "Denovo_TE00000003"):