Open abcyulongwang opened 1 week ago
Hi Yulong,
Thanks for your question. But I am a little bit confused. What is the difference between those provided TE-Aid plots?
Could you explain your question again?
Yours sincerely Jiangzhao
The two TE-Aid plots above are for the same TE sequence. The difference is that the lower one is for the "TE.cons.fa" sequence, i.e. after taking the consensus sequence. The lower plot indicates that this cons.fa does not have any full-length hits, suggesting that the transposon structure is likely incomplete. Because some bases are poorly conserved in the multiple sequence alignment, they were replaced by "N" when the consensus sequence was generated.
My question may seem foolish, but the reality is that only a small number of TEs have reached the Perfect and Good levels. Most of the TE clusters are not well conserved, and many of the cons.fa files contain a lot of Ns. Could this affect my research on transposable element polymorphisms?
Should I just delete these bad results, even though it will reduce the number of transposon predictions? Best yulong
Dear Yulong,
The "TEAid" button in TEtrimmerGUI can be applied to a multiple sequence alignment (MSA) or to a TE consensus sequence.
If it is applied to an MSA, a consensus sequence is generated with a threshold of 0.5. This corresponds to your plot:
When you click the "Cons" button, a consensus sequence is generated with the default threshold of 0.8. You can also apply the "TEAid" button to this consensus sequence; the corresponding plot is:
Because the thresholds used for consensus sequence generation differ, the "TEAid" plots also differ, especially for poorly conserved TEs (like your LINE element).
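To illustrate why the threshold matters, here is a minimal Python sketch. It is not TEtrimmer's actual code: the `consensus` function, the rule it uses (keep the most common base in a column only if its share of the non-gap characters reaches the threshold, otherwise write "N"), and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter


def consensus(msa, threshold):
    """Hypothetical per-column consensus (an assumed rule, not TEtrimmer's).

    The most common base in a column is kept if its share of the non-gap
    characters is >= threshold; otherwise the column becomes 'N'.
    Columns that contain only gaps are skipped.
    """
    out = []
    for column in zip(*msa):
        bases = [b.upper() for b in column if b != "-"]
        if not bases:
            continue  # all-gap column contributes nothing
        base, count = Counter(bases).most_common(1)[0]
        out.append(base if count / len(bases) >= threshold else "N")
    return "".join(out)


# Made-up toy alignment with a few poorly conserved columns.
msa = [
    "ACGTAC",
    "ACGAAC",
    "ACTAAC",
    "ACGCTC",
    "AC-GTC",
]

relaxed = consensus(msa, 0.5)  # threshold used when "TEAid" is run on an MSA
strict = consensus(msa, 0.8)   # default threshold of the "Cons" button

print(relaxed, relaxed.count("N"))  # ACGNAC 1
print(strict, strict.count("N"))    # ACNNNC 3
```

In this toy case the same alignment gives one N at 0.5 and three at 0.8, which is the kind of difference that can turn a full-length-looking element into a consensus with few or no BLAST hits.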
You can modify the "Cons" threshold by:
Based on your AliView screenshot, I wouldn't discard this LINE element. You can lower the "Cons" threshold and save your MSA as an HMM model.
Yours sincerely Jiangzhao
Dear Jiangzhao
I encountered this problem during manual management.
This is the TE-Aid plot of a modified multiple sequence alignment of a LINE transposon. It looks OK. But when I run the "Cons" step of TEtrimmer, it becomes like this, without even a complete sequence.
There are many similar examples. For some transposon consensus sequences TE-Aid cannot even run successfully, and it displays "BLAST hit number is 0 for this sequence." I guess this means that we can completely abandon these TE sequences because they are of low quality.
I also have an idea. After running TEtrimmer, I want to compare the quality of the TE library with and without manual editing. Do you have any recommended methods for quality comparison? Manual management is too time-consuming and it will be a nightmare for me.
Thank you for your previous reply. I wish you happiness every day.
Yulong