parklab / xTea

Comprehensive TE insertion identification with WGS/WES data from multiple sequencing technologies

Does xtea only detect some subfamilies of TEs in hg38/19? #125

Closed sidi-yang closed 3 months ago

sidi-yang commented 3 months ago

Hi Simon,

I noticed that the rep_lib_annotation/Alu/hg19 folder contains a series of files named like "hg19_AluJabc_copies_with_flank", but when I opened one I found that it actually contains AluY subfamily sequences, not AluJ. The same is true for LINE. Why is the reference set up like this, and what is this .fa file used for?

Thank you

simoncchu commented 3 months ago

Copies from the different subfamilies share high sequence homology, so leaving some out does not cause alignment differences. Including more copies does slow down the run, so I kept only the necessary ones.
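For anyone who wants to check which subfamilies a given copies file actually contains, here is a minimal Python sketch that tallies subfamily names found in FASTA header lines. It is not part of xTea. The function name `count_subfamilies` and the regular expression are hypothetical, and the sketch assumes that each header line contains a subfamily token such as "AluY" or "L1HS", which may not match the real header layout of these files.

```python
import re
from collections import Counter
from typing import Iterable

# Hypothetical pattern: matches tokens such as "AluY", "AluJb", "AluSx1" or "L1HS".
# Adjust it to whatever the headers in your file really look like.
SUBFAMILY_RE = re.compile(r"(Alu[A-Z][A-Za-z0-9]*|L1[A-Za-z0-9]+)")


def count_subfamilies(lines: Iterable[str]) -> Counter:
    """Count subfamily names found in FASTA header lines (lines starting with '>')."""
    counts: Counter = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        match = SUBFAMILY_RE.search(line)
        # Headers without a recognizable token are grouped under "unknown".
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

To use it, open the .fa file and pass the file handle, for example `count_subfamilies(open(path))`, then print the result of `.most_common()` to see which subfamilies dominate.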

BTW, I recommend asking technical questions on GitHub so that people who run into a similar issue can refer to them, and I am happy to answer them there. But if it is just a matter of personal interest, you can send me an email instead.