oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

the discrepancy of blast records between EDTA TE lib and RepeatMasker TE lib #305

Closed leon945945 closed 1 year ago

leon945945 commented 1 year ago

Hi Shujun, my goal is to detect TE insertions in my resequencing data. I extracted the 'unmapped mate' sequences of 299,582 singletons (one mate mapped, the other unmapped) and ran BLAST against a TE library. The EDTA TE library gave 144 BLAST records, fewer than the 293 records from the RepeatMasker TE library. I would prefer to use EDTA because of its useful intact TE annotation, but the lower number of BLAST records makes me worry about the sensitivity of TE insertion detection. As far as I know, applying EDTA and RepeatMasker to the same genome should generate highly similar TE libraries, so I am confused by the discrepancy. Why does it exist, and would the fewer BLAST records from the EDTA library reduce the sensitivity of TE insertion detection? Thank you. Here are the BLAST results: EDTA.txt (https://github.com/oushujun/EDTA/files/9750693/EDTA.txt), RepeatMasker.txt (https://github.com/oushujun/EDTA/files/9750695/RepeatMasker.txt)
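
For readers following along, the read selection described above ("one mate mapped, the other unmapped") can be expressed with the standard SAM flag bits. This is only a sketch, not the poster's actual pipeline: it assumes the alignments are available as SAM text lines, and the function names `is_unmapped_mate_of_singleton` and `unmapped_mates` are made up for illustration.

```python
# SAM flag bits as defined in the SAM format specification.
PAIRED = 0x1
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def is_unmapped_mate_of_singleton(flag: int) -> bool:
    """True if this read is unmapped while its mate is mapped."""
    if not flag & PAIRED:
        return False
    if flag & (SECONDARY | SUPPLEMENTARY):
        return False
    return bool(flag & READ_UNMAPPED) and not flag & MATE_UNMAPPED


def unmapped_mates(sam_lines):
    """Yield (read_name, sequence) for unmapped mates of singletons.

    Hypothetical helper: expects an iterable of SAM text lines.
    """
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        if is_unmapped_mate_of_singleton(int(fields[1])):
            yield fields[0], fields[9]
```

The same selection corresponds to `samtools view -f 4 -F 8`, which keeps reads that are unmapped and whose mate is not unmapped.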

oushujun commented 1 year ago

I have not done this before, but either count seems pretty low to me given 300k reads. You may need to check your BLAST parameters.

Shujun
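
One way to put the two counts in context before changing any parameters is to summarize each result file as a fraction of the input reads. The sketch below assumes the results are in BLAST tabular format (`-outfmt 6` with the default twelve columns); the format of the attached files is not stated in the thread, so that is an assumption. The function name `summarize_hits` and the thresholds are illustrative.

```python
def summarize_hits(lines, total_reads, min_pident=0.0, min_length=0):
    """Summarize BLAST tabular output (assumed -outfmt 6, default columns).

    Default columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    records = 0
    queries = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        fields = line.split("\t")
        if len(fields) < 12:  # not a default outfmt 6 record
            continue
        pident = float(fields[2])
        length = int(fields[3])
        if pident < min_pident or length < min_length:
            continue
        records += 1
        queries.add(fields[0])  # one read may hit several library entries
    fraction = len(queries) / total_reads if total_reads else 0.0
    return {
        "records": records,
        "reads_with_hit": len(queries),
        "fraction_of_reads": fraction,
    }
```

Running this on both files with `total_reads=299582` would show how many distinct reads hit each library rather than how many records were reported, since a single read can produce multiple records. If the fraction is very small for both libraries, BLAST+ options such as `-task`, `-word_size`, and `-evalue` are reasonable candidates to review, because `blastn` defaults to the `megablast` task, which is tuned for highly similar sequences.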



leon945945 commented 1 year ago

Thanks for your reply. I'll try other blast parameters.