Open isabelladistefano opened 1 year ago
Hi Isabella,
Thanks for trying out EDTA. Did you run EDTA on only chr1 or the entire genome?
Thanks, Shujun
On Fri, Jul 28, 2023 at 9:46 AM isabelladistefano @.***> wrote:
Dear Shujun,
I hope you are well. In your benchmarking paper, “Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline”, EDTA appears to do very well on TE prediction in the rice genome. For our studies, we are benchmarking several TE tools, including EDTA. We compare the output of EDTA to the published TAIR transposable elements of Arabidopsis thaliana chromosome 1.
This was the command for EDTA, the FASTA being the most recent TAIR genome assembly of Arabidopsis thaliana:
perl $EDTA --step anno --genome $FASTA --cds $CDS --anno 1 --threads 32 --sensitive 1 --evaluate 1
TAIR publishes 7135 transposable elements on Arabidopsis thaliana chromosome 1. When intersecting the EDTA results with the TAIR results using
bedtools intersect -u -a TAIR_TEs.gff -b EDTA.anno.gff
there are only 3462 intersections, meaning the EDTA result represents only 48.5% of the transposable elements on Arabidopsis thaliana chromosome 1. Could you please help us find an explanation for this and/or improve the efficiency of EDTA so that we can use it to safely annotate TEs of other non-model Brassicaceae species?
Best wishes,
Isabella
Dear Shujun,
The whole Genome, then extracted chromosome 1.
Best wishes,
Isabella
Hello Shujun, Any comments on my findings?
Best wishes,
Isabella
Hi Isabella,
Thanks for your feedback. I am benchmarking on Arabidopsis and will check out this case.
Shujun
Hello,
Any luck?
Best wishes,
Isabella
Hi Isabella,
Sorry for the long delay. I evaluated the TAIR10 TE annotation and found the quality is not as high as expected. Still, I doubt the overlap between the two annotations is less than half. Can you please share with me the link to download your TAIR10 annotation?
Thanks, Shujun
Dear Shujun,
I hope you are well. In your benchmarking paper, “Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline”, EDTA appears to do very well on TE prediction in the rice genome. For our studies, we are benchmarking several TE tools, including EDTA. We compare the output of EDTA to the published TAIR transposable elements of Arabidopsis thaliana chromosome 1.
This was the code for EDTA, the FASTA being the most recent TAIR genome assembly of Arabidopsis thaliana.
perl $EDTA --genome $FASTA --cds $CDS --anno 1 --threads 32 --sensitive 1
TAIR (https://www.arabidopsis.org/) publishes 7135 transposable elements on Arabidopsis thaliana chromosome 1. When intersecting the EDTA results with the TAIR results using
bedtools intersect -u -a TAIR_TEs.gff -b EDTA.anno.gff
there are only 3462 intersections, meaning the EDTA result represents only 48.5% of the transposable elements on Arabidopsis thaliana chromosome 1. This is before checking whether the classes/families are correct.
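For anyone who wants to reproduce the count without bedtools, here is a minimal Python sketch of the same idea: count each reference (TAIR) feature once if at least one predicted (EDTA) feature overlaps it by 1 bp or more, which is how I understand `bedtools intersect -u` to behave with default settings. The function names `parse_gff` and `count_overlapped` are my own, and the sketch assumes tab-delimited GFF3 with 1-based inclusive coordinates and matching sequence IDs in both files; it ignores strand and TE class.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_gff(lines):
    """Yield (seqid, start, end) from GFF3 lines, skipping comments and blanks."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], int(fields[3]), int(fields[4])


def count_overlapped(reference, predicted):
    """Return (hits, total): reference intervals overlapped by >= 1 predicted interval."""
    by_seq = defaultdict(list)
    for seqid, start, end in predicted:
        by_seq[seqid].append((start, end))

    # Per sequence: sorted starts plus the running maximum of the ends,
    # so each overlap query is a single binary search.
    index = {}
    for seqid, intervals in by_seq.items():
        intervals.sort()
        starts, max_ends, running = [], [], 0
        for start, end in intervals:
            running = max(running, end)
            starts.append(start)
            max_ends.append(running)
        index[seqid] = (starts, max_ends)

    hits = total = 0
    for seqid, start, end in reference:
        total += 1
        if seqid not in index:
            continue
        starts, max_ends = index[seqid]
        # Predictions starting at or before the reference end ...
        i = bisect_right(starts, end)
        # ... overlap if any of them ends at or after the reference start.
        if i and max_ends[i - 1] >= start:
            hits += 1
    return hits, total
```

With the two files from the command above, `count_overlapped(parse_gff(open("TAIR_TEs.gff")), parse_gff(open("EDTA.anno.gff")))` returns the pair from which the percentage is computed as `100 * hits / total`; for the numbers reported here, 100 * 3462 / 7135 is about 48.5. Because a single shared base counts as a hit, this is an upper bound on agreement between the two annotations.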
Could you please help us find an explanation for this and/or improve the efficiency of EDTA so that we can use it to safely annotate TEs of other non-model Brassicaceae species?
Best wishes,
Isabella