Closed by isabelladistefano 7 months ago
Greetings, Isabella!
Based on the settings you listed, no species appears to be set. If "-species" is not given, RepeatMasker defaults to the human-specific library. In addition, I recommend not using the "-nolow" flag, as it will likely introduce many false positive results in your output.
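As a minimal sketch of what the adjusted run could look like: the snippet below only assembles and prints the command so it can be reviewed first. The input file name `chr1.fas` is hypothetical, the species label `arabidopsis` is an assumption that should be checked against the installed repeat library, and `-gff` is added on the assumption that GFF output is wanted.

```shell
#!/bin/sh
# Hypothetical input file name; replace with your chromosome 1 FASTA.
input="chr1.fas"
# Assumed species label; verify it against your installed repeat library.
species="arabidopsis"

# Same options as the original run, minus -nolow, plus -species.
# -gff (assumed to be wanted here) also writes a GFF file.
cmd="RepeatMasker -species $species -a -s -no_is -xsmall -gff $input"

# Print the command for review instead of running it.
echo "$cmd"
```

Once the printed command looks right, it can be pasted into the terminal and run as usual.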
Hello
For our studies, we are benchmarking several TE tools, including RepeatMasker. We compare the output of RepeatMasker to the published TAIR transposable elements of Arabidopsis thaliana chromosome 1.
Parameters of RepeatMasker (version 4.0.9)
-a -s -no_is -xsmall -nolow
on the newest TAIR Arabidopsis thaliana genome (https://www.arabidopsis.org/). TAIR publishes 7135 transposable elements for Arabidopsis thaliana chromosome 1.
When intersecting the RepeatMasker results with the TAIR results using
bedtools intersect -u -a TAIR_TEs.gff -b Repeatmasker.fas.out.gff
There are only 3951 intersections, meaning the RepeatMasker result covers only 55.4% of the transposable elements on Arabidopsis thaliana chromosome 1. This is before checking whether the classes/families match. Could you please help us find an explanation for this, so that we can safely use RepeatMasker to annotate TEs in other non-model Brassicaceae species?
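For reference, the percentage above can be reproduced with a short calculation. This is only a sketch using the two counts quoted in this post; the variable names are illustrative.

```shell
#!/bin/sh
# Counts taken from the post above.
total=7135   # TEs published by TAIR for chromosome 1
hits=3951    # TAIR TEs overlapped by at least one RepeatMasker hit

# Percentage of TAIR TEs recovered, rounded to one decimal place.
pct=$(awk -v h="$hits" -v t="$total" 'BEGIN { printf "%.1f", 100 * h / t }')
echo "$pct% of TAIR TEs overlap a RepeatMasker annotation"
```

Because `-u` reports each TAIR entry at most once, the number of intersections can be obtained by piping the bedtools command above into `wc -l`.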
Best wishes,
Isabella