Morriyaty closed 1 year ago
Hi Yinjia,
If I understand correctly, the third annotation was generated by RepeatMasker (with -q) using EDTA's library.
Basically, the final annotation merges the homology-based annotation (RepeatMasker) with the structural annotation (EDTA), removing homology-annotated entries that overlap structurally annotated ones. So the level of interspersed repeats should be similar after the merge.
However, the summary files are likely generated by different scripts. EDTA uses a modified version of RepeatMasker's summary script, located at EDTA/util/buildSummary.pl. You may use this script to summarize the other two RepeatMasker results and see if they change.
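As a rough illustration of the merge rule described above (keep every structural entry, drop any homology entry that overlaps one), here is a minimal Python sketch. It is not EDTA's actual code: the function name `merge_annotations` and the `(sequence, start, end)` tuples with 1-based inclusive coordinates are assumptions made for the example.

```python
from collections import defaultdict


def merge_annotations(structural, homology):
    """Combine two annotation sets (hypothetical helper, not part of EDTA).

    Each entry is a (sequence, start, end) tuple, 1-based inclusive.
    All structural entries are kept; a homology entry is kept only if it
    overlaps no structural entry on the same sequence.
    """
    by_seq = defaultdict(list)
    for seq, start, end in structural:
        by_seq[seq].append((start, end))

    kept = []
    for seq, start, end in homology:
        # Two inclusive intervals overlap when each starts before the other ends.
        overlaps = any(s <= end and start <= e for s, e in by_seq[seq])
        if not overlaps:
            kept.append((seq, start, end))

    return sorted(list(structural) + kept)


structural = [("chr1", 100, 500)]
homology = [("chr1", 450, 600), ("chr1", 700, 800), ("chr2", 100, 200)]
merged = merge_annotations(structural, homology)
```

In this toy input the homology hit at chr1:450-600 is dropped because it overlaps the structural entry, which is one way the merged total can end up below the RepeatMasker-only total.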
Best, Shujun
Hi Shujun
I ran buildSummary.pl on one of the RepeatMasker .out results, and the output file is attached below.
GY.buildSummary.txt
It gives similar results.
So, if I understand correctly, the differences between the EDTA and RepeatMasker results are caused by EDTA's filtering step. Is that right?
Bests, Yinjia
EDTA filters out short annotations (≤80 bp). This may be the cause.
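To check how much a length cutoff like this could account for, one option is to compare the covered base pairs in a RepeatMasker .out file before and after dropping short hits. The Python sketch below is only an approximation of that idea, not EDTA's filter: the helper names are hypothetical, the hit length is assumed to be end - begin + 1 on the query, and the parser assumes the standard whitespace-separated .out layout (query name, begin, and end in columns 5 to 7).

```python
def parse_rm_out(lines):
    """Return (query, begin, end) for each hit line of a RepeatMasker .out file."""
    hits = []
    for line in lines:
        fields = line.split()
        # Header and blank lines do not start with a numeric SW score.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        hits.append((fields[4], int(fields[5]), int(fields[6])))
    return hits


def drop_short(hits, max_len=80):
    """Discard hits of max_len bp or shorter (assumed cutoff: <= 80 bp)."""
    return [h for h in hits if h[2] - h[1] + 1 > max_len]


def covered_bp(hits):
    """Count bases covered by at least one hit, merging overlapping hits."""
    total = 0
    cur_seq, cur_start, cur_end = None, 0, -1
    for seq, start, end in sorted(hits):
        if seq != cur_seq or start > cur_end:
            if cur_seq is not None:
                total += cur_end - cur_start + 1
            cur_seq, cur_start, cur_end = seq, start, end
        else:
            cur_end = max(cur_end, end)
    if cur_seq is not None:
        total += cur_end - cur_start + 1
    return total


# Made-up example lines; a real run would read them from the .out file.
example = [
    "   SW   perc perc perc  query  position in query    matching repeat",
    "score   div. del. ins.  sequence  begin end (left)  repeat class/family",
    "",
    "  463  1.3  0.6  1.7  chr1    1  500 (1000) + TE1 DNA/hAT     1 500 (0) 1",
    "   20  1.3  0.6  1.7  chr1  600  650  (850) C TE2 LTR/Gypsy (0)  51   1 2",
    "  300  2.0  0.0  0.0  chr2   10  200  (300) + TE3 LINE/L1     1 191 (0) 3",
]
hits = parse_rm_out(example)
before = covered_bp(hits)
after = covered_bp(drop_short(hits))
```

Dividing `before` and `after` by the genome size gives the two percentages to compare against the summary tables.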
Shujun
Hi:
I got it, thanks!
Bests Yinjia
Hi
I want to annotate an animal genome. First, I manually created a pan-TE library following this article (https://doi.org/10.1093/molbev/msac080). Then I ran RepeatMasker (sensitive mode), and the result is this [pic1]:
But EDTA gives me this result [pic2]:
EDTA annotates 37.06% of the sequence as repeats, whereas RepeatMasker annotates ~43%. I then looked at the RepeatMasker result produced within EDTA [AL-1.chr.final.fasta.mod.tbl, quick mode], and it annotates 41% [pic3]. So the two RepeatMasker runs give similar results, but EDTA gives a lower percentage. What could be the reason? By the way, my command is
EDTA.pl --genome /data/01/user186/666..genome/AL-1.chr.final.fasta --curatedlib /data/01/user156/wyj/02.genome.TE/new_id.fa --anno 1 -t 40 -u 3.03e-9
Bests, Yinjia