oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

solve '*.mod.EDTA.TEanno.sum' empty #460

Open ZJin2021 opened 1 month ago

ZJin2021 commented 1 month ago

Hi Shujun! For some of my genomes the mod.EDTA.TEanno.sum file is empty, so I traced part of the EDTA pipeline. The buildSummary.pl script dies at line 560 while reading the .mod.EDTA.TEanno.out file whenever $type is missing for a record. In the .mod.EDTA.TEanno.split.bed file, the repeat that lacks $type is annotated as "snRNA". I don't know whether other repeat classes can also lack the $type field. Would it be safe to change the "die" at line 560 of buildSummary.pl to "next"? I am not sure whether this would affect downstream steps.

perl EDTA-master/util/buildSummary.pl -maxDiv 40 -stats $genome.mod.stats $genome.mod.EDTA.TEanno.out > $genome.mod.EDTA.TEanno.sum 2> out.log

out.log

This out line is the first instance of the change:
10000 0.001 0.001 0.001 scaffold398 121760 121980 NA + TE_00000280_INT LTR/unknown
missing type for TE_00002102 ... <>

.mod.EDTA.TEanno.out

10000 0.001 0.001 0.001 Chr4 13998493 14000579 NA + TE_00001932_INT LTR/unknown
10000 0.001 0.001 0.001 Chr4 14000611 14000691 NA + TE_00002102 
10000 0.001 0.001 0.001 Chr4 14000769 14001580 NA + TE_00001932_INT LTR/unknown

.mod.EDTA.TEanno.split.bed

Chr4    13997694        13998492        TE_00000112_INT LTR/Copia       homology        0.709   4196    -       .       ID=TE_homo_82343;sequence_ontology=SO:0002264;ID=TE_homo_89176;sequence_ontology=SO:0002264
Chr4    13998493        14000579        TE_00001932_INT LTR/unknown     homology        0.9     9217    +       .       ID=TE_homo_82342;sequence_ontology=SO:0000186;ID=TE_homo_89175;sequence_ontology=SO:0000186
Chr4    14000611        14000691        TE_00002102     snRNA   homology        0.839   458     +       .       ID=TE_homo_82344;sequence_ontology=SO:0000274;ID=TE_homo_89177;sequence_ontology=SO:0000274     
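
To check whether classes other than snRNA are affected, the records without a class can be listed and looked up in the split.bed file. This is only a minimal sketch: the function names are hypothetical, and the column layout is an assumption taken from the excerpts above (in the .out file a complete record has 11 whitespace-separated columns with the ID in column 10; in the split.bed file the ID is column 4 and the class is column 5).

```shell
#!/bin/sh
# Hypothetical helpers, not part of EDTA.

# Print the unique IDs of .out records that have an ID but no class column.
# Assumption: such records have exactly 10 whitespace-separated fields.
missing_type_ids() {
    awk 'NF == 10 { print $10 }' "$1" | sort -u
}

# For each ID listed in file $1, print "ID<TAB>class" as recorded in the
# split.bed file $2 (assumed: column 4 = ID, column 5 = class).
lookup_bed_class() {
    awk 'NR == FNR { want[$1] = 1; next }
         ($4 in want) { print $4 "\t" $5 }' "$1" "$2" | sort -u
}

# Example usage (file names follow the command shown earlier in this thread):
#   missing_type_ids "$genome.mod.EDTA.TEanno.out" > missing_ids.txt
#   lookup_bed_class missing_ids.txt "$genome.mod.EDTA.TEanno.split.bed"
```

If the second command reports classes other than snRNA, those would hit the same "missing type" error.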

yanyew commented 1 month ago

I ran into the same issue. Did you manage to fix it?

ZJin2021 commented 1 month ago

I changed line 560 of util/buildSummary.pl, and the script now works normally. I replaced

die "missing type for $id ... <$type>\n" if ( !$type );

with

if ( !$type ) {
    next;
}
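
Because "next" skips these records silently, it may be worth checking how much annotation is left out of the summary. The following is a rough sketch under assumptions, not part of EDTA: the function name is hypothetical, records without a class are assumed to have exactly 10 whitespace-separated columns, and columns 6 and 7 are assumed to be 1-based inclusive start and end coordinates.

```shell
#!/bin/sh
# Hypothetical helper, not part of EDTA.
# Print "<number of records without a class><TAB><total bp they cover>"
# for a *.TEanno.out file.
skipped_summary() {
    awk 'NF == 10 { n++; bp += $7 - $6 + 1 }
         END { printf "%d\t%d\n", n, bp }' "$1"
}

# Example usage:
#   skipped_summary "$genome.mod.EDTA.TEanno.out"
```

A small count relative to the whole annotation suggests the skipped records have little effect on the totals in the .sum file.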

yanyew commented 1 month ago

Thank you!