Closed chaimol closed 10 months ago
I have check the code find the two file were empty , the reason is the EDTA results file "$genome.out.EDTA.TEanno.out" lack of "_LTR", just have "LTR".
So I modify the https://github.com/oushujun/LTR_retriever/blob/master/bin/Age_est.pl file the line48
from next unless /_LTR/;
to next unless /LTR/;
, and then I rerun the command
##从EDTA的输出结果计算LAI
genome="genome.fa" #这个是用于EDTA计算的基因组文件的名称
EDTA_path="/share/home/chai/soft/EDTA" #EDTA的安装路径
LAI="/share/home/chai/soft/LTR_retriever/LTR_retriever/LAI" #LAI的绝对路径
threads=16
for genome in ${genome}; do perl ${EDTA_path}/util/gff2bed.pl $genome.mod.EDTA.TEanno.gff3 structural > $genome.mod.EDTA.TEanno.struc.bed ; done
for genome in ${genome}; do grep LTR $genome.mod.EDTA.TEanno.struc.bed|grep struc|awk '{print $1":"$2".."$3"\t"$7}' > $genome.mod.EDTA.TEanno.LTR.pass.list ; done
for genome in ${genome}; do perl -nle 'my ($chr, $s, $e, $anno, $dir, $supfam)=(split)[0,1,2,3,8,12]; print "10000 0.001 0.001 0.001 $chr $s $e NA $dir $anno $supfam"' $genome.mod.EDTA.TEanno.struc.bed > $genome.out.EDTA.TEanno.out ; done
for genome in ${genome}; do ${LAI} -genome $genome -intact $genome.mod.EDTA.TEanno.LTR.pass.list -all $genome.out.EDTA.TEanno.out -q -t ${threads} ; done
The output info as follow:
######################################
### LTR Assembly Index (LAI) beta3.2 ###
######################################
Developer: Shujun Ou
Please cite:
Ou S., Chen J. and Jiang N. (2018). Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. gky730: https://doi.org/10.1093/nar/gky730
Parameters: -genome genome.fa -intact genome.fa.mod.EDTA.TEanno.LTR.pass.list -all genome.fa.out.EDTA.TEanno.out -q -t 16
Tue Sep 5 10:40:41 CST 2023 Dependency checking: Passed!
Tue Sep 5 10:40:41 CST 2023 Calculation of LAI will be based on the whole genome.
Please use the -mono parameter if your genome is a recent ployploid, otherwise high identity between LTR homeologues will overcorrect raw LAI scores and result in low LAI.
Tue Sep 5 10:40:41 CST 2023 Estimate the identity of LTR sequences in the genome: quick mode
Tue Sep 5 10:42:21 CST 2023 The identity of LTR sequences: 95.0119673202615%
Tue Sep 5 10:42:21 CST 2023 Calculate LAI:
Done!
Tue Sep 5 10:42:24 CST 2023 Result file: genome.fa.out.EDTA.TEanno.out.LAI
You may use either raw_LAI or LAI for intraspecific comparison
but please use ONLY LAI for interspecific comparison
but I check the LAI file 'head genome.fa.out.EDTA.TEanno.out.LAI'
whole_genome 1 118716139 0.0740 0.0740 100.00 97.15
A01 1 3000000 0.0350 0.0350 100.00 97.15
A01 300001 3300000 0.0368 0.0368 100.00 97.15
A01 600001 3600000 0.0452 0.0452 100.00 97.15
A01 900001 3900000 0.0454 0.0454 100.00 97.15
the LAI value still were too large.
Please don't submit duplicated questions. Please check out #138.
But the LAI value is big more than 100. I run the command like this:
In the end ,The output info as follow:
The output file
genome.fa.out.EDTA.TEanno.out.LAI.LTR.fa
andgenome.fa.out.EDTA.TEanno.out.LAI.LTRlist
were empty. and the LAI file LAI value is more than 100.head genome.fa.out.EDTA.TEanno.out.LAI