Closed by ignadb 4 years ago
I think there is now a tRNAscan 2 version out. Try running the command from the logfile manually; it might give a more informative error. But this doesn't seem like a funannotate issue, since it was working previously, correct?
I am upgrading to 2.0 and running it now, and will get back to you when I know more. However, it looks like prediction and filtering were done before tRNAscan was called, so the predicted genes should be fine. I have attached the funannotate trace in case you want to confirm this.
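As a rough illustration of the suggestion to re-run the logged command by hand, here is a minimal Python sketch that runs a single command line and keeps its exit code and stderr separate. It assumes Python 3.7+ (the log in this thread shows funannotate itself on Python 2.7.12, so this would run in a different interpreter), and `rerun_logged_command` is a hypothetical helper name, not part of funannotate. The command string is a placeholder: paste in the real tRNAscan line from your own logfile.

```python
import shlex
import subprocess


def rerun_logged_command(command_line):
    """Run one command line copied from a log and return (exit_code, stdout, stderr)."""
    # shlex.split honours shell-style quoting without invoking a shell.
    args = shlex.split(command_line)
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Placeholder only: replace with the actual command found in the funannotate logfile.
    code, out, err = rerun_logged_command("echo replace-me-with-the-logged-command")
    print("exit code:", code)
    print("stderr:", err.strip() or "(empty)")
```

Seeing stderr on its own often makes it clearer whether the failure is a missing file, a path problem, or something else.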
[12:45 PM]: OS: linux2, 32 cores, ~ 132 GB RAM. Python: 2.7.12
[12:45 PM]: Running funannotate v1.5.2-30c1166
[12:45 PM]: Augustus training set for hymenoscyphus_koreanus_f52847-1 already exists. To re-train provide unique --augustus_species argument
[12:45 PM]: AUGUSTUS (3.2.3) detected, version seems to be compatible with BRAKER and BUSCO
[12:46 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[12:46 PM]: Genome loaded: 1,792 scaffolds; 57,382,120 bp; 19.68% repeats masked
[12:46 PM]: Aligning transcript evidence to genome with minimap2
[12:46 PM]: Found 25,601 alignments, wrote GFF3 and Augustus hints to file
[12:46 PM]: Mapping proteins to genome using Diamond blastx/Exonerate
[12:46 PM]: Using 544,324 proteins as queries
[12:46 PM]: Running Diamond pre-filter search
[12:54 PM]: Found 505,527 preliminary alignments
[03:07 PM]: Exonerate finished: found 1,382 alignments
[03:07 PM]: Running GeneMark-ES on assembly
[03:47 PM]: Converting GeneMark GTF file to GFF3
[03:48 PM]: Found 14,502 gene models
[03:48 PM]: Running Augustus gene prediction
[04:05 PM]: Found 11,724 gene models
[04:05 PM]: Pulling out high quality Augustus predictions
[04:05 PM]: Found 1,644 high quality predictions from Augustus (>90% exon evidence)
[04:05 PM]: Summary of gene models passed to EVM (weights):
Augustus models (1): 10,080
Genemark models (1): 14,502
HiQ models (2): 1,644
Pasa models (1): 0
Total models: 26,226
[04:05 PM]: Setting up EVM partitions
[04:05 PM]: Generating EVM command list
[04:05 PM]: Running EVM commands with 3 CPUs
[04:32 PM]: Combining partitioned EVM outputs
[04:32 PM]: Converting EVM output to GFF3
[04:34 PM]: Collecting all EVM results
[04:34 PM]: 14,240 total gene models from EVM
[04:34 PM]: Generating protein fasta files from 14,240 EVM models
[04:35 PM]: now filtering out bad gene models (< 50 aa in length, transposable elements, etc).
[04:35 PM]: Found 328 gene models to remove: 3 too short; 0 span gaps; 421 transposable elements
[04:35 PM]: 13,912 gene models remaining
[04:35 PM]: Predicting tRNAs
[04:37 PM]: Found 128 tRNA gene models
[04:37 PM]: 128 tRNAscan models are valid (non-overlapping)
[04:37 PM]: Generating GenBank tbl annotation file
[04:37 PM]: Converting to final Genbank format
[04:39 PM]: Collecting final annotation files for 14,040 total gene models
[04:39 PM]: Funannotate predict is finished, output files are in the /home/chatchai/Desktop/F5-1/funannotate152.F5-1.new.3/predict_results folder
Yes, tRNA prediction is not part of the EVM protein-coding prediction steps, so it has no effect on the rest of the gene models.
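The counts in the log posted in this thread are consistent with that: the protein-coding models are finalized first, and the tRNA models are simply added on top. A small Python sketch of that sanity check follows, using five lines copied from the log; the helper name `first_count` and the regular expressions are illustrative choices, not anything funannotate provides.

```python
import re

# Lines copied verbatim from the funannotate log posted in this thread.
LOG_EXCERPT = """\
[04:34 PM]: 14,240 total gene models from EVM
[04:35 PM]: Found 328 gene models to remove: 3 too short; 0 span gaps; 421 transposable elements
[04:35 PM]: 13,912 gene models remaining
[04:37 PM]: 128 tRNAscan models are valid (non-overlapping)
[04:39 PM]: Collecting final annotation files for 14,040 total gene models
"""


def first_count(pattern, text):
    """Return the first comma-formatted integer captured by pattern."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError("pattern not found: " + pattern)
    return int(match.group(1).replace(",", ""))


evm_total = first_count(r"([\d,]+) total gene models from EVM", LOG_EXCERPT)
removed = first_count(r"Found ([\d,]+) gene models to remove", LOG_EXCERPT)
remaining = first_count(r"([\d,]+) gene models remaining", LOG_EXCERPT)
trna_models = first_count(r"([\d,]+) tRNAscan models are valid", LOG_EXCERPT)
final_total = first_count(r"for ([\d,]+) total gene models", LOG_EXCERPT)

# EVM models minus filtered models should equal the remaining protein-coding models.
print("filtering adds up:", evm_total - removed == remaining)
# Remaining protein-coding models plus tRNA models should equal the final total.
print("tRNAs added on top:", remaining + trna_models == final_total)
```

In other words, a failed tRNA step would change only the tRNA contribution to the final total, not the filtered EVM set.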
Hi Jon,
Thanks for your continued support of Funannotate! It is very useful!
I ran funannotate 1.5.2 with this command:
/home/chatchai/software/funanno2018/funannotate151/bin/funannotate-predict.py -i /home/chatchai/Desktop/F5-1/F5-1_scaffolds.minimap2.minlen500.sorted.masked.fasta -o /home/chatchai/Desktop/F5-1/funannotate152.F5-1.new.3 --species Hymenoscyphus koreanus --isolate F52847-1 --transcript_evidence MFB1_PFB1_concatenated.fasta --cpu 4
and it went well until tRNAscan, where the following message appeared for every scaffold:
tRNAscan1.4: main cannot open TPCsignal consensus file
tRNAscan could not complete successfully for scaffold_1.
Possible memory allocation problem or missing file. (Exit code=256).
Do you have any idea how to fix this? Also, I assume this problem doesn't affect the gene models that were already predicted and filtered by EVM beforehand?
Thanks very much and have a great day!
Best regards, Chatchai
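Since the error quoted above says tRNAscan cannot open a file named TPCsignal, one simple first check is whether a file with that name exists and is readable anywhere under the tRNAscan installation. The Python sketch below is generic: `find_named_file` is a hypothetical helper, and the directories to search are an assumption you would need to fill in for your own install, since the thread does not say where tRNAscan lives.

```python
import os


def find_named_file(filename, search_roots):
    """Walk each root directory and return (path, is_readable) for every match."""
    hits = []
    for root in search_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if filename in filenames:
                path = os.path.join(dirpath, filename)
                hits.append((path, os.access(path, os.R_OK)))
    return hits


# Example usage (the directory is a placeholder, not a known install location):
# for path, readable in find_named_file("TPCsignal", ["/path/to/your/tRNAscan/install"]):
#     print(path, "readable" if readable else "NOT readable")
```

If nothing is found, or the file is not readable by the user running funannotate, that would fit the "missing file" part of the message.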