nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Funannotate can't find predict_results/genome.gff3 file #111

Closed: eyalbenda closed this issue 6 years ago

eyalbenda commented 6 years ago

I had Funannotate working great before, but for some reason now when I try to use funannotate predict it crashes right towards the end, complaining it can't find predict_results/genome.gff3. The directory predict_results doesn't actually exist at this point.

[09:05:30 AM]: OS: linux2, 8 cores, ~ 64 GB RAM. Python: 2.7.11
[09:05:31 AM]: Running funannotate v0.7.2
[09:05:31 AM]: Augustus training set for ctrp5_pasa already exists, thus funannotate will use those parameters. If you want to re-train, provide a unique name for the --augustus_species argument
[09:05:32 AM]: AUGUSTUS (3.2.2) detected, version seems to be compatible with BRAKER1 and BUSCO
[09:05:40 AM]: Masked genome: 98 scaffolds; 82,895,028 bp; 1.30% repeats masked
[09:05:41 AM]: Using existing transcript evidence alignments
[09:05:41 AM]: 19,399 transcripts aligned with GMAP
[09:05:50 AM]: Using existing protein evidence alignments
[09:05:50 AM]: ctrp5_pasa as already been trained, using existing parameters
[09:05:50 AM]: Now launching BRAKER to train GeneMark and Augustus
[09:06:02 AM]: Pulling out high quality Augustus predictions
[09:06:03 AM]: Found 0 high quality predictions from Augustus (>90% exon evidence)
[09:06:10 AM]: 68,647 total gene models from all sources
[09:06:10 AM]: Setting up EVM partitions
[09:11:04 AM]: Generating EVM command list
[09:11:04 AM]: Running EVM commands with 4 CPUs
[09:30:38 AM]: Combining partitioned EVM outputs
[09:31:07 AM]: Converting EVM output to GFF3
[09:31:16 AM]: Collecting all EVM results
[09:31:16 AM]: 22,498 total gene models from EVM
[09:31:17 AM]: Predicting tRNAs
[09:31:17 AM]: Merging EVM output with tRNAscan output
[09:31:17 AM]: Reformatting GFF file using GAG
[09:32:37 AM]: 23,070 total gene models
[09:32:37 AM]: Filtering out bad gene models (< 50 aa in length, transposable elements, etc).
[09:44:40 AM]: 22,566 gene models remaining
[09:44:40 AM]: Converting to preliminary Genbank format
[09:50:28 AM]: Cleaning models flagged by tbl2asn
[09:50:44 AM]: 22,416 gene models remaining
[09:50:44 AM]: Re-naming gene models
[09:52:05 AM]: Converting to final Genbank format
Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/Cellar/funannotate/0.7.2/libexec/bin/funannotate-predict.py", line 1137, in <module>
    shutil.copyfile(os.path.join(gag3dir, 'genome.gff'), final_gff)
  File "/home/multivac/Discarica/Software/Anaconda2Arch/lib/python2.7/shutil.py", line 83, in copyfile
    with open(dst, 'wb') as fdst:
IOError: [Errno 2] No such file or directory: 'ctrp5newRNAseq/predict_results/ctrp5_pasa.gff3'
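
The traceback ends in shutil.copyfile failing to open its destination, which is what happens when the destination's parent directory is missing. The short sketch below reproduces that failure mode in isolation. It is not funannotate code: it assumes Python 3 (the log shows Python 2.7, where the same condition surfaces as IOError), uses a temporary directory in place of the real output folder, and the function name copy_into_missing_dir is made up for illustration.

```python
import os
import shutil
import tempfile


def copy_into_missing_dir():
    """Copy a file into a directory that was never created; return the error raised."""
    workdir = tempfile.mkdtemp()  # stand-in for the output folder (assumption)
    try:
        src = os.path.join(workdir, "genome.gff")
        with open(src, "w") as handle:
            handle.write("##gff-version 3\n")
        # predict_results is deliberately never created here.
        dst = os.path.join(workdir, "predict_results", "ctrp5_pasa.gff3")
        try:
            shutil.copyfile(src, dst)
        except OSError as err:  # IOError is an alias of OSError in Python 3
            return err
        return None
    finally:
        shutil.rmtree(workdir)


error = copy_into_missing_dir()
print(type(error).__name__, error.errno)
```

The source file exists in this sketch, so the error can only come from the missing destination directory, which matches the path reported in the IOError message.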

nextgenusfs commented 6 years ago

What was your command?

eyalbenda commented 6 years ago
 funannotate predict -i ~/juno/tropicalis/ErikaRef/tropicalis_HiC_assembly_170823.fasta \
   -o $PWD/ctrp5newRNAseq/ -s Ctrp5_pasa --isolate JU1373 \
   --rna_bam ~/juno/tropicalis/newRNAseq/ctrp5star/final/JU1373/JU1373-ready.bam \
   --busco_seed_species celegans \
   --protein_evidence uniprot_sprot.fasta,caenorhabditis_elegans.PRJNA13758.WBPS9.protein.fa \
   --transcript_evidence ~/Discarica/Software/References/celegans/Caenorhabditis_elegans.WBcel235.cdna.all.fa \
   --pasa_gff ju1373newRNAseq.pasa_assemblies.gff3 --busco_db nematoda \
   --organism other --cpus 5 --max_intronlen 20000

I also tried with just -o ctrp5newRNAseq, and that didn't work either.

nextgenusfs commented 6 years ago

The output folders are created right away in the script, so I don't know why the predict_results folder would be missing while the predict_misc folder is present. If you manually create the predict_results folder, does funannotate predict then finish correctly?

eyalbenda commented 6 years ago

It worked! It might have been caused by my choosing a pre-existing (and empty) output directory. I suggest adding a check to the script that all the output directories exist. Closing this for now. Thanks!

nextgenusfs commented 6 years ago

Okay, I'll put it on my list to add another check. Pre-existing output is important, though, as that is how the script re-uses data if something goes wrong (I view this as somewhat necessary, since several of these steps are time-consuming). But I should be able to add another check before it writes any files into the predict_results directory.
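
For readers wondering what such a check could look like, here is a minimal sketch. It is an assumption-laden illustration, not the change that went into funannotate: the helper names ensure_output_dirs and copy_result are hypothetical, the code targets Python 3 (os.makedirs with exist_ok is not available in Python 2.7), and a temporary directory stands in for a pre-existing, empty -o folder. Only the predict_misc and predict_results names come from this thread.

```python
import os
import shutil
import tempfile


def ensure_output_dirs(outdir, subdirs=("predict_misc", "predict_results")):
    """Create each output subdirectory if it is missing; existing content is kept."""
    paths = []
    for name in subdirs:
        path = os.path.join(outdir, name)
        os.makedirs(path, exist_ok=True)  # no-op when the directory already exists
        paths.append(path)
    return paths


def copy_result(src, dst):
    """Copy src to dst, creating dst's parent directory first if needed."""
    parent = os.path.dirname(dst)
    if parent:
        os.makedirs(parent, exist_ok=True)
    shutil.copyfile(src, dst)
    return dst


# Demo: an existing but empty output directory, as described in this thread.
outdir = tempfile.mkdtemp()
misc_dir, results_dir = ensure_output_dirs(outdir)
src = os.path.join(misc_dir, "genome.gff")
with open(src, "w") as handle:
    handle.write("##gff-version 3\n")
final_gff = copy_result(src, os.path.join(results_dir, "ctrp5_pasa.gff3"))
copied_ok = os.path.isfile(final_gff)
print(copied_ok)
shutil.rmtree(outdir)
```

Because the directories are created with exist_ok=True, running the check against a folder that already holds earlier results leaves those results untouched, so re-using data from a previous run still works.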

nextgenusfs commented 6 years ago

Actually, I just looked at the code, and there was one case (specifying an output directory that wasn't a previous funannotate output directory) in which the predict_results folder was not created. I fixed this; thanks for reporting.