CompSynBioLab-KoreaUniv / FunGAP

FunGAP: fungal Genome Annotation Pipeline

Test run fails at Braker step: SRR1198667/braker.gff3': No such file or directory #100

Closed NishatTamana51 closed 9 months ago

NishatTamana51 commented 9 months ago

After configuring FunGAP, I ran the test run but got the following error at the BRAKER step. What could be the problem?

[01-15 15:57] START: BRAKER
[Run] /home/cblast/miniconda3/envs/fungap/bin/braker.pl --fungus --softmasking --cores=24 --genome=/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/maker_out/masked_assembly.fasta.adjusted --bam=/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/hisat2_out/SRR1198667.bam --species=SRR1198667 --gff3 --AUGUSTUS_CONFIG_PATH=/home/cblast/miniconda3/envs/fungap/bin/../config --BAMTOOLS_PATH=/home/cblast/miniconda3/envs/fungap/bin --GENEMARK_PATH=/home/cblast/Documents/bioinfo_softwares/FunGAP/external/gmes_linux_64_4 --SAMTOOLS_PATH=/home/cblast/miniconda3/envs/fungap/bin --workingdir=/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667  --translation_table=1 --AUGUSTUS_BIN_PATH=/home/cblast/miniconda3/envs/fungap/bin > /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/logs/braker_SRR1198667.log 2>&1
[Run] mv /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker.gff3 /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker_SRR1198667.gff3
mv: cannot stat '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker.gff3': No such file or directory
[Run] /home/cblast/miniconda3/envs/fungap/bin/getAnnoFastaFromJoingenes.py -g /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/maker_out/masked_assembly.fasta.adjusted -o /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker_SRR1198667 -t 1 -3 /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker_SRR1198667.gff3
Traceback (most recent call last):
  File "/home/cblast/miniconda3/envs/fungap/bin/getAnnoFastaFromJoingenes.py", line 120, in <module>
    with open(args.gff3, "r") as gff3_handle:
FileNotFoundError: [Errno 2] No such file or directory: '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker_SRR1198667.gff3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cblast/miniconda3/envs/fungap/bin/getAnnoFastaFromJoingenes.py", line 136, in <module>
    print("Error: Failed to open file " + args.gtf + "!")
TypeError: can only concatenate str (not "NoneType") to str
[Run] mv /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker_SRR1198667.aa /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker_SRR1198667.faa
mv: cannot stat '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/braker_SRR1198667.aa': No such file or directory
[ERROR] Braker failed. Check the log: /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/logs/braker_SRR1198667.log
Traceback (most recent call last):
  File "/home/cblast/Documents/bioinfo_softwares/FunGAP//fungap.py", line 839, in <module>
    main()
  File "/home/cblast/Documents/bioinfo_softwares/FunGAP//fungap.py", line 214, in main
    translation_table, d_path, logger
  File "/home/cblast/Documents/bioinfo_softwares/FunGAP//fungap.py", line 509, in run_braker
    check_call(command_args)
  File "/home/cblast/miniconda3/envs/fungap/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/cblast/Documents/bioinfo_softwares/FunGAP/run_braker.py', '--masked_assembly', '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/maker_out/masked_assembly.fasta', '--bam_files', '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/hisat2_out/SRR1198667.bam', '--output_dir', '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out', '--translation_table', '1', '--log_dir', '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/logs', '--num_cores', '24', '--fungus']' returned non-zero exit status 1.
mbnmbn00 commented 9 months ago

It appears that Braker failed to run. Could you please attach the log file here?

/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/logs/braker_SRR1198667.log
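
For anyone triaging a similar failure, a small helper can pull the most relevant lines out of the BRAKER log before attaching it. This is only a sketch: the function name `find_problem_lines` and the keyword list are illustrative assumptions, not part of FunGAP or BRAKER.

```python
def find_problem_lines(log_path, keywords=("ERROR", "WARNING", "failed")):
    """Return (line_number, text) pairs for log lines containing any keyword.

    The keyword list is an assumption chosen for illustration; adjust it
    to whatever the log in question actually prints.
    """
    hits = []
    # errors="replace" keeps the scan going even if the log has odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if any(keyword in line for keyword in keywords):
                hits.append((number, line.rstrip("\n")))
    return hits
```

Calling `find_problem_lines` with the path of the log file named above returns the matching lines with their line numbers, which is usually enough to see where the run stopped.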
NishatTamana51 commented 9 months ago

Update: The problem was caused by corrupted files: the Pfam database file and the proteome file from the sister organisms. I reran the download commands several times until both were fetched correctly. After that, the error vanished. Thanks for suggesting that I check the braker_SRR1198667.log file.

Sure. Here is the log file:

#*********
# WARNING:
# The hints file(s) for GeneMark-EX contain less than 150 introns with multiplicity >= 10! (In total, 54 unique introns are contained. 19 have a multiplicity >= 10.) Possibly, you are trying to run braker.pl on data that does not provide sufficient multiplicity information. This will e.g. happen if you try to use introns generated from assembled RNA-Seq transcripts; or if you try to run braker.pl in epmode with mappings from proteins without sufficient hits per locus. Or if you use the example data set (orthodb_small.fa).
# A low number of intron hints with sufficient multiplicity may result in a crash of GeneMark-EX (it should not crash with the example data set).
#*********
ln: failed to create symbolic link '/home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/traingenes.gtf': File exists
ERROR in file /home/cblast/miniconda3/envs/fungap/bin/braker.pl at line 6918
failed to execute: ln -s /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/GeneMark-ET/genemark.gtf /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/fungap_out/braker_out/SRR1198667/traingenes.gtf!

Is the problem related to fetching the genome assembly and protein dataset files from NCBI? I had trouble extracting the gz contents of the protein file, but after several attempts it worked.
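
Since the trouble here came from corrupted downloads, one way to catch it before starting a run is to test each `.gz` file for integrity. The sketch below uses only Python's standard `gzip` module; the function names and the default `*.gz` pattern are assumptions made for illustration and are not part of FunGAP.

```python
import gzip
import zlib
from pathlib import Path


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so the trailing CRC and length get verified.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers "not a gzip file" and CRC failures,
        # EOFError covers truncated downloads.
        return False
    return True


def corrupt_archives(directory, pattern="*.gz"):
    """List the files under `directory` matching `pattern` that fail the check."""
    return sorted(
        str(path) for path in Path(directory).glob(pattern)
        if not gzip_is_intact(path)
    )
```

Running `corrupt_archives` on the directory holding the downloaded protein files returns the paths that need to be fetched again; an empty list means every archive decompressed cleanly.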

Initially, I got the following output while trying to download the protein files for the test run using download_sister_orgs.py:

Validate input taxon...

===
Taxon: Saccharomyces cerevisiae
Rank: species
Lineage: cellular organisms; Eukaryota; Opisthokonta; Fungi; Dikarya; Ascomycota; saccharomyceta; Saccharomycotina; Saccharomycetes; Saccharomycetales; Saccharomycetaceae; Saccharomyces

Downloading protein sequence files...
[Run] wget --quiet -nc ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/949/124/435/GCA_949124435.1_AIF_collapsed.nuclear_genome.ScRAP/*protein.faa.gz
[Run] mv GCA_949124435.1_AIF_collapsed.nuclear_genome.ScRAP_protein.faa.gz /home/cblast/Documents/7_f_proj/testrun_fungap_to_be_removed/sister_orgs/GCA_949124435.1_protein.faa.gz

Done. Check "sister_orgs.list" for downloaded files details

But the sister_orgs.list file had no entries under its columns: `asm_id organism genbank_acc kingdom phylum subphylum class order family file_name`

Did BRAKER fail because of this?
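
A quick way to confirm whether the download step recorded anything is to count the rows below the header of the list file. The sketch below assumes the file is tab-separated with a single header row; that layout is a guess based on the column names quoted above, and `count_data_rows` is a hypothetical helper, not a FunGAP function.

```python
import csv


def count_data_rows(list_path, delimiter="\t"):
    """Count non-empty rows after the header line of a delimited text file.

    Assumes one header row and the given delimiter (both are assumptions
    about the layout of sister_orgs.list).
    """
    with open(list_path, newline="") as handle:
        rows = [
            row for row in csv.reader(handle, delimiter=delimiter)
            if any(cell.strip() for cell in row)  # skip blank lines
        ]
    # Subtract the header; an empty file counts as zero data rows.
    return max(len(rows) - 1, 0)
```

A result of 0 means the file holds only the header, matching what was observed here.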

mbnmbn00 commented 9 months ago

Glad to hear that you resolved it! Let me know if you have further questions!