HRGV / phyloFlash

phyloFlash - A pipeline to rapidly reconstruct the SSU rRNAs and explore phylogenetic composition of an illumina (meta)genomic dataset.
GNU General Public License v3.0

PhyloFlash fails while creating database #139

Closed psampara closed 2 years ago

psampara commented 3 years ago

Hello Phyloflashers! I have been trying to run phyloFlash 3.4, but phyloFlash_makedb.pl fails with return code 512. I installed phyloFlash 3.4 by cloning the GitHub repository and installing the dependencies myself; I cannot install via Conda because of restrictions on my cluster/server. All dependencies have been verified with perl phyloFlash.pl -check_env. Please let me know how to proceed with this issue, that would be a huge help!

I used the following command: perl phyloFlash_makedb.pl --remote

After downloading SILVA and UniVec databases, I get the following error message:

[20:14:40] File ok
[20:14:40] unpacking SILVA database
[20:14:58] searching for LSU contamination in SSU RefNR
[20:14:58] running subcommand:
       /scratch/pranavs/phyloFlash/barrnap-HGV/bin/barrnap_HGV
       --kingdom bac --threads 64 --evalue 1e-10 --gene lsu --reject
       0.01 ./138.1/SILVA_SSU.fasta >tmp.barrnap_hits.bac.gff
       2>tmp.barrnap_hits.bac.barrnap.out
[20:14:59] FATAL: Tool execution failed!.
       Error was 'No such file or directory' and return code '512'
       Check log file tmp.barrnap_hits.bac.gff
       Check error log file tmp.barrnap_hits.bac.barrnap.out
       Aborting.
[20:14:59] Saving log to file phyloFlash_log_on_error.

The full log is below

[20:14:58] This is barrnap_HGV 0.7
[20:14:58] Written by Torsten Seemann torsten.seemann@gmail.com
[20:14:58] Obtained from https://github.com/Victorian-Bioinformatics-Consortium/barrnap
[20:14:58] Detected operating system: linux
[20:14:58] Using HMMER binary: /scratch/pranavs/phyloFlash/barrnap-HGV/bin/../binaries/linux/nhmmer
[20:14:58] Will use 64 threads
[20:14:58] Setting evalue cutoff to 1e-10
[20:14:58] Will tag genes < 0.8 of expected length.
[20:14:58] Will reject genes < 0.01 of expected length.
[20:14:58] Using database: /scratch/pranavs/phyloFlash/barrnap-HGV/bin/../db/lsu/bac.hmm
[20:14:58] Scanning ./138.1/SILVA_SSU.fasta for lsu bac rRNA genes... please wait
[20:14:58] Command: /scratch/pranavs/phyloFlash/barrnap-HGV/bin/../binaries/linux/nhmmer --cpu 64 -E 1e-10 --w_length 3878 -o /dev/null --tblout /dev/stdout \/scratch\/pranavs\/phyloFlash\/barrnap-HGV\/bin\/..\/db\/lsu\/bac.hmm .\/138.1\/SILVA_SSU.fasta
sh: line 1: 17892 Aborted /scratch/pranavs/phyloFlash/barrnap-HGV/bin/../binaries/linux/nhmmer --cpu 64 -E 1e-10 --w_length 3878 -o /dev/null --tblout /dev/stdout \/scratch\/pranavs\/phyloFlash\/barrnap-HGV\/bin\/..\/db\/lsu\/bac.hmm .\/138.1\/SILVA_SSU.fasta 2>&1
[20:14:59] bad line in nhmmer output - Fatal exception (source file ../../easel/esl_sq.c, line 231):
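
For readers puzzled by the return code '512' in the error message above: if that number is the raw wait status that Perl keeps in `$?` (an assumption, not confirmed in this thread), it can be split into an exit code and a signal number as in the sketch below. The helper name `decode_wait_status` is made up for illustration.

```python
def decode_wait_status(status: int) -> dict:
    """Split a 16-bit POSIX wait status (as stored in Perl's $?) into its parts.

    Assumption: the 'return code' printed in the log is this raw status.
    """
    return {
        "exit_code": (status >> 8) & 0xFF,   # high byte: the child's exit code
        "signal": status & 0x7F,             # low 7 bits: terminating signal, 0 if none
        "core_dumped": bool(status & 0x80),  # bit 7: core dump flag
    }


print(decode_wait_status(512))
# {'exit_code': 2, 'signal': 0, 'core_dumped': False}
```

Under that assumption, 512 means the barrnap_HGV wrapper exited with code 2 and was not itself killed by a signal; the nhmmer abort shown in the barrnap log happened one level further down.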

kbseah commented 3 years ago

Hello @psampara, sorry for the late response. Could you please try adding the option --CPUs 16? At the moment it is using all available processors (64 threads in your log), and I suspect that this makes nhmmer request too much memory, hence the crash.
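
A minimal sketch of the suggestion above, for anyone scripting the database build: cap the thread count before assembling the command line. Only the `--remote` and `--CPUs` options seen in this thread are used; the helper name `makedb_command` and the default cap of 16 are illustrative assumptions, and the command is only printed, not run.

```python
import os
from typing import List


def makedb_command(max_cpus: int = 16) -> List[str]:
    """Build the phyloFlash_makedb.pl command with a capped --CPUs value."""
    available = os.cpu_count() or 1           # cpu_count() may return None
    cpus = max(1, min(available, max_cpus))   # never exceed the cap, never go below 1
    return ["perl", "phyloFlash_makedb.pl", "--remote", "--CPUs", str(cpus)]


# Print the command for inspection instead of executing it.
print(" ".join(makedb_command()))
```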

acastills commented 3 years ago

Hi @kbseah, I'm running into the same issue and still get the same error with both --CPUs 16 and --CPUs 8. I also updated barrnap (via mamba), but no luck. Do you have other suggestions?

kbseah commented 3 years ago

hello @acastills could you please post the command you used, the log file phyloFlash_log_on_error, as well as the barrnap log file?

acastills commented 3 years ago

Sure, the command was:

phyloFlash_makedb.pl --silva_file arb_silva/SILVA_138.1_SSURef_NR99_12_06_20_opt.arb.gz --univec_file univec/UniVec --CPUs 16

phyloFlash_log_on_error is:

[17:42:55] Checking for required tools.
[17:42:55] Using grep found at "/bin/grep".
[17:42:55] Using bowtiebuild found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bowtie-build".
[17:42:55] Using bbmap found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bbmap.sh".
[17:42:55] Using barrnapHGV found at
       "/bioinf/home/acastill/mambaforge/envs/pf/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV".
[17:42:55] Using bbduk found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bbduk.sh".
[17:42:55] Using vsearch found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/vsearch".
[17:42:55] Using bbmask found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bbmask.sh".
[17:42:55] All required tools found.
[17:42:55] using local copy of univec: univec/UniVec
[17:42:55] using local copy of Silva SSU RefNR:
       arb_silva/SILVA_138.1_SSURef_NR99_12_06_20_opt.arb.gz
[17:42:55] unpacking SILVA database
[17:42:55] File ./138.1//SILVA_SSU.fasta exists, not overwriting
[17:42:55] searching for LSU contamination in SSU RefNR
[17:42:55] running subcommand:
       /bioinf/home/acastill/mambaforge/envs/pf/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV
         --kingdom bac --threads 16 --evalue 1e-10 --gene lsu --reject
       0.01 ./138.1/SILVA_SSU.fasta  >tmp.barrnap_hits.bac.gff
       2>tmp.barrnap_hits.bac.barrnap.out
[17:42:56] FATAL: Tool execution failed!.
       Error was 'No such file or directory' and return code '512'
       Check log file tmp.barrnap_hits.bac.gff
       Check error log file tmp.barrnap_hits.bac.barrnap.out
       Aborting.
[17:42:56] Saving log to file phyloFlash_log_on_error

and tmp.barrnap_hits.bac.barrnap.out is:

[17:42:55] Checking for required tools.
[17:42:55] Using grep found at "/bin/grep".
[17:42:55] Using bowtiebuild found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bowtie-build".
[17:42:55] Using bbmap found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bbmap.sh".
[17:42:55] Using barrnapHGV found at
       "/bioinf/home/acastill/mambaforge/envs/pf/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV".
[17:42:55] Using bbduk found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bbduk.sh".
[17:42:55] Using vsearch found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/vsearch".
[17:42:55] Using bbmask found at
       "/bioinf/home/acastill/mambaforge/envs/pf/bin/bbmask.sh".
[17:42:55] All required tools found.
[17:42:55] using local copy of univec: univec/UniVec
[17:42:55] using local copy of Silva SSU RefNR:
       arb_silva/SILVA_138.1_SSURef_NR99_12_06_20_opt.arb.gz
[17:42:55] unpacking SILVA database
[17:42:55] File ./138.1//SILVA_SSU.fasta exists, not overwriting
[17:42:55] searching for LSU contamination in SSU RefNR
[17:42:55] running subcommand:
       /bioinf/home/acastill/mambaforge/envs/pf/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV
         --kingdom bac --threads 16 --evalue 1e-10 --gene lsu --reject
       0.01 ./138.1/SILVA_SSU.fasta  >tmp.barrnap_hits.bac.gff
       2>tmp.barrnap_hits.bac.barrnap.out
[17:42:56] FATAL: Tool execution failed!.
       Error was 'No such file or directory' and return code '512'
       Check log file tmp.barrnap_hits.bac.gff
       Check error log file tmp.barrnap_hits.bac.barrnap.out
       Aborting.
[17:42:56] Saving log to file phyloFlash_log_on_error

kbseah commented 3 years ago

Thanks for supplying the log info. It looks like the SILVA input file was not a FASTA file: the command passed an ARB database (SILVA_138.1_SSURef_NR99_12_06_20_opt.arb.gz) to --silva_file. The correct file should be: https://www.arb-silva.de/fileadmin/silva_databases/current/Exports/SILVA_138.1_SSURef_NR99_tax_silva_trunc.fasta.gz

There are many similarly-named files on the SILVA site, so it's easy to get mixed up. We recommend using the --remote option to download automatically, but I know it's not always possible because of firewalls.

Do let me know if you still encounter problems.
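
Since the similarly named SILVA downloads are easy to confuse, a quick sanity check before running phyloFlash_makedb.pl can help. The sketch below is not part of phyloFlash; the function name `looks_like_fasta` is hypothetical. It only tests whether a plain or gzip-compressed file begins with a `>` header line, which a binary ARB database should not.

```python
import gzip


def looks_like_fasta(path: str) -> bool:
    """Return True if the (optionally gzipped) file starts with a '>' header."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    # gzip streams begin with the two magic bytes 0x1f 0x8b
    opener = gzip.open if magic == b"\x1f\x8b" else open
    with opener(path, "rb") as fh:
        head = fh.read(4096)  # only the first few kB are needed
    # Ignore leading whitespace, then require a FASTA header marker
    return head.lstrip().startswith(b">")
```

For example, `looks_like_fasta("arb_silva/SILVA_138.1_SSURef_NR99_12_06_20_opt.arb.gz")` would be expected to return False, flagging the file before barrnap and nhmmer ever see it.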

acastills commented 3 years ago

Oops. It's working fine now, thank you!

kbseah commented 3 years ago

welcome!