HRGV / phyloFlash

phyloFlash - A pipeline to rapidly reconstruct the SSU rRNAs and explore phylogenetic composition of an Illumina (meta)genomic dataset.
GNU General Public License v3.0

error when running phyloflash with barrnap (FATAL: Tool execution failed!.) #142

Closed dluecking closed 3 years ago

dluecking commented 3 years ago

Hey, great work with PhyloFlash, I've used it multiple times in the past and was very happy with the output.

However, I ran the following command and I got stuck with an error.

perl ~/bin/phyloFlash/phyloFlash.pl -lib tyrell -CPUs 20 -read1 4975_tyrell_R1_trimmed.fq.gz -read2 4975_tyrell_R2_trimmed.fq.gz 

The tail of the error message:

[15:26:43] done...
[15:26:43] getting SSU phylotypes and their coverages...
[15:26:43] no phylotypes assembled with SPAdes
[15:26:43] running subcommand:
       /home/dlueckin/bin/phyloFlash/barrnap-HGV/bin/barrnap_HGV 
       --evalue 1e-100 --reject 0.6 --kingdom bac --gene ssu --threads
       20 tyrell.spades/scaffolds.fasta >tyrell.scaffolds.bac.gff
       2>tyrell.barrnap.out
[15:26:44] FATAL: Tool execution failed!.
       Error was 'No such file or directory' and return code '512'
       Check log file tyrell.scaffolds.bac.gff
       Check error log file tyrell.barrnap.out
       Aborting.
[15:26:44] Saving log to file phyloFlash_log_on_error

The referenced tyrell.barrnap.out file:

[15:26:44] This is barrnap_HGV 0.7
[15:26:44] Written by Torsten Seemann <torsten.seemann@gmail.com>
[15:26:44] Obtained from https://github.com/Victorian-Bioinformatics-Consortium/barrnap
[15:26:44] Detected operating system: linux
[15:26:44] Using HMMER binary: /home/dlueckin/bin/phyloFlash/barrnap-HGV/bin/../binaries/linux/nhmmer
[15:26:44] Will use 20 threads
[15:26:44] Setting evalue cutoff to 1e-100
[15:26:44] Will tag genes  < 0.8 of expected length.
[15:26:44] Will reject genes < 0.6 of expected length.
[15:26:44] Using database: /home/dlueckin/bin/phyloFlash/barrnap-HGV/bin/../db/ssu/bac.hmm
[15:26:44] Usage: barrnap_HGV <file.fasta>

Anything I'm doing particularly wrong? Thanks in advance!

kbseah commented 3 years ago

Hello @dluecking, could you tell me which version of phyloFlash you are using? I think this is a bug that we may have missed, but I want to be sure that I'm looking at the same version of the code as you are running.

It looks like there were no full-length sequences assembled by SPAdes. In the meantime, as a workaround, you could use the option -skip_spades just for this run so that you can at least get the other outputs.
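
The barrnap log above ends with its usage message, which suggests it could not read tyrell.spades/scaffolds.fasta. A minimal sketch of a pre-check, assuming that a missing or empty scaffolds file is what triggered the failure; the function name `has_fasta_records` is hypothetical and not part of phyloFlash:

```shell
#!/bin/sh
# Hypothetical helper (not part of phyloFlash): succeeds only if the file
# exists, is non-empty, and contains at least one FASTA header line.
has_fasta_records() {
    [ -s "$1" ] && grep -q '^>' "$1"
}

scaffolds="tyrell.spades/scaffolds.fasta"   # path taken from the log above
if has_fasta_records "$scaffolds"; then
    echo "scaffolds present: $scaffolds"
else
    echo "no scaffolds in $scaffolds; consider rerunning with -skip_spades"
fi
```

If the check fails, adding -skip_spades to the original command line is the workaround suggested here.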

Sorry for the trouble!

dluecking commented 3 years ago

Hello @kbseah ,

Thanks for the quick reply! I'm running version 3.4.

I'll try a run with the suggested -skip_spades flag.

No worries and thanks again for the help! Let me know if you need more info about my setup.

dluecking commented 3 years ago

I ran into another error with the -skip_spades flag active.

...
[09:56:05] mapping rate: 0.017%
[09:56:05] writing final files...
[09:56:05] exporting results to csv
[09:56:05] generating graphics for report in SVG format
[09:56:05] Plotting mapping ID histogram
[09:56:05] running subcommand:
       /home/dlueckin/bin/phyloFlash/phyloFlash_plotscript_svg.pl
       --hist tyrell.idhistogram  --title="Mapping identity (%)" 
       >tyrell.plotscript.out 2>&1
[09:56:05] Plotting piechart of mapping ratios
[09:56:05] running subcommand:
       /home/dlueckin/bin/phyloFlash/phyloFlash_plotscript_svg.pl -pie
       tyrell.mapratio.csv -title="0.017 % pairs mapped"
       >tyrell.plotscript.out 2>&1
[09:56:05] Plotting histogram of insert sizes
[09:56:05] running subcommand:
       /home/dlueckin/bin/phyloFlash/phyloFlash_plotscript_svg.pl
       --hist  tyrell.inserthistogram  --title="Insert size (bp)" 
       >tyrell.plotscript.out 2>&1
[09:56:05] FATAL: Tool execution failed!.
       Error was 'Inappropriate ioctl for device' and return code
       '6400'
       Check log file tyrell.plotscript.out
       Check error log file &1
       Aborting.

I think it might be down to the way I installed phyloFlash: I had trouble installing it via conda (it being unable to resolve the environment), so I installed the dependencies via conda and manually installed pf by cloning from github. Since I'm behind a proxy, I downloaded the DBs manually. This might be the root of the problem? I don't know.

kbseah commented 3 years ago

Could you please post the contents of the log file tyrell.plotscript.out, and the file size of tyrell.inserthistogram? If the files don't exist, use the option -keeptmp to keep the temp and log files for troubleshooting.
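
The two checks requested here could be run roughly as follows. A minimal sketch; the helper name `show_size` is hypothetical, and `wc -c` is used because it reports bytes portably:

```shell
#!/bin/sh
# Hypothetical helper: print a file's size in bytes, or note that it is missing.
show_size() {
    if [ -e "$1" ]; then
        printf '%s\t%s bytes\n' "$1" "$(wc -c < "$1" | tr -d ' ')"
    else
        printf '%s\tmissing (rerun with -keeptmp to keep temp files)\n' "$1"
    fi
}

show_size tyrell.inserthistogram

# Print the plotting log if it exists
if [ -e tyrell.plotscript.out ]; then
    cat tyrell.plotscript.out
fi
```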

dluecking commented 3 years ago

Sure thing. tyrell.plotscript.out:

Can't take log of 0 at /home/dlueckin/bin/phyloFlash/phyloFlash_plotscript_svg.pl line 876.

Size of tyrell.inserthistogram is 1.4 kB.

kbseah commented 3 years ago

Hm, that's a new one. Could you post the contents of tyrell.inserthistogram here?

dluecking commented 3 years ago

tyrell.idhistogram.txt

Attached is tyrell.idhistogram. I renamed it (adding txt), so I can upload instead of pasting the long content.

kbseah commented 3 years ago

Thanks for sharing the file. Unfortunately I wasn't able to reproduce the error.

The command I tried was

phyloFlash_plotscript_svg.pl --hist tyrell.idhistogram.txt -title="Mapping identity (%)"

The expected output is a file tyrell.idhistogram.txt.svg, which worked for me (I renamed it to .txt too to attach):

tyrell.idhistogram.txt.svg.txt

I'm sorry that I can't figure out what's going on here.

One thing to try would be to reinstall with Conda. Environment-solving problems are a long-standing issue with Conda. Two things that often work for me: (a) use the --strict-channel-priority option when creating a new environment (also recommended by the Conda developers), or (b) install mamba into your base environment as a drop-in substitute for the Conda environment solver (on top of a standard Conda install), so all you have to change is to type mamba create instead of conda create (Mamba is also now the default in the latest snakemake).

Hope that this helps.
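
Options (a) and (b) could look roughly like this. A sketch only: the environment name `pf`, the channel order, and the `phyloflash` package name on Bioconda are assumptions not stated in this thread, and the `run` wrapper is a hypothetical dry-run helper that prints each command unless `DRY_RUN=0` is set:

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Use either (a) or (b), not both.

# (a) new environment with strict channel priority
run conda create --strict-channel-priority -n pf -c conda-forge -c bioconda phyloflash

# (b) install mamba into the base environment, then use it in place of conda
run conda install -n base -c conda-forge mamba
run mamba create -n pf -c conda-forge -c bioconda phyloflash
```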

dluecking commented 3 years ago

Thanks @kbseah , I will see what I can do with a new install. Thanks for helping out. Should I close the issue?

kbseah commented 3 years ago

Please keep the issue open for now, so I'll have a reminder to fix the first bug you reported ;)

guangliangtimo commented 2 years ago

Hey, I use phyloFlash to profile community composition from metagenome data. However, I encountered the following problem when running "phyloFlash_makedb.pl --remote":

[09:19:49] Checking for required tools.
[09:19:49] Using vsearch found at "/nfs/home/9402_zhangguangliang/.conda/envs/r/bin/vsearch".
[09:19:49] Using bbmask found at "/nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bbmask.sh".
[09:19:49] Using bbduk found at "/nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bbduk.sh".
[09:19:49] Using bowtiebuild found at "/nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bowtie-build".
[09:19:49] Using barrnapHGV found at "/nfs/home/9402_zhangguangliang/.conda/envs/r/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV".
[09:19:49] Using grep found at "/bin/grep".
[09:19:49] Using bbmap found at "/nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bbmap.sh".
[09:19:49] All required tools found.
[09:19:49] downloading latest univec from ncbi
[09:19:49] Connecting to ftp.ncbi.nlm.nih.gov
[09:19:50] Finding /pub/UniVec/UniVec
[09:19:52] Found UniVec (1701925 bytes)
[09:19:56] downloading latest SSU RefNR from www.arb-silva.de
[09:19:56] Connecting to ftp.arb-silva.de
[09:19:57] Finding /current/Exports/*_SSURef_N?99_tax_silva_trunc.fasta.gz
[09:19:59] Found SILVA_138.1_SSURef_NR99_tax_silva_trunc.fasta.gz (195410064 bytes)
[09:20:01] The file you are about to download comes with a license:

   As of release 138 the SILVA databases, its taxonomy, and all
   files provided for
   download are licensed unter Creative Commons Attribution 4.0
   (CC-BY 4.0).

   All data is freely available for academic and commercial use as
   long as SILVA
   is credited as original author and a link to the full license is
   provided.

   The full license is available:
       https://creativecommons.org/licenses/by/4.0/ and
       https://creativecommons.org/licenses/by/4.0/legalcode

[09:20:01] Do you wish to continue downloading under the conditions
[09:20:01] specified above? [yes/no]:
[09:20:21] Verifying MD5...
[09:20:21] File ok
[09:20:21] unpacking SILVA database
[09:20:32] searching for LSU contamination in SSU RefNR
[09:20:32] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV --kingdom bac --threads 256 --evalue 1e-10 --gene lsu --reject 0.01 ./138.1/SILVA_SSU.fasta >tmp.barrnap_hits.bac.gff 2>tmp.barrnap_hits.bac.barrnap.out
[09:20:59] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV --kingdom arch --threads 256 --evalue 1e-10 --gene lsu --reject 0.01 ./138.1/SILVA_SSU.fasta >tmp.barrnap_hits.arch.gff 2>tmp.barrnap_hits.arch.barrnap.out
[09:21:27] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV --kingdom euk --threads 256 --evalue 1e-10 --gene lsu --reject 0.01 ./138.1/SILVA_SSU.fasta >tmp.barrnap_hits.euk.gff 2>tmp.barrnap_hits.euk.barrnap.out
[09:22:03] Removing sequences with potential LSU contamination
[09:22:03] Number of sequences to skip: 120
[09:22:08] masking low entropy regions in SSU RefNR
[09:22:08] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bbmask.sh overwrite=t -Xmx10g threads=256 in=./138.1//SILVA_SSU.noLSU.fasta out=./138.1//SILVA_SSU.noLSU.masked.fasta minkr=4 maxkr=8 mr=t minlen=20 minke=4 maxke=8 fastawrap=0 2>tmp.bbmask_mask_repeats.log
[09:22:33] removing UniVec contamination in SSU RefNR
[09:22:33] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bbduk.sh ref=UniVec overwrite=t -Xmx10g threads=256 fastawrap=0 ktrim=r ow=t minlength=800 mink=11 hdist=1 in=./138.1//SILVA_SSU.noLSU.masked.fasta out=./138.1//SILVA_SSU.noLSU.masked.trimmed.fasta stats=./138.1//SILVA_SSU.noLSU.masked.trimmed.fasta.UniVec_contamination_stats.txt 2>tmp.bbduk_remove_univec.log
[09:22:50] Vsearch v2.5.0+ found, will index database to UDB file
[09:22:50] Indexing ./138.1//SILVA_SSU.noLSU.masked.trimmed.fasta to make UDB file ./138.1//SILVA_SSU.noLSU.masked.trimmed.udb with Vsearch
[09:22:50] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/bin/vsearch --threads 256 --notrunclabels --makeudb_usearch ./138.1//SILVA_SSU.noLSU.masked.trimmed.fasta --output ./138.1//SILVA_SSU.noLSU.masked.trimmed.udb 2>tmp.vsearch_make_udb.log
[09:25:45] clustering database
[09:25:45] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/bin/vsearch --cluster_fast ./138.1/SILVA_SSU.noLSU.masked.trimmed.fasta --id 0.99 --centroids ./138.1/SILVA_SSU.noLSU.masked.trimmed.NR99.fasta --notrunclabels --threads 256
[10:02:12] creating bbmap reference
[10:02:12] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bbmap.sh -Xmx10g threads=256 ref=./138.1/SILVA_SSU.noLSU.masked.trimmed.NR99.fixed.fasta path=./138.1/ 2>tmp.bbmap_index.log
[10:02:52] clustering database
[10:02:52] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/bin/vsearch --cluster_fast ./138.1/SILVA_SSU.noLSU.masked.trimmed.NR99.fasta --id 0.96 --centroids ./138.1/SILVA_SSU.noLSU.masked.trimmed.NR96.fasta --notrunclabels --threads 256
[10:19:38] creating bowtie index (for emirge)
[10:19:38] running subcommand:
       /nfs/home/9402_zhangguangliang/.conda/envs/r/bin/bowtie-build ./138.1/SILVA_SSU.noLSU.masked.trimmed.NR96.fixed.fasta ./138.1/SILVA_SSU.noLSU.masked.trimmed.NR96.fixed.bt -q 2>tmp.bowtiebuild.log
[10:19:38] FATAL: Tool execution failed!.
       Error was '' and return code '32512'
       Check error log file tmp.bowtiebuild.log
       Aborting.
[10:19:38] Saving log to file phyloFlash_log_on_error

What should I do? I have run this command several times. Looking forward to your answer.
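
The return codes in the logs in this thread (512, 6400, 32512) look like raw wait statuses as returned by Perl's `system`, in which the child's exit code sits in the high byte. A minimal sketch for decoding them, assuming that is indeed what phyloFlash prints; the function name `exit_code_of` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: extract the exit code from a raw wait status
# (exit code = status >> 8, i.e. integer division by 256).
exit_code_of() {
    echo $(( $1 / 256 ))
}

for status in 512 6400 32512; do
    echo "status $status -> exit code $(exit_code_of "$status")"
done
# status 512 -> exit code 2
# status 6400 -> exit code 25
# status 32512 -> exit code 127
```

By shell convention, exit code 127 means the command could not be found or started, so for the bowtie-build failure it may be worth checking that `bowtie-build --version` actually runs in the active environment.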

kbseah commented 2 years ago

Hello @guangliangtimo, could you please open this in a separate issue, and attach the log file tmp.bowtiebuild.log?

By the way, if you try repeating the database build command, you don't have to download the files a second time, but you can just specify the paths to the downloaded files as described in section 4.2 here http://hrgv.github.io/phyloFlash/install.html

guangliangtimo commented 2 years ago

Dear Brandon Seah,

Thank you very much for your attention. I have solved this problem by reinstalling phyloFlash.

phyloFlash is a very useful tool for me. Thank you very much.

Guangliang Zhang

Center for Biological Science and Technology, Advanced Institute of Natural Sciences

Beijing Normal University at Zhuhai

