Open fritzthm opened 3 years ago
Hi Fritz, I am also getting this exact command error. Oddly, when I run the test data it works, but when I run my fastq file it fails. Were you able to find a solution?
Dear Juliane, NanoCLUST is now running on my system. I made some changes to the main.nf file, but I no longer remember what they were. In the meantime my system crashed and I reinstalled everything from scratch (I'm now using Scientific Linux, which is basically CentOS 7). Now everything works fine without any changes. Good luck, Fritz
Thanks for getting back to me Fritz! I eventually found a workaround by moving the db folders to a new path. I don't know why that worked, since I triple-checked the original path, but it did.
Kindly, Juliane
I had the same problem and found that the main.nf file puts "/tmp/" in front of your database file paths, so if the databases are in any folder other than /tmp, the blast executable will not find them. To fix it, change lines 434 and 435 of main.nf so that they read "db= params.db" and "taxdb= params.tax" (remove the blast_dir argument); you can then use any other file path for the database locations.
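One way to check this diagnosis is to test whether database files exist at the path you passed and at the /tmp-prefixed path that ends up in the blastn command. The following is a minimal sketch, not part of NanoCLUST: `db_files_present` is a hypothetical helper, and it assumes a single-volume nucleotide database whose files share one prefix (.nhr, .nin, .nsq).

```shell
# Hypothetical helper (not part of NanoCLUST): succeed if the core files of a
# single-volume nucleotide BLAST database exist for the given prefix.
db_files_present() {
    prefix="$1"
    # Assumption: the database is a set of files sharing one prefix,
    # e.g. <prefix>.nhr, <prefix>.nin and <prefix>.nsq.
    for ext in nhr nin nsq; do
        [ -f "$prefix.$ext" ] || return 1
    done
    return 0
}

# Example, using the paths that appear in this thread:
#   db_files_present db/16S_ribosomal_RNA      && echo "found at the path I passed"
#   db_files_present /tmp/db/16S_ribosomal_RNA && echo "found at the /tmp-prefixed path"
```

If only the first check succeeds, the blastn call in the process is probably looking in the wrong place, which would be consistent with the explanation above.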
Thank you tjalfdeboer. That is super helpful!
Hi all,
I'm getting the same error message, but with exit code 255.
I'm executing this command:
$ nextflow run main.nf --reads 'data/fichero_concatenado_02032022.fastq' --db 'db/16S_ribosomal_RNA' --tax 'db/taxdb' -profile docker
Run Name : cranky_northcutt
Reads : data/fichero_concatenado_02032022.fastq
Max Resources : 128 GB memory, 16 cpus, 10d time per job
Container : docker - [:]
Output dir : ./results
Launch dir : /home/fernando/NanoCLUST
Working dir : /home/fernando/NanoCLUST/work
Script dir : /home/fernando/NanoCLUST
User : fernando
Config Profile : docker
----------------------------------------------------
executor > local (31)
[10/898d50] process > QC (1) [100%] 1 of 1 ✔
[3d/80210b] process > fastqc (1) [100%] 1 of 1 ✔
[d3/306646] process > kmer_freqs (1) [100%] 1 of 1 ✔
[28/0eecf7] process > read_clustering (1) [100%] 1 of 1 ✔
[d9/7c54bd] process > split_by_cluster (1) [100%] 1 of 1 ✔
[d2/cd1dd7] process > read_correction (2) [100%] 3 of 3 ✔
[ae/0f7261] process > draft_selection (3) [100%] 3 of 3 ✔
[09/7e330c] process > racon_pass (3) [100%] 3 of 3 ✔
[1e/95b1ae] process > medaka_pass (3) [100%] 3 of 3 ✔
[1f/817d22] process > consensus_classification (2) [100%] 12 of 12, failed: 11, retries: 10
[- ] process > join_results -
[- ] process > get_abundances -
[- ] process > plot_abundances -
[7e/753279] process > output_documentation [100%] 1 of 1 ✔
[56/b64597] NOTE: Process `consensus_classification (1)` terminated with an error exit status (255) -- Execution is retried (1)
[1a/695ac0] NOTE: Process `consensus_classification (2)` terminated with an error exit status (255) -- Execution is retried (1)
[57/5b6f32] NOTE: Process `consensus_classification (1)` terminated with an error exit status (255) -- Execution is retried (2)
[ff/0c2cd6] NOTE: Process `consensus_classification (2)` terminated with an error exit status (255) -- Execution is retried (2)
[9a/0706e0] NOTE: Process `consensus_classification (1)` terminated with an error exit status (255) -- Execution is retried (3)
[e3/87a0e8] NOTE: Process `consensus_classification (2)` terminated with an error exit status (255) -- Execution is retried (3)
[ec/38fca3] NOTE: Process `consensus_classification (1)` terminated with an error exit status (255) -- Execution is retried (4)
[8f/5bfd7e] NOTE: Process `consensus_classification (2)` terminated with an error exit status (255) -- Execution is retried (4)
[73/9fa4f3] NOTE: Process `consensus_classification (1)` terminated with an error exit status (255) -- Execution is retried (5)
[b5/06d84a] NOTE: Process `consensus_classification (2)` terminated with an error exit status (255) -- Execution is retried (5)
Error executing process > 'consensus_classification (1)'
Caused by:
Process `consensus_classification (1)` terminated with an error exit status (255)
Command executed:
export BLASTDB=
export BLASTDB=$BLASTDB:/tmp/db/taxdb/
blastn -query consensus.fasta -db /tmp/db/16S_ribosomal_RNA -task blastn -dust no -outfmt "10 sscinames staxids evalue length pident" -evalue 11 -max_hsps 50 -max_target_seqs 5 | sed 's/,/;/g' > consensus_classification.csv
#DECIDE FINAL CLASSIFFICATION
cat 2_draft.log > 2_blast.log
echo -n ";" >> 2_blast.log
BLAST_OUT=$(cut -d";" -f1,2,4,5 consensus_classification.csv | head -n1)
echo $BLAST_OUT >> 2_blast.log
Command exit status:
255
Command output:
(empty)
Command error:
Error: NCBI C++ Exception:
T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/serial/objistrasnb.cpp", line 499: Error: (CSerialException::eOverflow) byte 98: overflow error ( at [].[].gi)
T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/serial/member.cpp", line 768: Error: (CSerialException::eOverflow) ncbi::CMemberInfoFunctions::ReadWithSetFlagMember() - error while reading seqid ( at Blast-def-line-set.[].[].seqid.[].[].gi)
The pipeline works perfectly with some of the sequences in the fastq, but when I add the final sequences from my samples to my .fastq, I get this error.
I've tried modifying the main.nf file as @tjalfdeboer suggested, but I still can't get my final result.
Has anyone had this problem, and can you help me solve it?
Thanks a lot in advance
Hi fernanarr, I'm running into the same problem with my datasets. Did you find a workaround already? Many thanks, Robert
Hi @niederro, not yet. We are still looking for it.
Hello -- I am also stuck at the consensus_classification step. If I understand the output correctly, the percentage shown doesn't mean much. It looks to me as though 100% of the blastn processes fail.
The error message is: "NOTE: Process consensus_classification (###) terminated with an error exit status (2) -- Execution is retried (1)". Is this error from blastn, from docker or from Nextflow?
Scanning some of the consensus_classification.csv files shows them to be empty.
The dataset I'm using to test was downloaded from SRA. According to the associated publication, the data were analyzed using QIIME2, with vsearch used for comparison against the SILVA database.
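To see which tasks left an empty consensus_classification.csv, and with what exit status, the Nextflow work directory can be scanned directly. This is a rough sketch under stated assumptions: `list_empty_classifications` is a hypothetical helper, and it relies on Nextflow keeping each task's exit status in a `.exitcode` file next to the task's outputs (the captured stderr sits alongside it in `.command.err`).

```shell
# Hypothetical helper: print "<task dir><TAB><exit code>" for every task
# directory under a Nextflow work directory whose
# consensus_classification.csv exists but is empty.
list_empty_classifications() {
    workdir="${1:-work}"
    find "$workdir" -type f -name consensus_classification.csv | sort |
    while IFS= read -r csv; do
        # Skip files that have content.
        [ -s "$csv" ] && continue
        task_dir=$(dirname "$csv")
        code="?"
        # Assumption: Nextflow wrote the task's exit status to .exitcode.
        [ -f "$task_dir/.exitcode" ] && code=$(cat "$task_dir/.exitcode")
        printf '%s\t%s\n' "$task_dir" "$code"
    done
}

# Example:
#   list_empty_classifications work
#   cat <task dir>/.command.err   # stderr captured for that task
```

Reading `.command.err` for one of the listed directories should show whether the message comes from blastn itself, as in the NCBI C++ exception quoted earlier in this thread.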
Hi Fernanarr -- I've been getting the same Blast error. After manually running the blast command using the consensus.fasta file created for a few of the 16 fails, I can create a "#_blast.log" file from the "#_draft.log" file. Unfortunately, resuming the workflow still results in 16 fails and the run stops without moving to the next process.
Testing a few other things changes the total number of consensus.fasta files (ranging from 505 to 518) but the workflow stops at 16 fails plus 5 retries each.
Would it be possible to add a line that stores the failures in a file and moves on with the process?
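The "record the failure and move on" idea could look roughly like the wrapper below. This is only a sketch of the general shell pattern, not a tested change to main.nf: `run_or_record` and the log file name are hypothetical, and the wrapper covers a single command rather than the blastn-to-sed pipeline used in the process.

```shell
# Hypothetical wrapper: run a command; if it fails, append a label and the
# exit status to a log file, then return success so the caller keeps going.
run_or_record() {
    label="$1"
    failed_log="$2"
    shift 2
    if "$@"; then
        return 0
    else
        status=$?
        printf '%s\t%s\n' "$label" "$status" >> "$failed_log"
    fi
    return 0
}

# Example (hypothetical label and log file):
#   run_or_record cluster_2 failed_clusters.tsv \
#       blastn -query consensus.fasta -db "$DB" -out hits.csv
```

Nextflow also has an `errorStrategy` process directive that can ignore failed tasks; whether the downstream NanoCLUST steps cope with missing classifications is not something this thread establishes.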
Hi DanBeaton, I think you are completely right. For most of my files, blast delivered results; for some, however, it didn't, which potentially caused the failure. Did you manage to find a workaround already? Best, Robert
Hi @niederro -- I was able to get past the error. I have the blast executable already installed on my computer (version 2.11.0+-x64-macos) and created the 16S database directly from the executable.
Instead of pointing the nextflow run to the database created as part of the NanoCLUST setup, I point it to this database. With this change, the run got past the error and moved on to the next parts of the process. If it had not gotten past the error, I was going to try updating the executable to the current version and re-creating the database.
I find myself wondering if there is a mismatch between the database in the ftp instructions and the blast version in the environment, or that the ftp'd database is somehow corrupt. Both seem to be unlikely given that some pass and some fail.
Hi @DanBeaton, your solution is very nice. However, I do not understand what you mean by "created the 16S database directly from the executable". How did you do this? Did you download all the FASTA files for the 16S rRNA and then create the 16S rRNA database with the BLAST executable makeblastdb?
Hi @kazubado33, I followed the instructions from the link ... https://www.ncbi.nlm.nih.gov/books/NBK52640/ ... to install and configure blast.
The instructions also cover how to download databases into a directory. The 16S_ribosomal_RNA database is given as the example ---> perl ../bin/update_blastdb.pl --passive --decompress 16S_ribosomal_RNA.
This link ... https://www.ncbi.nlm.nih.gov/books/NBK569850/ ... provides instructions on viewing a list of the other databases.
Hi @DanBeaton, I did exactly the same, but NanoCLUST stopped with an error about a non-indexed database. Using this command: perl ../bin/update_blastdb.pl --passive --decompress 16S_ribosomal_RNA produced exactly the same files as those that come with the NanoCLUST workflow, and these seem to produce the other error. Any idea what is wrong, or what you did differently? Thanks in advance for any help.
@DanBeaton did you also change something in the main.nf file?
Hi Robert @niederro. To answer your question -- No I did not make changes to the main.nf file.
When running via docker on my Mac, with a blast database created via blast V2.11.0, the updated database works great.
When I installed and ran the same data files on a linux compute cluster, which meant using conda instead of docker or other options, and which has the most current blast, V2.13.0, plus a database created from this version, the consensus_classification error returned, but only for one of the files, so I could isolate it and try a few things.
With the help of one of the compute cluster managers -- who did make changes to the main.nf file by adding echo statements to the consensus_classification process (output provided below) -- the error in the consensus_classification process was isolated to the blastn command, just as the error message indicates!
I then tried running without stating a database, so that the --remote option was used. All but 7 of the consensus.fasta files failed.
On viewing the consensus_classification conda environment (in the conda_env folder), the listed blast version is 2.10.1. The most current version of blast in bioconda is 2.12.0. So I changed the blast version in the environment.yaml file from blast 2.10.1 to blast 2.12.0 and re-ran the file. The process went to completion with no errors.
So, I think the outcome is: on my Mac with blast 2.11.0, the docker's blast version used in the process 'corresponds' to the version used to create the blastdb. While on the linux compute cluster, the blast version used in the process now also 'corresponds' to the version used to create the blastdb.
I guess the take home message is to match the blast version used to create the blast db with the blast version used in the process.
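The version comparison behind this take-home message can be scripted. The sketch below is an illustration only: both helper names are hypothetical, and it assumes the first line of `blastn -version` (or `makeblastdb -version`) has the form `blastn: 2.12.0+`.

```shell
# Hypothetical helper: read the output of `blastn -version` (or
# `makeblastdb -version`) on stdin and print only the version number.
blast_version_from_output() {
    # Assumption: the first line looks like "blastn: 2.12.0+".
    sed -n '1s/^[a-z_]*: *\([0-9][0-9.]*\).*$/\1/p'
}

# Hypothetical helper: succeed if two version strings are identical.
same_blast_version() {
    [ "$1" = "$2" ]
}

# Example:
#   pipeline_v=$(blastn -version | blast_version_from_output)      # inside the pipeline's environment
#   builder_v=$(makeblastdb -version | blast_version_from_output)  # where the database was prepared
#   same_blast_version "$pipeline_v" "$builder_v" || echo "BLAST versions differ"
```

A mismatch reported this way would not prove that the versions are the cause, but it flags the situation described above.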
I hope this helps :-)
#################################################################
Error executing process > 'consensus_classification (21)'
Caused by:
Process `consensus_classification (21)` terminated with an error exit status (255)
Command executed:
export BLASTDB=
echo "2"
export BLASTDB=$BLASTDB:/athena/home/beatond/ncbi/blastdb/
echo "BLASTDB:" $BLASTDB
echo "taxdb: " /athena/home/beatond/ncbi/blastdb/
which blastn
echo "2a"
blastn -query consensus.fasta -db /athena/home/beatond/ncbi/blastdb/16S_ribosomal_RNA -task blastn -dust no -outfmt "10 sscinames staxids evalue length pident" -evalue 11 -max_hsps 50 -max_target_seqs 5 | sed 's/,/;/g' > consensus_classification.csv
echo "3"
pwd
cat 101_draft.log > 101_blast.log
echo "4"
echo -n ";" >> 101_blast.log
echo "5"
BLAST_OUT=$(cut -d";" -f1,2,4,5 consensus_classification.csv | head -n1)
echo "6"
echo $BLAST_OUT >> 101_blast.log
echo "7"
Command exit status:
255
Command output:
2
BLASTDB: :/athena/home/beatond/ncbi/blastdb/
taxdb: /athena/home/beatond/ncbi/blastdb/
/athena/home/beatond/Tools/work/conda/consensus_classification-8c200bed21bbad7a3d574f93a7a33902/bin/blastn
2a
Command error:
ps: /athena/opt/bioconda/2021.05/lib/libuuid.so.1: no version information available (required by /lib64/libblkid.so.1)
ps: /athena/opt/bioconda/2021.05/lib/libuuid.so.1: no version information available (required by /lib64/libblkid.so.1)
ps: /athena/opt/bioconda/2021.05/lib/libuuid.so.1: no version information available (required by /lib64/libblkid.so.1)
ps: /athena/opt/bioconda/2021.05/lib/libuuid.so.1: no version information available (required by /lib64/libblkid.so.1)
Error: NCBI C++ Exception:
T0 "/opt/conda/conda-bld/blast_1607337341665/work/blast/c++/src/serial/objistrasnb.cpp", line 499: Error: (CSerialException::eOverflow) byte 92: overflow error ( at [].[].gi)
T0 "/opt/conda/conda-bld/blast_1607337341665/work/blast/c++/src/serial/member.cpp", line 768: Error: (CSerialException::eOverflow) ncbi::CMemberInfoFunctions::ReadWithSetFlagMember() - error while reading seqid ( at Blast-def-line-set.[].[].seqid.[].[].gi)
Hello, I'm trying to run NanoCLUST with my 16S sequence data on a Linux CentOS 7 machine. Even with the test data, I get an error on reaching the 'consensus_classification' module:
When I use the command:
nextflow run main.nf -profile test,docker
I get the following terminal output:
Many thanks, Fritz