jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Error step 15 DAS #12

Closed kassammo closed 5 years ago

kassammo commented 5 years ago

Hello

I am getting an error at step 15:

-bash-4.2$ perl /home//metagenomics/2_Scripts/SqueezeMeta/scripts/restart.pl SqCoa
[0 seconds]: STEP15 -> DAS_TOOL MERGING: 15.dastool.pl
mkdir: cannot create directory ‘/home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/DAS’: File exists
which: no usearch in (/home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin:/usr/lib64/qt-3.3/bin:/usr/lib64/mpich/bin:/usr/local/cuda/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home//.local/bin:/home//bin:/usr/lib64/qt-3.3/bin:/usr/lib64/mpich/bin:/usr/local/cuda/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home//.local/bin:/home//bin:/opt/ncbi-blast-2.7.1+/bin/:::::)
/home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin/DAS_Tool/DAS_Tool: line 241: usearch: command not found
mv: cannot stat ‘/home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/03.SqCoa.faa.scg’: No such file or directory
mv: cannot stat ‘/home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/03.SqCoa.faa.scg’: No such file or directory
rm: cannot remove ‘/home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/03.SqCoa.faa.all.b6’: No such file or directory
Error running command: PATH=/home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin:$PATH /home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin/DAS_Tool/DAS_Tool -i /home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/DAS/maxbin.table,/home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/DAS/metabat2.table -l maxbin,metabat2 -c /home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/01.SqCoa.fasta --write_bins 1 --proteins /home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/03.SqCoa.faa --score_threshold 0.25 --search_engine diamond -t 38 -o /home//metagenomics/3_Output/metaG/SQUEEZMETA/SqCoa/results/DAS/SqCoa --db_directory /home//metagenomics/1_Input/metaG/Squeezmeta/db at /home//metagenomics/2_Scripts/SqueezeMeta/scripts/../scripts/15.dastool.pl line 63.
Stopping in STEP15 -> 15.dastool.pl

In the syslog I have this error:

identifying single copy genes using diamond version 0.9.22
single copy gene prediction using diamond failed. Aborting
identifying single copy genes using diamond version 0.9.22
single copy gene prediction using diamond failed. Aborting
identifying single copy genes using diamond version 0.9.22
single copy gene prediction using diamond failed. Aborting
identifying single copy genes using diamond version 0.9.22
single copy gene prediction using diamond failed. Aborting

Thanks

fpusan commented 5 years ago

Hi! What version of SqueezeMeta are you using? Did it work with the test dataset?

kassammo commented 5 years ago

Hi,

I am using SqueezeMeta v0.4.2, Jan 2019.

I will try to run the analysis with the test data to see if I get the same error.

thanks

fpusan commented 5 years ago

If it doesn't work with the test data, try removing v0.4.2, installing v0.4.3 and running the .../preparing_databases/download_databases.pl script.

kassammo commented 5 years ago

Hello,

I tried with the new version, v0.4.3, and with the compiled database. I am getting the same error.

which: no usearch in (/home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin:/usr/lib64/qt-3.3/bin:/usr/lib64/mpich/bin:/usr/local/cuda/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home//.local/bin:/home//bin:/usr/lib64/qt-3.3/bin:/usr/lib64/mpich/bin:/usr/local/cuda/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home//.local/bin:/home//bin:/opt/ncbi-blast-2.7.1+/bin/:::::)
/home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin/DAS_Tool/DAS_Tool: line 241: usearch: command not found
mv: cannot stat ‘/home//testSQNEW/SqCoa_Julius2/results/03.SqCoa_Julius2.faa.scg’: No such file or directory
mv: cannot stat ‘/home//testSQNEW/SqCoa_Julius2/results/03.SqCoa_Julius2.faa.scg’: No such file or directory
rm: cannot remove ‘/home//testSQNEW/SqCoa_Julius2/results/03.SqCoa_Julius2.faa.all.b6’: No such file or directory
Error running command: PATH=/home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin:$PATH /home//metagenomics/2_Scripts/SqueezeMeta/scripts/../bin/DAS_Tool/DAS_Tool -i /home//testSQNEW/SqCoa_Julius2/results/DAS/maxbin.table,/home//testSQNEW/SqCoa_Julius2/results/DAS/metabat2.table -l maxbin,metabat2 -c /home//testSQNEW/SqCoa_Julius2/results/01.SqCoa_Julius2.fasta --write_bins 1 --proteins /home//testSQNEW/SqCoa_Julius2/results/03.SqCoa_Julius2.faa --score_threshold 0.25 --search_engine diamond -t 39 -o /home//testSQNEW/SqCoa_Julius2/results/DAS/SqCoa_Julius2 --db_directory /home//metagenomics/1_Input/metaG/Squeezmeta/db at /home//metagenomics/2_Scripts/SqueezeMeta/scripts/../scripts/15.dastool.pl line 63.
Stopping in STEP15 -> 15.dastool.pl

In the log file I have

identifying single copy genes using diamond version 0.9.22
single copy gene prediction using diamond failed. Aborting

thanks

fpusan commented 5 years ago

I see that the project name is SqCoa_Julius2. I assume that these are your own samples. Is this also happening with the test data?

Also can you just run the DAS_tool script?

/home/metagenomics/2_Scripts/SqueezeMeta/scripts/15.dastool.pl SqCoa_Julius2

It should be a bit more verbose on what could be wrong.

pbaCamille commented 5 years ago

Hello, I am also getting the same error at this step with the test data. Did you solve the issue? Or are you still working on it?

Running DAS Tool for maxbin,metabat2
PATH=../../scripts/../bin:$PATH ../../scripts/../bin/DAS_Tool/DAS_Tool -i /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/maxbin.table,/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/metabat2.table -l maxbin,metabat2 -c /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/01.Hadza.fasta --write_bins 1 --proteins /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa --score_threshold 0.25 --search_engine diamond -t 12 -o /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/Hadza --db_directory /sandbox/users/c/SqueezeMeta/databases/db
which: no usearch in (../../scripts/../bin:/sandbox/users/c/miniconda3/envs/squeeze/bin:/sandbox/users/c/miniconda3/bin:/opt/sge/bin:/opt/sge/bin/lx-amd64:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/puppetlabs/bin:/sandbox/users/c/.local/bin:/sandbox/users/c/bin:::::)
../../scripts/../bin/DAS_Tool/DAS_Tool: line 241: usearch: command not found
Running DAS Tool using 12 threads.
identifying single copy genes using diamond version 0.9.22
mv: cannot stat ‘/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa.scg’: No such file or directory
mv: cannot stat ‘/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa.scg’: No such file or directory
rm: cannot remove ‘/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa.all.b6’: No such file or directory
single copy gene prediction using diamond failed. Aborting
Error running command: PATH=../../scripts/../bin:$PATH ../../scripts/../bin/DAS_Tool/DAS_Tool -i /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/maxbin.table,/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/metabat2.table -l maxbin,metabat2 -c /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/01.Hadza.fasta --write_bins 1 --proteins /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa --score_threshold 0.25 --search_engine diamond -t 12 -o /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/Hadza --db_directory /sandbox/users/c/SqueezeMeta/databases/db at ../../scripts/15.dastool.pl line 63.

fpusan commented 5 years ago

Dear Camille,

Can you please try to run pullseq in the SqueezeMeta/bin directory? Just executing it with no options, or trying to get the help menu, should be enough. Does it ask for any missing library? Also what is your OS?
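A quick way to answer the missing-library question without reading error messages by eye is to ask `ldd` which shared libraries the bundled binary cannot resolve. The helper below is a minimal sketch, not part of SqueezeMeta: the function name `missing_libs` is made up for illustration, and the installation path in the usage comment is a placeholder. It reads `ldd`-style output on standard input and prints only the unresolved library names.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd-style output on stdin, so the parsing can be checked on its own.
# (missing_libs is a hypothetical helper name, not a SqueezeMeta command.)
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical usage; replace the placeholder path with your installation:
#   ldd /path/to/SqueezeMeta/bin/pullseq | missing_libs
```

It is worth calling the binary by its explicit path inside SqueezeMeta/bin, because a bare `pullseq` resolves to whichever copy comes first in `PATH`, which may be a different build (for example one installed in a conda environment).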

pbaCamille commented 5 years ago

No missing library. I'm working on a remote server, so I built a conda env. I don't have sudo access rights. Maybe it's linked to this step of the installation?

Patch libpcre so that the Ubuntu-compiled pullseq works in CentOS7.

sudo link /usr/lib64/libpcre.so.1 /usr/lib64/libpcre.so.3

$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

$ pullseq
pullseq - a bioinformatics tool for manipulating fasta and fastq files
Version: 1.0.2 Name lookup method: UTHASH
(Written by bct - copyright 2012-2015)

Usage:
pullseq -i <input fasta/fastq file> -n
pullseq -i <input fasta/fastq file> -m
pullseq -i <input fasta/fastq file> -g
pullseq -i <input fasta/fastq file> -m -a
pullseq -i <input fasta/fastq file> -t
cat | pullseq -i <input fasta/fastq file> -N

Options:
-i, --input, Input fasta/fastq file (required)
-n, --names, File of header id names to search for
-N, --names_stdin, Use STDIN for header id names
-g, --regex, Regular expression to match (PERL compatible; always case-insensitive)
-m, --min, Minimum sequence length
-a, --max, Maximum sequence length
-l, --length, Sequence characters per line (default 50)
-c, --convert, Convert input to fastq/fasta (e.g. if input is fastq, output will be fasta)
-q, --quality, ASCII code to use for fasta->fastq quality conversions
-e, --excluded, Exclude the header id names in the list (-n)
-t, --count, Just count the possible output, but don't write it
-h, --help, Display this help and exit
-v, --verbose, Print extra details during the run
--version, Output version information and exit
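On the sudo question: the quoted installation step writes into /usr/lib64, which normally requires root. One possible workaround without root is to create the same alias in a user-writable directory and add that directory to `LD_LIBRARY_PATH`, which the dynamic loader searches. This is an untested sketch under those assumptions and not something the maintainers suggest in this thread. The function name `alias_lib` and the `$HOME/lib` directory are made up for illustration.

```shell
# Create a user-level alias (symlink) for a shared library.
#   $1: existing library file
#   $2: file name the binary asks for
#   $3: user-writable directory that will hold the alias
# (alias_lib is a hypothetical helper name.)
alias_lib() {
    mkdir -p "$3" && ln -sf "$1" "$3/$2"
}

# Hypothetical usage, mirroring the system-wide step quoted above:
#   alias_lib /usr/lib64/libpcre.so.1 libpcre.so.3 "$HOME/lib"
#   export LD_LIBRARY_PATH="$HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The `export` only affects the current shell and its children, so it would need to be set in the same session (or job script) that runs the pipeline.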

fpusan commented 5 years ago

That might be it, although I was expecting it to complain about missing libraries. We've been having trouble compiling pullseq binaries that run on both Ubuntu and CentOS, and that's usually what causes DAS_Tool issues.

Can you replace the DAS_Tool script in SqueezeMeta/bin/DAS_Tool/ with the one I'm providing here?

Then re-run the DAS_Tool step with

/scripts/15.dastool.pl Hadza

and paste the error log. This should hopefully be a little more informative.

[DAS_Tool.zip](https://github.com/jtamames/SqueezeMeta/files/2956263/DAS_Tool.zip)

pbaCamille commented 5 years ago

OK, I see something about the library now:

pullseq: error while loading shared libraries: libpcre.so.3: cannot open shared object file: No such file or directory

Error log:

mkdir: cannot create directory ‘/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS’: File exists
which: no usearch in (../../scripts/../bin:/sandbox/users/c/miniconda3/envs/squeeze/bin:/sandbox/users/c/miniconda3/bin:/opt/sge/bin:/opt/sge/bin/lx-amd64:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/puppetlabs/bin:/sandbox/users/c/.local/bin:/sandbox/users/c/bin:::::)
../../scripts/../bin/DAS_Tool/DAS_Tool: line 241: usearch: command not found
/sandbox/users/c/SqueezeMeta/bin/DAS_Tool/src/scg_blank_diamond.rb:39: warning: Insecure world writable dir /sandbox in PATH, mode 040777
pullseq: error while loading shared libraries: libpcre.so.3: cannot open shared object file: No such file or directory
Error: Error detecting input file format. First line seems to be blank.
verifying blast did not work
mv: cannot stat ‘/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa.scg’: No such file or directory
/sandbox/users/c/SqueezeMeta/bin/DAS_Tool/src/scg_blank_diamond.rb:39: warning: Insecure world writable dir /sandbox in PATH, mode 040777
pullseq: error while loading shared libraries: libpcre.so.3: cannot open shared object file: No such file or directory
Error: Error detecting input file format. First line seems to be blank.
verifying blast did not work
mv: cannot stat ‘/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa.scg’: No such file or directory
rm: cannot remove ‘/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa.all.b6’: No such file or directory
Error running command: PATH=../../scripts/../bin:$PATH ../../scripts/../bin/DAS_Tool/DAS_Tool -i /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/maxbin.table,/sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/metabat2.table -l maxbin,metabat2 -c /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/01.Hadza.fasta --write_bins 1 --proteins /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/03.Hadza.faa --score_threshold 0.25 --search_engine diamond -t 12 -o /sandbox/users/c/SqueezeMeta/databases/test/Hadza/results/DAS/Hadza --db_directory /sandbox/users/c/SqueezeMeta/databases/db at ../../scripts/15.dastool.pl line 63.
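In a log like this one, most lines (the `mv`/`rm` failures, the blank-input error) appear to be knock-on effects, while the dynamic loader message names the binary and library that actually failed. A small filter can surface just those lines. This is an illustrative sketch: `loader_errors` is a made-up helper name, and the log file name in the usage comment is a placeholder.

```shell
# Print each distinct dynamic-loader failure found in a log read from stdin.
# (loader_errors is a hypothetical helper name.)
loader_errors() {
    grep 'error while loading shared libraries' | sort -u
}

# Hypothetical usage; dastool.log is a placeholder file name:
#   /path/to/SqueezeMeta/scripts/15.dastool.pl Hadza 2>&1 | tee dastool.log
#   loader_errors < dastool.log
```

Redirecting stderr with `2>&1` matters here, since messages of this kind are normally written to standard error rather than standard output.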

fpusan commented 5 years ago

Nice! pullseq is definitely to blame. Please try downloading the binaries from https://github.com/bcthomas/pullseq/releases/download/1.0.2/pullseq_v1.0.2_linux64.zip and putting pullseq in the SqueezeMeta/bin folder (replacing the one we provided).

If that doesn't work, you can try compiling pullseq from source. Apparently there's also a pullseq package in conda (I'm not familiar with conda, but I assume it will produce a binary that works in your particular environment...).
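Swapping the binary can be done so that the original is kept as a backup in case it needs to be restored. The sketch below assumes the release archive has already been downloaded and unzipped. `replace_binary` is a made-up helper name, and the paths in the usage comment are placeholders, since the layout of the extracted archive is not shown in this thread.

```shell
# Replace a bundled binary with a new build, keeping the old one as <name>.bak.
#   $1: path to the new binary
#   $2: target bin directory
# (replace_binary is a hypothetical helper name.)
replace_binary() {
    name=$(basename "$1")
    if [ -f "$2/$name" ]; then
        mv "$2/$name" "$2/$name.bak" || return 1
    fi
    cp "$1" "$2/$name" && chmod +x "$2/$name"
}

# Hypothetical usage; both paths are placeholders:
#   replace_binary ./pullseq /path/to/SqueezeMeta/bin
#   /path/to/SqueezeMeta/bin/pullseq --version
```

Running the new binary by explicit path with `--version` (an option listed in the help output earlier in the thread) is a cheap check that it loads before re-running step 15.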

fpusan commented 5 years ago

Just pushed commit 30ed6d12f98d4ab54ed85199bbc64434a5872854, which should fix this issue.