bxlab / metaWRAP

MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis
MIT License

metawrap v1.2.2, binning with maxbin2, perl5 library issue #207

Open ganiatgithub opened 5 years ago

ganiatgithub commented 5 years ago

Hi,

I realize this issue has been raised a few times now, but I haven't found an easy fix. I'm using metawrap 1.2.2 and this is my first binning run: metabat2 works fine, but the binning module stops at maxbin2. How would you suggest fixing it? Interestingly, with metawrap 1.2.1 I did not hit this perl5 library issue in binning, only in annotation (#193).

Here is the log:

metawrap binning -a /30days/uqgni1/16_chen_metag/assembly_out/S1/S1_final_assembly.fasta -o /30days/uqgni1/16_chen_metag/binning_out --metabat2 --maxbin2 --concoct /30days/uqgni1/16_chen_metag/read_qc_out/S1/S1_final_pure_reads_1.fastq /30days/uqgni1/16_chen_metag/read_qc_out/S1/S1_final_pure_reads_2.fastq --run-checkm -m 250 -t 12

------------------------------------------------------------------------------------------------------------------------
-----                                           Entered read type: paired                                          -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                  1 forward and 1 reverse read files detected                                 -----
------------------------------------------------------------------------------------------------------------------------

########################################################################################################################
#####                                     ALIGNING READS TO MAKE COVERAGE FILES                                    #####
########################################################################################################################

------------------------------------------------------------------------------------------------------------------------
-----                                         making copy of assembly file                                         -----
-----                     /30days/uqgni1/16_chen_metag/assembly_out/S1/S1_final_assembly.fasta                     -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                            Indexing assembly file                                            -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----             Aligning /30days/uqgni1/16_chen_metag/read_qc_out/S1/S1_final_pure_reads_1.fastq and             -----
-----           /30days/uqgni1/16_chen_metag/read_qc_out/S1/S1_final_pure_reads_2.fastq back to assembly           -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                Sorting the S1_final_pure_reads alignment file                                -----
------------------------------------------------------------------------------------------------------------------------

########################################################################################################################
#####                                               RUNNING METABAT2                                               #####
########################################################################################################################

------------------------------------------------------------------------------------------------------------------------
-----                                          making contig depth file...                                         -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                       Starting binning with metaBAT2...                                      -----
------------------------------------------------------------------------------------------------------------------------

MetaBAT 2 (v2.12.1) using minContig 1500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, and maxEdges 200. 
118 bins (345000522 bases in total) formed.

------------------------------------------------------------------------------------------------------------------------
-----                              metaBAT2 finished successfully, and found 119 bins!                             -----
------------------------------------------------------------------------------------------------------------------------

########################################################################################################################
#####                                                RUNNING MAXBIN2                                               #####
########################################################################################################################

------------------------------------------------------------------------------------------------------------------------
-----                                          making contig depth file...                                         -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                    split master contig depth file into individual files for maxbin2 input                    -----
------------------------------------------------------------------------------------------------------------------------

processing S1_final_pure_reads.bam depth file...

------------------------------------------------------------------------------------------------------------------------
-----             looks like our default perl libraries are not the conda ones. Manually setting perl5             -----
-----                                              library directory                                               -----
------------------------------------------------------------------------------------------------------------------------

metawrap path: /30days/uqgni1/tools/metaWRAP/bin/metawrap

************************************************************************************************************************
*****              /30days/uqgni1/tools/metaWRAP/lib/perl5/site_perl/5.22.0 does not exixt. Cannot set             *****
*****                                  manual path to perl5 libraries. Exiting...                                  *****
************************************************************************************************************************

Many thanks!

ursky commented 5 years ago

The program quits because simply running run_MaxBin.pl fails, most likely because you are missing a library. It then tries to manually link your conda libraries in lib/perl5/site_perl/5.22.0, but since your metawrap location is not in a conda directory, it can't find them. Try running run_MaxBin.pl on its own to see which error it throws and go from there.
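The path lookup described here can be reproduced in isolation. This is a minimal sketch, assuming binning.sh takes the location of the metawrap executable and strips its last two path components before appending lib/perl5/site_perl/5.22.0. The name derive_prefix is hypothetical and not part of metaWRAP, and the script's trailing-slash check is omitted.

```shell
#!/usr/bin/env bash
# derive_prefix: hypothetical helper that mimics the two "%/*" expansions
# applied to $(which metawrap); it is not a metaWRAP function.
derive_prefix() {
    local p="$1"
    p="${p%/*}"    # drop the executable name: .../bin/metawrap -> .../bin
    p="${p%/*}"    # drop the bin directory:   .../bin          -> ...
    printf '%s\n' "$p"
}

# Path reported as "metawrap path:" in the log above
prefix=$(derive_prefix /30days/uqgni1/tools/metaWRAP/bin/metawrap)

# Directory the script then expects to exist
echo "${prefix}/lib/perl5/site_perl/5.22.0"
```

For the cloned location in the log this produces the directory named in the error message, which sits under the metaWRAP checkout rather than under the Miniconda3 prefix.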

ganiatgithub commented 5 years ago

Hey,

The script run_MaxBin.pl is at /30days/uqgni1/tools/Miniconda3/bin. I get this error when I run it:

Can't locate HTTP/Status.pm in @INC (you may need to install the HTTP::Status module) (@INC contains: /30days/uqgni1/tools/Miniconda3/lib/perl5/site_perl/5.22.0//x86_64-linux-thread-multi /30days/uqgni1/tools/Miniconda3/lib/perl5/site_perl/5.22.0/ /gpfs1/scratch/30days/uqgni1/tools/Miniconda3/lib/site_perl/5.26.2/x86_64-linux-thread-multi /gpfs1/scratch/30days/uqgni1/tools/Miniconda3/lib/site_perl/5.26.2 /gpfs1/scratch/30days/uqgni1/tools/Miniconda3/lib/5.26.2/x86_64-linux-thread-multi /gpfs1/scratch/30days/uqgni1/tools/Miniconda3/lib/5.26.2 .) at /gpfs1/scratch/30days/uqgni1/tools/Miniconda3/lib/site_perl/5.26.2/LWP/Simple.pm line 15.
BEGIN failed--compilation aborted at /gpfs1/scratch/30days/uqgni1/tools/Miniconda3/lib/site_perl/5.26.2/LWP/Simple.pm line 15.
Compilation failed in require at ./run_MaxBin.pl line 4.
BEGIN failed--compilation aborted at ./run_MaxBin.pl line 4.

Based on what I understand from #132, I tried:

export PERL5LIB=/30days/uqgni1/tools/Miniconda3/lib/perl5/site_perl/5.22.0/

run_MaxBin.pl still fails with the same error.

I also tried modifying binning.sh (based on #21). The only change was replacing conda_path=$(which metawrap) with conda_path=$(which conda), but that doesn't help.

This is the modified part:

    run_MaxBin.pl
    if [[ $? -ne 0 ]]; then
        comm "looks like our default perl libraries are not the conda ones. Manually setting perl5 library directory"
        conda_path=$(which conda)
        echo "metawrap path: $conda_path"
        conda_path=${conda_path%/*}
        if [ $(echo -n $conda_path | tail -c 1) = "/" ]; then conda_path=${conda_path%/*}; fi
        conda_path=${conda_path%/*}
        if [ ! -d ${conda_path}/lib/perl5/site_perl/5.22.0 ]; then 
            error "${conda_path}/lib/perl5/site_perl/5.22.0 does not exixt. Cannot set manual path to perl5 libraries. Exiting..."
        fi

Any suggestions?

ursky commented 5 years ago

It is throwing a very specific error: you are missing a Perl library. You could just fix that directly: cpan install HTTP::Status.

ganiatgithub commented 5 years ago

Hi,

cpan install HTTP::Status worked. However, I now get the following message when running maxbin2:

MaxBin 2.2.6
Input contig: /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/assembly.fa
Thread: 1
Min contig length: 1000
out header: /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin
Located abundance file [/30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/mb2_S1_final_pure_reads.txt]
Searching against 107 marker genes to find starting seed contigs for [/30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/assembly.fa]...
Running FragGeneScan....
Running HMMER hmmsearch....
Done data collection. Running MaxBin...
Command: /gpfs1/scratch/30days/uqgni1/tools/Miniconda3/opt/MaxBin-2.2.6/src/MaxBin -fasta /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin.contig.tmp -abund /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin.contig.tmp.abund1 -seed /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin.seed -out /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin -min_contig_length 1000
Failed to get Abundance information for contig [NODE_1_length_1312426_cov_457.620833] in file [/30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin.contig.tmp.abund1]
Error encountered while running core MaxBin program. Error recorded in /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin.log.
Program Stop.

When I look into the bin.log file, it says: Failed to get Abundance information for contig [NODE_1_length_1312426_cov_457.620833] in file [/30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins/work_files/maxbin2_out/bin.contig.tmp.abund1]

I guess this is a separate issue from the perl5 library?
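One way to narrow down the abundance error is to list the contigs that appear in the FASTA but have no row in the abundance file. This is a minimal diagnostic sketch, not part of MaxBin or metaWRAP. The name missing_abundance is hypothetical, and it assumes the abundance file is tab-separated with the contig ID in the first column.

```shell
#!/usr/bin/env bash
# missing_abundance: hypothetical helper for illustration.
# Usage: missing_abundance <contigs.fasta> <abundance.txt>
# Prints each contig ID from the FASTA that has no row in the abundance file.
# Assumption: the abundance file is tab-separated, contig ID in column 1.
missing_abundance() {
    local fasta="$1" abund="$2"
    awk -F'\t' '
        FILENAME == ARGV[1] { seen[$1] = 1; next }  # first file: abundance rows
        /^>/ {
            id = substr($0, 2)                      # strip the leading ">"
            sub(/[ \t].*$/, "", id)                 # keep the ID up to the first whitespace
            if (!(id in seen)) print id
        }
    ' "$abund" "$fasta"
}
```

For the run above, the inputs would be the assembly.fa and mb2_S1_final_pure_reads.txt files under work_files. Empty output means every contig has an abundance row, in which case the cause lies elsewhere.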

Also, my command for calling maxbin2 is:

metawrap binning -a /30days/uqgni1/16_chen_metag/assembly_out/S1/S1_final_assembly.fasta -o /30days/uqgni1/16_chen_metag/binning_out/S1/maxbin2_bins --maxbin2 /30days/uqgni1/16_chen_metag/read_qc_out/S1/S1_final_pure_reads_1.fastq /30days/uqgni1/16_chen_metag/read_qc_out/S1/S1_final_pure_reads_2.fastq --run-checkm -m 250 -t 12

I specified 12 threads (-t 12), but the MaxBin output shows only one in use (Thread: 1). I don't mean to bother you, but any troubleshooting suggestions would be much appreciated.

Cheers.

jolespin commented 5 years ago

Is there a conda equivalent of cpan install HTTP::Status, to keep the environment exportable?

EDIT: As mentioned in the documentation, this fixed my issue:

conda update perl -y
conda install -y blas=2.5=mkl

ursky commented 5 years ago

Sounds good.

jolespin commented 5 years ago

Unfortunately, I’m still having the endless concoct BLAS issue (apologies for all the notifications from other threads). Right now I’m running it with one thread to avoid the issue. Would it be possible to export your conda environment for a working config?

ursky commented 5 years ago

Here is mine: metawrap-env.yml.zip

jolespin commented 5 years ago

Thanks for posting this, almost had it working:

(base) -bash-4.1$ time(conda env create --name metawrap_env --file metawrap-env.yml)
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - openblas==0.3.5=h9ac9557_1001
  - r-pillar==1.3.1=r351h6115d3f_1000
  - libtiff==4.0.10=h648cc4a_1001

real    0m27.392s
user    0m22.207s
sys     0m4.417s

I'll try a fresh install using the recommended method in the installation instructions.
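The three unresolved entries are all pinned to an exact build string, which is often specific to one platform or channel snapshot. One thing to try is dropping the build strings and keeping only name and version. This is a minimal sketch, assuming the exported file lists packages as "- name=version=build". The name strip_builds is hypothetical, and the pinned versions may still be unavailable afterwards.

```shell
#!/usr/bin/env bash
# strip_builds: hypothetical helper for illustration.
# Reads an exported environment file on stdin and rewrites
# "  - name=version=build" entries as "  - name=version".
# Lines that do not match that shape pass through unchanged.
strip_builds() {
    sed -E 's/^([[:space:]]*-[[:space:]]+[^=[:space:]]+=[^=]+)=[^=]+$/\1/'
}

# Example with one of the entries conda could not resolve
printf '  - openblas=0.3.5=h9ac9557_1001\n' | strip_builds
```

The filtered file could then be written with strip_builds < metawrap-env.yml > metawrap-env.nobuilds.yml (the output file name is arbitrary) and passed to conda env create --file. When exporting from a working environment, conda env export --no-builds produces a file without build strings directly.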

KanLI396 commented 4 years ago

@ursky Hi, my binning has the same problem: it only runs on one thread. How should I use your .yml file here? I tried opening it and pasting its contents under the metawrap env, but that didn't work. @jolespin, do you have any idea?