bxlab / metaWRAP

MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis
MIT License

checkm presents a new error with the latest versions of metawrap #153

Open rotoscan opened 5 years ago

rotoscan commented 5 years ago

Dear Gherman,

again, thanks a lot for this amazing tool!

Lately, however, I've been having problems when using checkm related modules, such as binning or binning refinement.

Please see my log below:

error.out:

# verbose

*******************************************************************************
 [CheckM - tree] Placing bins in reference genome tree.
*******************************************************************************

  Identifying marker genes in 28 bins with 8 threads:
Process SyncManager-1:
Traceback (most recent call last):
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
    server = cls._Server(registry, address, authkey, serializer)
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/multiprocessing/managers.py", line 162, in __init__
    self.listener = Listener(address=address, backlog=16)
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/multiprocessing/connection.py", line 132, in __init__
    self._listener = SocketListener(address, family, backlog)
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/multiprocessing/connection.py", line 256, in __init__
    self._socket.bind(address)
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: AF_UNIX path too long
Traceback (most recent call last):
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/bin/checkm", line 711, in <module>
    checkmParser.parseOptions(args)
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/site-packages/checkm/main.py", line 1252, in parseOptions
    self.tree(options)
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/site-packages/checkm/main.py", line 134, in tree
    options.bCalledGenes)
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/site-packages/checkm/markerGeneFinder.py", line 67, in find
    binIdToModels = mp.Manager().dict()
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/multiprocessing/__init__.py", line 99, in Manager
    m.start()
  File "/data/msb/tools/miniconda/miniconda2/envs/metawrap-env-2/lib/python2.7/multiprocessing/managers.py", line 528, in start
    self._address = reader.recv()
EOFError

real    20m58.555s
user    98m23.702s
sys 2m51.321s

and std.out

------------------------------------------------------------------------------------------------------------------------
-----                              metaBAT2 finished successfully, and found 28 bins!                              -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                               Running CheckM on                                              -----
-----      /work/brizolat/MuDoGer/intermediary_files/pbin/initial/P19001_419-090316-mw2-wcheckm/metabat2_bins      -----
-----                                                     bins                                                     -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----           There is 4 RAM and 8 threads available, and each pplacer thread uses <40GB, so I will use          -----
-----                                            0 threads for pplacer                                             -----
------------------------------------------------------------------------------------------------------------------------

Unexpected error: <type 'exceptions.EOFError'>

************************************************************************************************************************
*****                             Something went wrong with running CheckM. Exiting...                             *****
******************

I didn't have this problem with version 1.1.4 or earlier; it started happening with 1.1.5. The version I am working with now is the latest, 1.1.8.

To keep things working, I will attempt to install 1.1.4 with: $ conda install metawrap-mg=1.1.4

I hope my message was clear and I appreciate any sort of help!

Thanks a lot! Cheers, Rodolfo

ursky commented 5 years ago

Thanks for the report! MetaWRAP v1.1.8 comes loaded with more streamlined dependencies for easier integration and CONCOCT v1.0 dependency conflicts, but there will likely be issues at first. Keep them coming.

As for the CheckM error, this could just be a matter of your path being too long (see https://github.com/bxlab/metaWRAP/issues/35), or, if you are using the same directories as before, it could also stem from the newer CheckM v1.0.13. I cannot replicate the error, so can I ask you to test this for me, please?

Make a new conda environment, then install only the metaWRAP dependencies:

conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda config --add channels ursky

conda install --only-deps -c ursky metawrap-mg=1.1.8

Then clone/download this repository, and put all the metawrap/bin/ contents somewhere in PATH (such as /usr/bin/), or just add metawrap/bin/ to PATH (if you know how). This way you still have metaWRAP v1.1.8, but you now have control over the dependencies, so you can downgrade CheckM and try running metawrap binning again.

conda install checkm-genome=1.0.12

Can you report back if this fixes the issue?

rotoscan commented 5 years ago

Dear Gherman,

thanks a lot for the quick reply. Indeed I was using those directories before with no problems. It started happening after the last updates. I will test immediately the thing you suggested and I will get back to you!

Thanks!! Rodolfo

rotoscan commented 5 years ago

Dear Gherman,

just to clarify: I created the conda environment and ran the conda config and conda install commands. Then I cloned this repository. Should I now copy the contents of the repository's bin folder:

$ ls metaWRAP/bin/
config-metawrap  metawrap  metaWRAP  metawrap-modules  metawrap-scripts

into the conda environment's bin folder? I work on a cluster, so I don't have access to the /usr/bin folder.

ursky commented 5 years ago

Yes, that would be the easiest!

rotoscan commented 5 years ago

Dear Gherman,

unfortunately I came across the exact same error.

*******************************************************************************
 [CheckM - tree] Placing bins in reference genome tree.
*******************************************************************************

  Identifying marker genes in 14 bins with 10 threads:
Process SyncManager-1:
Traceback (most recent call last):
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/managers.py", line 550, in _run_server
    server = cls._Server(registry, address, authkey, serializer)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/managers.py", line 162, in __init__
    self.listener = Listener(address=address, backlog=16)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/connection.py", line 132, in __init__
    self._listener = SocketListener(address, family, backlog)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/connection.py", line 256, in __init__
    self._socket.bind(address)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: AF_UNIX path too long
Traceback (most recent call last):
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/bin/checkm", line 708, in <module>
    checkmParser.parseOptions(args)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/site-packages/checkm/main.py", line 1251, in parseOptions
    self.tree(options)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/site-packages/checkm/main.py", line 133, in tree
    options.bCalledGenes)
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/site-packages/checkm/markerGeneFinder.py", line 67, in find
    binIdToModels = mp.Manager().dict()
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/__init__.py", line 99, in Manager
    m.start()
  File "/data/msb/tools/metawrap/THIRD_INSTALL/metawrap_env_3/lib/python2.7/multiprocessing/managers.py", line 528, in start
    self._address = reader.recv()
EOFError

real    4m34.304s
user    25m6.791s
sys     0m56.264s

I wonder if the path is too long. I will change my working directory to something "less deep" and see what happens. I will report here afterwards.
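For anyone who wants to test the "path too long" suspicion directly, here is a minimal Python 3 sketch. It only tries to bind a Unix domain socket inside a given directory, which is the same call that fails at the bottom of the traceback (self._socket.bind(address)). The function name can_bind_unix_socket and the file name "listener-test" are made up for illustration; they are not part of CheckM or metaWRAP. The exact length limit depends on the operating system, so the sketch reports what happens rather than assuming a number.

```python
import os
import socket
import tempfile


def can_bind_unix_socket(directory, name="listener-test"):
    """Return True if an AF_UNIX socket can be bound at directory/name.

    `name` is a hypothetical file name used only for this check.
    """
    path = os.path.join(directory, name)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        return True
    except OSError as exc:
        # On a too-long path Python reports "AF_UNIX path too long".
        print("bind failed for a %d-character path: %s" % (len(path), exc))
        return False
    finally:
        sock.close()
        # Remove the socket file if bind() managed to create it.
        if os.path.exists(path):
            os.unlink(path)


if __name__ == "__main__":
    shallow = tempfile.mkdtemp()
    # Two 60-character components push the path well past typical limits.
    deep = os.path.join(shallow, "x" * 60, "y" * 60)
    os.makedirs(deep)
    print(can_bind_unix_socket(shallow))
    print(can_bind_unix_socket(deep))
```

Running the check against the directory you plan to use for the run should show whether its depth alone is enough to trigger the error. Note that it catches every OSError, so a permissions problem would also return False; the printed message tells the two apart.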

ursky commented 5 years ago

Sounds good. That's what https://github.com/bxlab/metaWRAP/issues/35 did...

jeffkimbrel commented 5 years ago

I was getting the same error, and can confirm reducing the length of the file path allowed the checkM portion to correctly begin in the reassemble_bins module. Thanks.
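This confirmation also fits how the logs in this thread read: the EOFError that metaWRAP reports looks like a follow-on symptom, not a separate problem. In both tracebacks the main process is waiting in reader.recv() for the manager process to send back its address, and the manager process has already died on the bind error. A minimal sketch of that mechanism, using only the standard library (the function name recv_after_peer_closed is made up for illustration):

```python
from multiprocessing import Pipe


def recv_after_peer_closed():
    """Show what recv() does when the other end closes without sending."""
    reader, writer = Pipe(duplex=False)
    # Stand-in for the manager process exiting before it reports its address.
    writer.close()
    try:
        reader.recv()
    except EOFError:
        return "EOFError"
    finally:
        reader.close()
    return "received"


if __name__ == "__main__":
    print(recv_after_peer_closed())
```

So when reading these logs, the line to act on is "AF_UNIX path too long"; the EOFError should go away once that is fixed.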

rzhan186 commented 3 years ago

Hey guys,

I know it's been a while, but I came across the same "OSError: AF_UNIX path too long" error. This may sound silly, but I am wondering how can I change the length of a file path?

BTW I am running Metawrap using a virtual python environment in a cluster, rather than Conda.

Here are the error messages I received:

[2021-06-27 22:26:11] INFO: [CheckM - tree] Placing bins in reference genome tree.
[2021-06-27 22:26:13] INFO: Identifying marker genes in 57 bins with 160 threads:
Process SyncManager-1:
Traceback (most recent call last):
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/managers.py", line 608, in _run_server
    server = cls._Server(registry, address, authkey, serializer)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/managers.py", line 154, in __init__
    self.listener = Listener(address=address, backlog=16)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/connection.py", line 448, in __init__
    self._listener = SocketListener(address, family, backlog)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/connection.py", line 591, in __init__
    self._socket.bind(address)
OSError: AF_UNIX path too long

Unexpected error: <class 'EOFError'>
Traceback (most recent call last):
  File "/gpfs/fs0/scratch/c/careg421/ruizhang/metawrap_py2/bin/checkm", line 611, in <module>
    checkmParser.parseOptions(args)
  File "/gpfs/fs0/scratch/c/careg421/ruizhang/metawrap_py2/lib/python3.8/site-packages/checkm/main.py", line 858, in parseOptions
    self.tree(options)
  File "/gpfs/fs0/scratch/c/careg421/ruizhang/metawrap_py2/lib/python3.8/site-packages/checkm/main.py", line 119, in tree
    binIdToModels = mgf.find(binFiles,
  File "/gpfs/fs0/scratch/c/careg421/ruizhang/metawrap_py2/lib/python3.8/site-packages/checkm/markerGeneFinder.py", line 67, in find
    binIdToModels = mp.Manager().dict()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/context.py", line 57, in Manager
    m.start()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/managers.py", line 583, in start
    self._address = reader.recv()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

Thanks for helping!

Rui

rotoscan commented 3 years ago

Dear Rui,

this problem is easy to work around. You might need to move your input and output directories to a 'shorter' path. Basically, create a folder that is not so deep inside your home or data folder and restart the process. Instead of /path/to/my/folder/folder/folder/folder/folder/data, you want to move your data to /path/to/my/data

I hope it works! Best Rodolfo
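To get a rough idea of how short is "short enough", here is a small Python sketch that estimates the headroom for a candidate directory. It rests on two assumptions that are not stated anywhere in this thread: that the usable socket path length is 107 bytes (typical on Linux, different on other systems), and that Python's multiprocessing creates its socket as <tmp>/pymp-XXXXXXXX/listener-XXXXXXXX inside whatever temporary directory is in effect for the run. Whether that temporary directory sits under your output folder depends on how CheckM is invoked, so treat the result as an estimate only. All names below are made up for illustration.

```python
import os

# Assumption: Linux allows 108 bytes for the socket path, including the
# terminating NUL, so 107 usable bytes. Other platforms differ.
SUN_PATH_LIMIT = 107

# Assumption: placeholder for the names multiprocessing generates at runtime
# (a "pymp-" directory and a "listener-" socket, each with a random suffix).
EXAMPLE_SUFFIX = os.path.join("pymp-12345678", "listener-12345678")


def estimated_socket_path(tmp_dir):
    """Build an example socket path of the expected length under tmp_dir."""
    return os.path.join(os.path.abspath(tmp_dir), EXAMPLE_SUFFIX)


def headroom(tmp_dir):
    """Bytes left before the assumed limit; negative means too long."""
    return SUN_PATH_LIMIT - len(os.fsencode(estimated_socket_path(tmp_dir)))


if __name__ == "__main__":
    for candidate in ("/path/to/my/data",
                      "/path/to/my/folder/folder/folder/folder/folder/data"):
        print(candidate, headroom(candidate))
```

A clearly positive number suggests the directory is shallow enough; a value near zero or below means it is worth moving the data higher up, as described above.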

rzhan186 commented 3 years ago

Hi Rodolfo,

Thank you for the clarification, it worked! My original data was trapped in a dungeon 😂

Just for future reference, both input and output data paths have to be changed to a "not too deep" directory.

Rui

rotoscan commented 3 years ago

I am glad it worked!

Wishing you prosperous computing =)