hsgweon / pipits

Automated pipeline for analyses of fungal ITS from the Illumina
GNU General Public License v3.0

Probable NumPy issue with pipits_process #22

Closed pamelarussell closed 5 years ago

pamelarussell commented 5 years ago

Hi,

I am working my way through the instructions in your README with your test dataset. I've made it to the final step. I am getting stuck in pipits_process with the following output:

pipits_process -i out_funits/ITS.fasta -o out_process
pipits_process 2.2, the PIPITS Project
https://github.com/hsgweon/pipits
---------------------------------

2018-10-11 11:25:31 pipits_process started
2018-10-11 11:25:31 Generating a sample list from the input sequences
2018-10-11 11:25:32 Dereplicating and removing unique sequences prior to picking OTUs
2018-10-11 11:25:32 Picking OTUs [VSEARCH]
2018-10-11 11:25:32 Removing chimeras [VSEARCH]
2018-10-11 11:25:37 Renaming OTUs
2018-10-11 11:25:37 Mapping reads onto centroids [VSEARCH]
2018-10-11 11:25:37 Making OTU table
2018-10-11 11:25:37 Converting classic tabular OTU into a BIOM format [BIOM]
2018-10-11 11:25:37 Error: None zero returncode: biom convert -i out_process/intermediate/otu_table_prelim.txt -o out_process/intermediate/otu_table_prelim.biom --table-type="OTU table" --to-json

When I run the problem command directly from the command line I get this:

biom convert -i out_process/intermediate/otu_table_prelim.txt -o out_process/intermediate/otu_table_prelim.biom --table-type="OTU table" --to-json
Traceback (most recent call last):
  File "/Fingerlin/home/russellp/.conda/envs/pipits_env/bin/biom", line 7, in <module>
    from biom.cli import cli
  File "/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python3.6/site-packages/biom/__init__.py", line 51, in <module>
    from .table import Table
  File "/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python3.6/site-packages/biom/table.py", line 176, in <module>
    import numpy as np
  File "/usr/lib/python3.5/site-packages/numpy/__init__.py", line 142, in <module>
    from . import add_newdocs
  File "/usr/lib/python3.5/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/usr/lib/python3.5/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/usr/lib/python3.5/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/usr/lib/python3.5/site-packages/numpy/core/__init__.py", line 14, in <module>
    from . import multiarray
ImportError: cannot import name 'multiarray'

Some Googling suggests that this issue can be solved by deleting and reinstalling NumPy. However, I have not been able to change the version or reinstall NumPy due to the dependencies of PIPITS. Can you suggest a solution for this issue?

Thanks! Pam

hsgweon commented 5 years ago

Hi Pam,

  1. Can you let me know which system you are running PIPITS on? Is it macOS, Ubuntu, Debian, etc.? (I only test PIPITS on Ubuntu and macOS.)

  2. Can you please try:

source activate pipits_env
python --version
conda list numpy

and let me know the outcome?

pamelarussell commented 5 years ago

Hi, thanks for your quick reply.

Here is our Linux distribution:

cat /etc/os-release
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
ID_LIKE=archlinux
ANSI_COLOR="0;36"
HOME_URL="https://www.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"

Python version:

python --version
Python 3.6.6

NumPy version:

conda list numpy
/software/cgeh/conda/daily/install/lib/python3.6/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.19.1) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
# packages in environment at /Fingerlin/home/russellp/.conda/envs/pipits_env:
#
numpy                     1.15.2          py36_blas_openblashd3ea46f_1  [blas_openblas]  conda-forge
numpy-base                1.14.3           py36h0ea5e3f_1

hsgweon commented 5 years ago

Hmm... ok... I've tested with the latest PIPITS and dependencies on Ubuntu and I can't replicate this problem. I wonder if it has to do with Arch Linux...

Anyway, can you let me know what you get when you type:

python -c 'import sys; print(sys.path)'

Essentially this should only show directories with "pipits_env" in them...
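
The same check can be scripted. What follows is a minimal sketch, not part of PIPITS: the helper name `entries_outside_prefix` is made up for illustration, and it assumes that the root of the active environment is `sys.prefix`. It lists every `sys.path` entry that lies outside the environment and prints `PYTHONPATH`, which is one common way such entries get added.

```python
import os
import sys


def entries_outside_prefix(paths, prefix):
    """Return the entries of `paths` that do not live under `prefix`."""
    prefix = os.path.normpath(prefix)
    outside = []
    for entry in paths:
        if not entry:
            # '' stands for the current working directory; skip it here.
            continue
        normalized = os.path.normpath(entry)
        if normalized != prefix and not normalized.startswith(prefix + os.sep):
            outside.append(entry)
    return outside


if __name__ == "__main__":
    # Run this with the environment's own interpreter (after activating it).
    for entry in entries_outside_prefix(sys.path, sys.prefix):
        print("outside environment:", entry)
    print("PYTHONPATH =", os.environ.get("PYTHONPATH", "<not set>"))
```

Any directory it reports is searched alongside the environment's own packages, so a package found there can shadow the one conda installed.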

Also, can you try updating everything to see whether that fixes it (probably not...):

source activate pipits_env
conda update --all

pamelarussell commented 5 years ago

Here is the sys.path statement inside the pipits environment:

(09:17  > source activate pipits_env
(pipits_env) (09:17 russellp@c05  > python -c 'import sys; print(sys.path)'
['', '/usr/lib/python3.5/site-packages', '/home/russellp/Software/Python-3.5.2/lib/python3.5', '/<path omitted>/pipits_testing', '/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python36.zip', '/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python3.6', '/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python3.6/lib-dynload', '/Fingerlin/home/russellp/.local/lib/python3.6/site-packages', '/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python3.6/site-packages']

I did conda update --all and then ran into an issue with vsearch. I recognized this from having a problem with it in an earlier stage of the pipeline, where I had followed the advice in issue #7 to install vsearch 2.8.0. After doing that again, I'm back to the original error.

Before installing vsearch 2.8.0:

> pipits_process -i out_funits/ITS.fasta -o out_process
pipits_process 2.2, the PIPITS Project
https://github.com/hsgweon/pipits
---------------------------------

2018-10-12 09:27:35 pipits_process started
2018-10-12 09:27:35 Generating a sample list from the input sequences
2018-10-12 09:27:36 Dereplicating and removing unique sequences prior to picking OTUs
2018-10-12 09:27:36 Error: None zero returncode: vsearch --derep_fulllength out_funits/ITS.fasta --output out_process/intermediate/input_nr.fasta --minuniquesize 2 --sizeout --threads 1

After installing vsearch 2.8.0:

> pipits_process -i out_funits/ITS.fasta -o out_process
pipits_process 2.2, the PIPITS Project
https://github.com/hsgweon/pipits
---------------------------------

2018-10-12 09:32:09 pipits_process started
2018-10-12 09:32:09 Generating a sample list from the input sequences
2018-10-12 09:32:09 Dereplicating and removing unique sequences prior to picking OTUs
2018-10-12 09:32:09 Picking OTUs [VSEARCH]
2018-10-12 09:32:09 Removing chimeras [VSEARCH]
2018-10-12 09:32:14 Renaming OTUs
2018-10-12 09:32:14 Mapping reads onto centroids [VSEARCH]
2018-10-12 09:32:14 Making OTU table
2018-10-12 09:32:15 Converting classic tabular OTU into a BIOM format [BIOM]
2018-10-12 09:32:15 Error: None zero returncode: biom convert -i out_process/intermediate/otu_table_prelim.txt -o out_process/intermediate/otu_table_prelim.biom --table-type="OTU table" --to-json

Trying the command directly:

> biom convert -i out_process/intermediate/otu_table_prelim.txt -o out_process/intermediate/otu_table_prelim.biom --table-type="OTU table" --to-json
Traceback (most recent call last):
  File "/Fingerlin/home/russellp/.conda/envs/pipits_env/bin/biom", line 7, in <module>
    from biom.cli import cli
  File "/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python3.6/site-packages/biom/__init__.py", line 51, in <module>
    from .table import Table
  File "/Fingerlin/home/russellp/.conda/envs/pipits_env/lib/python3.6/site-packages/biom/table.py", line 176, in <module>
    import numpy as np
  File "/usr/lib/python3.5/site-packages/numpy/__init__.py", line 142, in <module>
    from . import add_newdocs
  File "/usr/lib/python3.5/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/usr/lib/python3.5/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/usr/lib/python3.5/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/usr/lib/python3.5/site-packages/numpy/core/__init__.py", line 14, in <module>
    from . import multiarray
ImportError: cannot import name 'multiarray'
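
Note that the traceback loads biom from the environment's python3.6 site-packages but NumPy from /usr/lib/python3.5/site-packages, which is the first real directory in the sys.path output above. A rough sketch of how to confirm which directory wins for a given module: `first_provider` is a hypothetical helper built on the standard library's `importlib.machinery.PathFinder.find_spec`, which searches the given directories in order.

```python
import importlib.machinery


def first_provider(module_name, search_path):
    """Return the file a top-level module would load from, honouring search_path order."""
    spec = importlib.machinery.PathFinder.find_spec(module_name, search_path)
    # spec is None when no directory in search_path provides the module.
    return None if spec is None else spec.origin
```

Calling `first_provider("numpy", sys.path)` inside the activated environment should then name the `__init__.py` that `import numpy` would pick up, without actually importing it.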

hsgweon commented 5 years ago

Can you please remove the environment and create it again, unsetting PYTHONPATH before you recreate it?

conda env remove --name pipits_env
unset PYTHONPATH
conda create -n pipits_env python=3.6 pipits

Then activate the conda environment and downgrade VSEARCH:

source activate pipits_env
conda install vsearch=2.8.0
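
For a quick test that does not require rebuilding the environment, a command can also be launched with `PYTHONPATH` stripped from its environment. This is only a sketch, under the assumption that `PYTHONPATH` is what injects the `/usr/lib/python3.5` entries; `env_without_pythonpath` is a hypothetical helper name, not a PIPITS or conda function.

```python
import os
import subprocess
import sys


def env_without_pythonpath(environ=None):
    """Return a copy of the environment with PYTHONPATH removed."""
    env = dict(os.environ if environ is None else environ)
    env.pop("PYTHONPATH", None)  # no error if the variable is absent
    return env


if __name__ == "__main__":
    # Print the search path a fresh interpreter sees once PYTHONPATH is gone.
    subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.path)"],
        env=env_without_pythonpath(),
        check=True,
    )
```

The same `env` can be passed to a `subprocess.run` call that launches the failing `biom convert` command from the log, to see whether it succeeds once the variable is out of the way.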

pamelarussell commented 5 years ago

This solved the problem and it now runs through to completion. Thanks so much for your help!!