daler / pybedtools

Python wrapper -- and more -- for BEDTools (bioinformatics tools for "genome arithmetic")
http://daler.github.io/pybedtools

NotImplementedError: "intersectBed" does not appear to be installed or on the path, so this method is disabled. Please install a more recent version of BEDTools and re-import to use this method. #361

Closed. stanleyjs closed this issue 2 years ago

stanleyjs commented 2 years ago

Hi @daler I am having this issue (same as #294) in January of 2022.

bedtools intersect and intersectBed are in $PATH.

Bedtools and htslib were installed from a requirements.yml file.

Running conda install --channel conda-forge --channel bioconda bedtools htslib returns #All requested packages already installed

bedtools --version is 2.30.0 and pybedtools 0.9.0 was installed from conda. Python version is 3.9.

Here's the part of the traceback that is relevant to your module. It's being called from the ALLCools package.


--> 396 black_feature = feature_bed.intersect(black_list_bed, f=f, wa=True)
File /opt/conda/envs/bipca-experiment/lib/python3.9/site-packages/pybedtools/bedtool.py:923, in BedTool._log_to_history.<locals>.decorated(self, *args, **kwargs)
    919 def decorated(self, *args, **kwargs):
    920 
    921     # this calls the actual method in the first place; *result* is
    922     # whatever you get back
--> 923     result = method(self, *args, **kwargs)
    925     # add appropriate tags
    926     parent_tag = self._tag

File /opt/conda/envs/bipca-experiment/lib/python3.9/site-packages/pybedtools/bedtool.py:244, in _wraps.<locals>.decorator.<locals>.not_implemented_func(*args, **kwargs)
    243 def not_implemented_func(*args, **kwargs):
--> 244     raise NotImplementedError(help_str)

NotImplementedError: "intersectBed" does not appear to be installed or on the path, so this method is disabled.  Please install a more recent version of BEDTools and re-import to use this method.
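
The traceback shows the error coming from a placeholder function (not_implemented_func) rather than from BEDTools itself, which suggests the wrapped method was swapped out because the program could not be found when the module was set up. A minimal sketch of that general pattern follows. It is not pybedtools' actual code: requires_program, Tool, the made-up program name and the use of shutil.which are all assumptions for illustration.

```python
import functools
import shutil


def requires_program(prog):
    """Disable the decorated method when `prog` cannot be found on PATH.

    Illustrative only: the lookup happens once, when the decorator runs.
    """
    def decorator(func):
        if shutil.which(prog) is None:
            help_str = f'"{prog}" does not appear to be installed or on the path'

            @functools.wraps(func)
            def not_implemented_func(*args, **kwargs):
                raise NotImplementedError(help_str)

            return not_implemented_func
        return func
    return decorator


class Tool:
    # Hypothetical program name, chosen so that the lookup fails.
    @requires_program("surely-not-a-real-program-xyz")
    def intersect(self, other):
        return "ran"
```

With a check like this, the result depends on the PATH of the running process at the time the decorator runs, so a process whose PATH differs from the terminal's could raise this error even when BEDTools is installed.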

daler commented 2 years ago

Based on the stack trace, the conda env is bipca-experiment. Can you paste the results of conda env export -n bipca-experiment?

stanleyjs commented 2 years ago

Thanks for the fast reply @daler

name: bipca-experiment
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=1_llvm
  - alabaster=0.7.12=py_0
  - alsa-lib=1.2.3=h516909a_0
  - anndata=0.7.8=py39hf3d152e_1
  - anyio=3.5.0=py39hf3d152e_0
  - argon2-cffi=21.3.0=pyhd8ed1ab_0
  - argon2-cffi-bindings=21.2.0=py39h3811e60_1
  - arpack=3.7.0=hdefa2d7_2
  - asttokens=2.0.5=pyhd8ed1ab_0
  - attrs=21.4.0=pyhd8ed1ab_0
  - babel=2.9.1=pyh44b312d_0
  - backcall=0.2.0=pyh9f0ad1d_0
  - backports=1.0=py_2
  - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
  - bedtools=2.30.0=h7d7f7ad_2
  - black=22.1.0=pyhd8ed1ab_0
  - blas=2.113=openblas
  - blas-devel=3.9.0=13_linux64_openblas
  - bleach=4.1.0=pyhd8ed1ab_0
  - blosc=1.21.0=h9c3ff4c_0
  - brotli=1.0.9=h7f98852_6
  - brotli-bin=1.0.9=h7f98852_6
  - brotlipy=0.7.0=py39h3811e60_1003
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.18.1=h7f98852_0
  - ca-certificates=2021.10.8=ha878542_0
  - cached-property=1.5.2=hd8ed1ab_1
  - cached_property=1.5.2=pyha770c72_1
  - cairo=1.16.0=ha00ac49_1009
  - ccache=4.5.1=haef5404_0
  - certifi=2021.10.8=py39hf3d152e_1
  - cffi=1.15.0=py39h4bc2ebd_0
  - cftime=1.5.2=py39hce5d2b2_0
  - charset-normalizer=2.0.11=pyhd8ed1ab_0
  - click=8.0.3=py39hf3d152e_1
  - colorama=0.4.4=pyh9f0ad1d_0
  - cryptography=36.0.1=py39h95dcef6_0
  - curl=7.81.0=h2574ce0_0
  - cycler=0.11.0=pyhd8ed1ab_0
  - dataclasses=0.8=pyhc8e2a94_3
  - dbus=1.13.6=h5008d03_3
  - debugpy=1.5.1=py39he80948d_0
  - decorator=5.1.1=pyhd8ed1ab_0
  - defusedxml=0.7.1=pyhd8ed1ab_0
  - docutils=0.17.1=py39hf3d152e_1
  - dunamai=1.8.0=pyhd8ed1ab_0
  - entrypoints=0.3=py39hde42818_1002
  - executing=0.8.2=pyhd8ed1ab_0
  - expat=2.4.4=h9c3ff4c_0
  - flit-core=3.6.0=pyhd8ed1ab_0
  - font-ttf-dejavu-sans-mono=2.37=hab24e00_0
  - font-ttf-inconsolata=3.000=h77eed37_0
  - font-ttf-source-code-pro=2.038=h77eed37_0
  - font-ttf-ubuntu=0.83=hab24e00_0
  - fontconfig=2.13.94=ha180cfb_0
  - fonts-conda-ecosystem=1=0
  - fonts-conda-forge=1=0
  - fonttools=4.29.0=py39h3811e60_0
  - freetype=2.10.4=h0708190_1
  - get_version=3.5.3=pyhd8ed1ab_0
  - gettext=0.19.8.1=h73d1719_1008
  - git=2.35.0=pl5321hc30692c_0
  - glpk=4.65=h9202a9a_1004
  - gmp=6.2.1=h58526e2_0
  - graphite2=1.3.13=h58526e2_1001
  - gst-plugins-base=1.18.5=hf529b03_3
  - gstreamer=1.18.5=h9f60fe5_3
  - h5py=3.6.0=nompi_py39h7e08c79_100
  - harfbuzz=3.2.0=hb4a5f5f_0
  - hdf4=4.2.15=h10796ff_3
  - hdf5=1.12.1=nompi_h2750804_103
  - htslib=1.14=h5138463_1
  - icu=69.1=h9c3ff4c_0
  - idna=3.3=pyhd8ed1ab_0
  - igraph=0.9.6=ha184e22_0
  - imagesize=1.3.0=pyhd8ed1ab_0
  - importlib-metadata=4.10.1=py39hf3d152e_0
  - importlib_metadata=4.10.1=hd8ed1ab_0
  - importlib_resources=5.4.0=pyhd8ed1ab_0
  - ipykernel=6.7.0=py39hef51801_0
  - ipython=8.0.1=py39hf3d152e_0
  - ipython_genutils=0.2.0=py_1
  - ipywidgets=7.6.5=pyhd8ed1ab_0
  - jedi=0.18.1=py39hf3d152e_0
  - jinja2=3.0.3=pyhd8ed1ab_0
  - joblib=1.1.0=pyhd8ed1ab_0
  - jpeg=9e=h7f98852_0
  - json5=0.9.5=pyh9f0ad1d_0
  - jsonschema=4.4.0=pyhd8ed1ab_0
  - jupyter=1.0.0=py39hf3d152e_7
  - jupyter_client=7.1.2=pyhd8ed1ab_0
  - jupyter_console=6.4.0=pyhd8ed1ab_0
  - jupyter_core=4.9.1=py39hf3d152e_1
  - jupyter_server=1.13.4=pyhd8ed1ab_0
  - jupyterlab=3.2.8=pyhd8ed1ab_0
  - jupyterlab_pygments=0.1.2=pyh9f0ad1d_0
  - jupyterlab_server=2.10.3=pyhd8ed1ab_0
  - jupyterlab_widgets=1.0.2=pyhd8ed1ab_0
  - kiwisolver=1.3.2=py39h1a9c180_1
  - krb5=1.19.2=hcc1bbae_3
  - lcms2=2.12=hddcbb42_0
  - ld_impl_linux-64=2.36.1=hea4e1c9_2
  - legacy-api-wrap=1.2=py_0
  - leidenalg=0.8.8=py39he80948d_1
  - libblas=3.9.0=13_linux64_openblas
  - libbrotlicommon=1.0.9=h7f98852_6
  - libbrotlidec=1.0.9=h7f98852_6
  - libbrotlienc=1.0.9=h7f98852_6
  - libcblas=3.9.0=13_linux64_openblas
  - libclang=13.0.0=default_hc23dcda_0
  - libcurl=7.81.0=h2574ce0_0
  - libdeflate=1.9=h7f98852_0
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libevent=2.1.10=h9b69904_4
  - libffi=3.4.2=h7f98852_5
  - libgcc-ng=11.2.0=h1d223b6_12
  - libgfortran-ng=11.2.0=h69a702a_12
  - libgfortran5=11.2.0=h5c6108e_12
  - libglib=2.70.2=h174f98d_1
  - libhiredis=1.0.2=h2cc385e_0
  - libiconv=1.16=h516909a_0
  - liblapack=3.9.0=13_linux64_openblas
  - liblapacke=3.9.0=13_linux64_openblas
  - libllvm11=11.1.0=hf817b99_2
  - libllvm13=13.0.0=hf817b99_0
  - libnetcdf=4.8.1=nompi_hb3fd0d9_101
  - libnghttp2=1.46.0=h812cca2_0
  - libnsl=2.0.0=h7f98852_0
  - libogg=1.3.4=h7f98852_1
  - libopenblas=0.3.18=pthreads_h8fe5266_0
  - libopus=1.3.1=h7f98852_1
  - libpng=1.6.37=h21135ba_2
  - libpq=14.1=hd57d9b9_1
  - libsodium=1.0.18=h36c2ea0_1
  - libssh2=1.10.0=ha56f1ee_2
  - libstdcxx-ng=11.2.0=he4da1e4_12
  - libtiff=4.2.0=hf544144_3
  - libuuid=2.32.1=h7f98852_1000
  - libvorbis=1.3.7=h9c3ff4c_0
  - libwebp-base=1.2.2=h7f98852_1
  - libxcb=1.13=h7f98852_1004
  - libxkbcommon=1.0.3=he3ba5ed_0
  - libxml2=2.9.12=h885dcf4_1
  - libzip=1.8.0=h4de3113_1
  - libzlib=1.2.11=h36c2ea0_1013
  - llvm-openmp=12.0.1=h4bd325d_1
  - llvmlite=0.38.0=py39h1bbdace_0
  - lz4-c=1.9.3=h9c3ff4c_1
  - lzo=2.10=h516909a_1000
  - markupsafe=2.0.1=py39h3811e60_1
  - matplotlib=3.5.1=py39hf3d152e_0
  - matplotlib-base=3.5.1=py39h2fa2bec_0
  - matplotlib-inline=0.1.3=pyhd8ed1ab_0
  - metis=5.1.0=h58526e2_1006
  - mistune=0.8.4=py39h3811e60_1005
  - mkl=2022.0.1=h8d4b97c_803
  - mpfr=4.1.0=h9202a9a_1
  - munkres=1.1.4=pyh9f0ad1d_0
  - mypy_extensions=0.4.3=py39hf3d152e_4
  - mysql-common=8.0.28=ha770c72_0
  - mysql-libs=8.0.28=hfa10184_0
  - natsort=8.1.0=pyhd8ed1ab_0
  - nbclassic=0.3.5=pyhd8ed1ab_0
  - nbclient=0.5.10=pyhd8ed1ab_1
  - nbconvert=6.4.1=py39hf3d152e_0
  - nbformat=5.1.3=pyhd8ed1ab_0
  - ncurses=6.3=h9c3ff4c_0
  - nest-asyncio=1.5.4=pyhd8ed1ab_0
  - netcdf4=1.5.8=nompi_py39h64b754b_101
  - networkx=2.6.3=pyhd8ed1ab_1
  - nomkl=3.0=0
  - notebook=6.4.8=pyha770c72_0
  - nspr=4.32=h9c3ff4c_1
  - nss=3.74=hb5efdd6_0
  - numba=0.55.0=py39h56b8d98_0
  - numexpr=2.8.0=py39hbd72853_101
  - numpy=1.21.5=py39haac66dc_0
  - olefile=0.46=pyh9f0ad1d_1
  - openblas=0.3.18=pthreads_h4748800_0
  - openjpeg=2.4.0=hb52868f_1
  - openssl=1.1.1l=h7f98852_0
  - opentsne=0.6.1=py39h5472131_1
  - packaging=21.3=pyhd8ed1ab_0
  - pandas=1.4.0=py39hde0f152_0
  - pandoc=2.17.1.1=ha770c72_0
  - pandocfilters=1.5.0=pyhd8ed1ab_0
  - parso=0.8.3=pyhd8ed1ab_0
  - pathspec=0.9.0=pyhd8ed1ab_0
  - patsy=0.5.2=pyhd8ed1ab_0
  - pcre=8.45=h9c3ff4c_0
  - pcre2=10.37=h032f7d1_0
  - perl=5.32.1=1_h7f98852_perl5
  - pexpect=4.8.0=pyh9f0ad1d_2
  - pickleshare=0.7.5=py39hde42818_1002
  - pillow=8.2.0=py39hf95b381_1
  - pip=22.0.2=pyhd8ed1ab_0
  - pixman=0.40.0=h36c2ea0_0
  - platformdirs=2.3.0=pyhd8ed1ab_0
  - plotly=5.5.0=pyhd8ed1ab_0
  - prometheus_client=0.13.1=pyhd8ed1ab_0
  - prompt-toolkit=3.0.26=pyha770c72_0
  - prompt_toolkit=3.0.26=hd8ed1ab_0
  - pthread-stubs=0.4=h36c2ea0_1001
  - ptyprocess=0.7.0=pyhd3deb0d_0
  - pure_eval=0.2.2=pyhd8ed1ab_0
  - pybedtools=0.9.0=py39h9a82719_0
  - pycparser=2.21=pyhd8ed1ab_0
  - pygments=2.11.2=pyhd8ed1ab_0
  - pynndescent=0.5.6=pyh6c4a22f_0
  - pyopenssl=22.0.0=pyhd8ed1ab_0
  - pyparsing=3.0.7=pyhd8ed1ab_0
  - pyqt=5.12.3=py39hf3d152e_8
  - pyqt-impl=5.12.3=py39hde8b62d_8
  - pyqt5-sip=4.19.18=py39he80948d_8
  - pyqtchart=5.12=py39h0fcd23e_8
  - pyqtwebengine=5.12.1=py39h0fcd23e_8
  - pyrsistent=0.18.1=py39h3811e60_0
  - pysam=0.17.0=py39h20405f9_1
  - pysocks=1.7.1=py39hf3d152e_4
  - pytables=3.7.0=py39h2669a42_0
  - python=3.9.10=h85951f9_1_cpython
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python_abi=3.9=2_cp39
  - pytz=2021.3=pyhd8ed1ab_0
  - pyzmq=22.3.0=py39h37b5a0c_1
  - qt=5.12.9=ha98a1a1_5
  - qtconsole=5.2.2=pyhd8ed1ab_1
  - qtconsole-base=5.2.2=pyhd8ed1ab_1
  - qtpy=2.0.0=pyhd8ed1ab_0
  - readline=8.1=h46c0cb4_0
  - requests=2.27.1=pyhd8ed1ab_0
  - scanpy=1.8.2=pyhd8ed1ab_0
  - scikit-learn=1.0.2=py39h4dfa638_0
  - scipy=1.7.3=py39hee8e79c_0
  - seaborn=0.11.2=hd8ed1ab_0
  - seaborn-base=0.11.2=pyhd8ed1ab_0
  - send2trash=1.8.0=pyhd8ed1ab_0
  - setuptools=60.6.0=py39hf3d152e_0
  - sinfo=0.3.1=py_0
  - six=1.16.0=pyh6c4a22f_0
  - sniffio=1.2.0=py39hf3d152e_2
  - snowballstemmer=2.2.0=pyhd8ed1ab_0
  - sphinx=4.4.0=pyh6c4a22f_1
  - sphinxcontrib-applehelp=1.0.2=py_0
  - sphinxcontrib-devhelp=1.0.2=py_0
  - sphinxcontrib-htmlhelp=2.0.0=pyhd8ed1ab_0
  - sphinxcontrib-jsmath=1.0.1=py_0
  - sphinxcontrib-qthelp=1.0.3=py_0
  - sphinxcontrib-serializinghtml=1.1.5=pyhd8ed1ab_1
  - sqlite=3.37.0=h9cd32fc_0
  - stack_data=0.1.4=pyhd8ed1ab_0
  - statsmodels=0.13.1=py39hce5d2b2_0
  - stdlib-list=0.7.0=py_2
  - suitesparse=5.10.1=h9e50725_1
  - tbb=2021.5.0=h4bd325d_0
  - tenacity=8.0.1=pyhd8ed1ab_0
  - terminado=0.13.1=py39hf3d152e_0
  - testpath=0.5.0=pyhd8ed1ab_0
  - texlive-core=20210325=h97429d4_1
  - texttable=1.6.4=pyhd8ed1ab_0
  - threadpoolctl=3.1.0=pyh8a188c0_0
  - tk=8.6.11=h27826a3_1
  - tomli=2.0.0=pyhd8ed1ab_1
  - tornado=6.1=py39h3811e60_2
  - tqdm=4.62.3=pyhd8ed1ab_0
  - traitlets=5.1.1=pyhd8ed1ab_0
  - typed-ast=1.5.2=py39h3811e60_0
  - typing_extensions=4.0.1=pyha770c72_0
  - tzdata=2021e=he74cb21_0
  - umap-learn=0.5.2=py39hf3d152e_1
  - unicodedata2=14.0.0=py39h3811e60_0
  - urllib3=1.26.8=pyhd8ed1ab_1
  - wcwidth=0.2.5=pyh9f0ad1d_2
  - webencodings=0.5.1=py_1
  - websocket-client=1.2.3=pyhd8ed1ab_0
  - wheel=0.37.1=pyhd8ed1ab_0
  - widgetsnbextension=3.5.2=py39hf3d152e_1
  - xarray=0.21.0=pyhd8ed1ab_1
  - xorg-kbproto=1.0.7=h7f98852_1002
  - xorg-libice=1.0.10=h7f98852_0
  - xorg-libsm=1.2.3=hd9c2040_1000
  - xorg-libx11=1.7.2=h7f98852_0
  - xorg-libxau=1.0.9=h7f98852_0
  - xorg-libxdmcp=1.1.3=h7f98852_0
  - xorg-libxext=1.3.4=h7f98852_1
  - xorg-libxrender=0.9.10=h7f98852_1003
  - xorg-renderproto=0.11.1=h7f98852_1002
  - xorg-xextproto=7.3.0=h7f98852_1002
  - xorg-xproto=7.0.31=h7f98852_1007
  - xz=5.2.5=h516909a_1
  - zeromq=4.3.4=h9c3ff4c_1
  - zipp=3.7.0=pyhd8ed1ab_0
  - zlib=1.2.11=h36c2ea0_1013
  - zstd=1.5.2=ha95c52a_0
  - pip:
    - allcools==0.2.1
    - biopython==1.79
    - cloudpickle==2.0.0
    - dask==2022.1.1
    - deprecated==1.2.13
    - fsspec==2022.1.0
    - ftfy==6.0.3
    - imbalanced-learn==0.9.0
    - imblearn==0.0
    - locket==0.2.1
    - mailchecker==4.1.10
    - partd==1.2.0
    - phonenumbers==8.12.42
    - pybigwig==0.3.18
    - pychebfun==0.3
    - python-benedict==0.24.3
    - python-fsutil==0.6.0
    - python-igraph==0.9.9
    - python-slugify==5.0.2
    - pyyaml==6.0
    - tasklogger==1.1.0
    - text-unidecode==1.3
    - toml==0.10.2
    - toolz==0.11.2
    - torch==1.10.1+cpu
    - wrapt==1.13.3
    - xlrd==1.2.0
    - xmltodict==0.12.0
prefix: /opt/conda/envs/bipca-experiment

daler commented 2 years ago

Hmm. This looks fine, and I'm unable to reproduce using that yaml (I get some torch errors on the pip side, but all the conda packages install).

This is what I did:

I saved your output as env.yaml, deleted the last prefix line, and ran the following:

mamba env create --file env.yaml
conda activate bipca-experiment
python -c 'import pybedtools; a = pybedtools.example_bedtool("a.bed"); print(a.intersect(a))'

which correctly returned

chr1    1       100     feature1        0       +
chr1    100     200     feature2        0       +
chr1    150     200     feature2        0       +
chr1    150     200     feature3        0       -
chr1    150     500     feature3        0       -
chr1    900     950     feature4        0       +

The only thing I can think of is to scatter some import os; print(os.getenv('PATH')) calls throughout the code of the various packages living in that env, to try and see what PATH looks like at each point, e.g. before this line.
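
A slightly more targeted version of that check, as a sketch: besides printing PATH, ask the standard library where (or whether) the executables resolve from inside the running Python process. The helper name report_path is made up for this example.

```python
import os
import shutil


def report_path(programs=("bedtools", "intersectBed")):
    """Return the PATH entries and the resolved location of each program (None if not found)."""
    entries = os.environ.get("PATH", "").split(os.pathsep)
    resolved = {prog: shutil.which(prog) for prog in programs}
    return entries, resolved


entries, resolved = report_path()
print("\n".join(entries))
for prog, location in resolved.items():
    print(f"{prog}: {location or 'NOT FOUND'}")
```

Running this both in a terminal session and in the process that raises the error should show whether the two see the same PATH.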

stanleyjs commented 2 years ago

Thanks.

I wonder if the problem is in the downstream module. Maybe they monkey patched pybedtools or something.

Thanks for the test snippet. I will investigate and report back.

stanleyjs commented 2 years ago

Interesting. @daler I think the problem is that the PATH inside of Jupyter does not include the conda environment that the Jupyter session is launched from. import os; os.getenv('PATH') returns '"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"'

while in a terminal (bipca-experiment) jay@fe01cccb4def:/$ echo $PATH

returns /root/.local/bin:/opt/conda/envs/bipca-experiment/bin:/opt/conda/condabin:/opt/conda/bin:/opt/conda/condabin:/opt/conda/bin:/usr/lib/rstudio-server/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

I'm running inside of a Docker container with a complex entrypoint that launches things as root and then switches to a user given by an environment variable, so it's clearly some bug with that.

Kind of odd, as Python is able to find all the packages that are living at /opt/conda.
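
This is probably less odd than it looks: Python locates importable packages through sys.path, which is derived from where the interpreter itself lives, whereas external programs such as bedtools are located through the PATH environment variable, and the two can disagree. A small sketch to compare them (the helper name env_bin_on_path is invented here, and it assumes the env's executables sit next to the interpreter, as in a typical conda env on Linux):

```python
import os
import sys


def env_bin_on_path():
    """Return (bin_dir, on_path): the interpreter's directory and whether PATH lists it."""
    bin_dir = os.path.dirname(os.path.abspath(sys.executable))
    path_dirs = [
        os.path.abspath(p)
        for p in os.environ.get("PATH", "").split(os.pathsep)
        if p
    ]
    return bin_dir, bin_dir in path_dirs


bin_dir, on_path = env_bin_on_path()
print("interpreter directory:", bin_dir)
print("listed in PATH:", on_path)
print("import search path:", sys.path)
```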

Thanks for the pointer.

stanleyjs commented 2 years ago

It's unrelated to pybedtools, but for posterity: the problem was caused by the way Jupyter is launched. It is started by s6-overlay, a Docker multi-service manager, which launches Jupyter as the user and UID given by the environment but doesn't read .bashrc or any of the other usual sources for PATH, only /etc/environment, for whatever reason. Manually editing /etc/environment in the Dockerfile fixed the problem: Jupyter now sees the full PATH that the users have, and thus pybedtools can see bedtools.
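
For anyone who cannot change how the kernel is launched, a possible stopgap (a sketch, not something tested in this thread) is to prepend the interpreter's own directory to PATH from inside the notebook before importing pybedtools, since the error message's "re-import" hint suggests the check happens at import time. The helper name prepend_env_bin is hypothetical, and this assumes bedtools is installed in the same directory as the Python executable, as it would be in a conda env.

```python
import os
import sys


def prepend_env_bin(environ=os.environ):
    """Put the running interpreter's directory at the front of PATH if it is missing."""
    bin_dir = os.path.dirname(os.path.abspath(sys.executable))
    parts = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in parts:
        environ["PATH"] = os.pathsep.join([bin_dir] + parts)
    return environ["PATH"]


prepend_env_bin()
# import pybedtools  # import only after PATH has been adjusted
```

This only affects the current process and its children; fixing the launch environment, as described above, is the more durable solution.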

daler commented 2 years ago

Thanks for reporting back, glad you figured it out!