aidenlab / straw

Extract data quickly from Juicebox via straw
MIT License

Straw takes too much time to run locally in comparison to the Colab notebook. #88

Closed BlackPianoCat closed 2 years ago

BlackPianoCat commented 2 years ago

I installed straw locally in my Python 3.9 environment using the commands suggested on your site, pip install hic-straw and pip install strawC. I then also installed pybind11 from the [Anaconda repository](https://anaconda.org/conda-forge/pybind11). In your Colab tutorial, the following commands,

import straw
import numpy as np
from scipy.sparse import coo_matrix

result = straw.straw('observed','KR', 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined_30.hic', '4:0:1000000', '4:0:1000000', 'BP', 5000)

result = straw.straw('observed','KR', 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined_30.hic', '4:1000000:2000000', '4:1000000:2000000', 'BP', 5000)

# printing the first 10 rows from the sparse format
for i in range(10):
  print("{0}\t{1}\t{2}".format(result[i].binX, result[i].binY, result[i].counts))

take about 30 seconds to run, whereas locally I have waited more than 15 minutes without seeing any result. Do you know what the problem might be?
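
To compare the two environments more precisely, each call can be timed on its own. The following is a minimal sketch, not part of straw: `timed` is a hypothetical helper, and the workload in the example is a stand-in so the snippet runs without straw installed.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without straw installed.
    total, seconds = timed(sum, range(1_000_000))
    print(f"sum took {seconds:.3f} s")

    # With straw installed, the same helper could wrap the call from the post,
    # where HIC_URL is the combined_30.hic URL used above:
    # import straw
    # result, seconds = timed(straw.straw, 'observed', 'KR', HIC_URL,
    #                         '4:0:1000000', '4:0:1000000', 'BP', 5000)
```

Running the same timing locally and in Colab, once per query region, would show whether one particular call is slow or every call is.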

Also, for some reason pybind11 cannot be installed for Python 3.9 in the way described on your site (this may be the cause of the problem).

Thank you!

sa501428 commented 2 years ago

It does seem like older pybind11 versions and python 3.9 have some incompatibility issues: https://pybind11.readthedocs.io/en/stable/limitations.html

Do you also see this issue above with python 3.8?

BlackPianoCat commented 2 years ago

Unfortunately, on Python 3.8.0 I cannot install with pip install hic-straw at all. I get this long error:

(stripes) blackpianocat@team-gimli:~/Dropbox/loop extrusion/stripes/straw-master$ pip install hic-straw
Collecting hic-straw
  Using cached hic-straw-1.0.0.1.tar.gz (15 kB)
  Preparing metadata (setup.py) ... done
Collecting pybind11>=2.4
  Using cached pybind11-2.9.0-py2.py3-none-any.whl (210 kB)
Building wheels for collected packages: hic-straw
  Building wheel for hic-straw (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /home/blackpianocat/anaconda3/envs/stripes/bin/python3.8 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/setup.py'"'"'; __file__='"'"'/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-j37yrfxw
       cwd: /tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/
  Complete output (20 lines):
  /home/blackpianocat/anaconda3/envs/stripes/lib/python3.8/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
    warnings.warn(
  running bdist_wheel
  running build
  running build_ext
  creating tmp
  gcc -pthread -B /home/blackpianocat/anaconda3/envs/stripes/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/blackpianocat/anaconda3/envs/stripes/include/python3.8 -c /tmp/tmp9m2erjx3.cpp -o tmp/tmp9m2erjx3.o -std=c++14
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  gcc -pthread -B /home/blackpianocat/anaconda3/envs/stripes/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/blackpianocat/anaconda3/envs/stripes/include/python3.8 -c /tmp/tmpdye2s9d6.cpp -o tmp/tmpdye2s9d6.o -fvisibility=hidden
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  creating build
  creating build/temp.linux-x86_64-3.8
  creating build/temp.linux-x86_64-3.8/src
  gcc -pthread -B /home/blackpianocat/anaconda3/envs/stripes/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/.eggs/pybind11-2.9.0-py3.8.egg/pybind11/include -I/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/.eggs/pybind11-2.9.0-py3.8.egg/pybind11/include -Isrc -I/home/blackpianocat/anaconda3/envs/stripes/include/python3.8 -c src/straw.cpp -o build/temp.linux-x86_64-3.8/src/straw.o -DVERSION_INFO=\"1.0.0.1\" -std=c++14 -fvisibility=hidden
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  src/straw.cpp:33:10: fatal error: curl/curl.h: No such file or directory
     33 | #include <curl/curl.h>
        |          ^~~~~~~~~~~~~
  compilation terminated.
  error: command '/usr/bin/gcc' failed with exit code 1
  ----------------------------------------
  ERROR: Failed building wheel for hic-straw
  Running setup.py clean for hic-straw
Failed to build hic-straw
Installing collected packages: pybind11, hic-straw
    Running setup.py install for hic-straw ... error
    ERROR: Command errored out with exit status 1:
     command: /home/blackpianocat/anaconda3/envs/stripes/bin/python3.8 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/setup.py'"'"'; __file__='"'"'/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-c85n5job/install-record.txt --single-version-externally-managed --compile --install-headers /home/blackpianocat/anaconda3/envs/stripes/include/python3.8/hic-straw
         cwd: /tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/
    Complete output (19 lines):
    running install
    /home/blackpianocat/anaconda3/envs/stripes/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
      warnings.warn(
    running build
    running build_ext
    gcc -pthread -B /home/blackpianocat/anaconda3/envs/stripes/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/blackpianocat/anaconda3/envs/stripes/include/python3.8 -c /tmp/tmpp8r189wy.cpp -o tmp/tmpp8r189wy.o -std=c++14
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    gcc -pthread -B /home/blackpianocat/anaconda3/envs/stripes/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/blackpianocat/anaconda3/envs/stripes/include/python3.8 -c /tmp/tmpv7ygvt37.cpp -o tmp/tmpv7ygvt37.o -fvisibility=hidden
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    creating build
    creating build/temp.linux-x86_64-3.8
    creating build/temp.linux-x86_64-3.8/src
    gcc -pthread -B /home/blackpianocat/anaconda3/envs/stripes/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/blackpianocat/anaconda3/envs/stripes/lib/python3.8/site-packages/pybind11/include -I/home/blackpianocat/anaconda3/envs/stripes/lib/python3.8/site-packages/pybind11/include -Isrc -I/home/blackpianocat/anaconda3/envs/stripes/include/python3.8 -c src/straw.cpp -o build/temp.linux-x86_64-3.8/src/straw.o -DVERSION_INFO=\"1.0.0.1\" -std=c++14 -fvisibility=hidden
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    src/straw.cpp:33:10: fatal error: curl/curl.h: No such file or directory
       33 | #include <curl/curl.h>
          |          ^~~~~~~~~~~~~
    compilation terminated.
    error: command '/usr/bin/gcc' failed with exit code 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /home/blackpianocat/anaconda3/envs/stripes/bin/python3.8 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/setup.py'"'"'; __file__='"'"'/tmp/pip-install-3gfpxurl/hic-straw_932f4cfa27aa46ecb6a04ff979c5a168/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-c85n5job/install-record.txt --single-version-externally-managed --compile --install-headers /home/blackpianocat/anaconda3/envs/stripes/include/python3.8/hic-straw Check the logs for full command output.

I also ran these commands (the commonly suggested fix for this problem) and still get the same error:

sudo apt-get install manpages-dev
sudo apt-get install gcc python-dev
sudo apt-get install gcc python3-dev
sudo apt-get install python3-dev
sudo apt install build-essential
sudo apt install libxslt-dev libffi-dev libssl-dev
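
The build log above stops at `fatal error: curl/curl.h: No such file or directory`, and none of the packages listed is a libcurl development package. A quick way to check whether that header is present is sketched below. The candidate directories are assumptions about a typical Debian/Ubuntu layout, and `find_header` is a hypothetical helper. On such systems the header is normally provided by a libcurl development package (for example libcurl4-openssl-dev), though the thread does not confirm that this was the fix here.

```python
import os

# Assumed include directories for a typical Debian/Ubuntu system;
# adjust this list for other layouts.
CANDIDATE_INCLUDE_DIRS = [
    "/usr/include",
    "/usr/include/x86_64-linux-gnu",
    "/usr/local/include",
]


def find_header(relative_path, include_dirs):
    """Return the first existing path of the header under include_dirs, or None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    location = find_header(os.path.join("curl", "curl.h"), CANDIDATE_INCLUDE_DIRS)
    if location is None:
        print("curl/curl.h not found in the candidate directories")
    else:
        print(f"curl/curl.h found at {location}")
```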

BlackPianoCat commented 2 years ago

OK, I finally found a solution: installing on Python 3.9.9 with

 python3 -m pip install hic-straw

instead of

pip install hic-straw

which is what the documentation recommends. I do not know why, but with the recommended command it runs in 20 minutes, with python3 -m pip it runs in 2-3 minutes, whereas in Colab it runs in 30 seconds. Maybe the internet connection also plays a role. I don't know whether it is the command or the Python version that makes it faster (I am not the best programmer).
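
One possible explanation, which is an assumption and not confirmed in this thread, is that the bare pip on PATH belongs to a different interpreter than python3, so the two commands install into different environments. The sketch below reports which interpreter is running, which pip is first on PATH, and which file `import straw` would load. `environment_report` is a hypothetical helper name.

```python
import importlib.util
import shutil
import sys


def environment_report(module_name="straw"):
    """Collect the interpreter path, the pip found on PATH, and the module's file."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "pip_on_path": shutil.which("pip"),
        # None when the module is not installed for this interpreter.
        "module_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If the pip path and the interpreter path point into different environments, running `python3 -m pip` is the safer choice, because it always installs for the interpreter that runs it.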