Open lucygarner opened 4 years ago
Have you tried some of the suggestions in #163 ?
Thank you - yes I have tried using dask=1.0.0, distributed >=1.21.6, <2.0.0 and pandas 0.25.3, but this did not work.
I also tried to install dask 2.11.0 instead, but I got the following errors:
ERROR: pyscenic 0.10.1 has requirement dask==1.0.0, but you'll have dask 2.11.0 which is incompatible.
ERROR: pyscenic 0.10.1 has requirement distributed<2.0.0,>=1.21.6, but you'll have distributed 2.11.0 which is incompatible.
ERROR: pyscenic 0.10.1 has requirement pandas<1.0.0,>=0.20.1, but you'll have pandas 1.0.4 which is incompatible.
Which versions would you recommend trying?
I am using Conda environments, so if you have a .yml file for a Conda environment where pyscenic is working, I could give that a try
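The pip errors above list the pins that pyscenic 0.10.1 declares. A minimal sketch for checking a set of installed versions against those pins before starting a run is given here. The `PINS` table is copied from the error messages; the helper names and the simplified version parsing (purely numeric dotted versions) are assumptions for illustration and are not part of pySCENIC.

```python
import operator

# Pins copied from the pip error messages for pyscenic 0.10.1.
PINS = {
    "dask": [("==", "1.0.0")],
    "distributed": [(">=", "1.21.6"), ("<", "2.0.0")],
    "pandas": [(">=", "0.20.1"), ("<", "1.0.0")],
}

# Comparison operators used by the pins above.
OPS = {"==": operator.eq, ">=": operator.ge, "<": operator.lt}


def parse(version):
    """Turn a dotted numeric version such as '1.21.6' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(package, version):
    """Return True if `version` meets every pin recorded for `package`."""
    installed = parse(version)
    return all(OPS[op](installed, parse(bound)) for op, bound in PINS[package])


def report(installed):
    """Map each package name to whether its installed version satisfies the pins."""
    return {name: satisfies(name, version) for name, version in installed.items()}


if __name__ == "__main__":
    # Versions from the failed upgrade attempt described in this thread.
    print(report({"dask": "2.11.0", "distributed": "2.11.0", "pandas": "1.0.4"}))
```

The versions passed to `report` would normally come from `conda list` or `pip freeze`; a pre-release or otherwise non-numeric version string would need a fuller parser than this sketch provides.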
If none of the dask version tweaks have worked for you, I would then try using the arboreto_with_multiprocessing.py script described in that post.
I suggest removing dask from pyscenic. It brings many more difficulties than efficiency boosts.
Thank you @cflerin, I have tried using the arboreto_with_multiprocessing.py script. I am now getting the following error with both approaches (pyscenic grn or arboreto_with_multiprocessing.py):
2020-06-04 20:37:40,351 - pyscenic.cli.pyscenic - INFO - Writing results to file.
preparing dask client
parsing input
creating dask graph
24 partitions
computing dask graph
not shutting down client, client was created externally
finished
Traceback (most recent call last):
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.7/site-packages/pyscenic/cli/pyscenic.py", line 421, in main
    args.func(args)
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.7/site-packages/pyscenic/cli/pyscenic.py", line 80, in find_adjacencies_command
    extension = PurePath(fname).suffixes
NameError: name 'fname' is not defined
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-14, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-19, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-9, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-15, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-20, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-3, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-10, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-21, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-22, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-16, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-6, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-13, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-11, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-12, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-17, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-23, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-4, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-5, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-18, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-7, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-24, started daemon)>
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-8, started daemon)>
distributed.nanny - WARNING - Worker process 125161 was killed by unknown signal
/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.7/site-packages/dask/config.py:161: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  data = yaml.load(f.read()) or {}
I am using a Conda environment containing the following packages:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_llvm conda-forge
arboreto 0.1.5 pypi_0 pypi
attrs 19.3.0 pypi_0 pypi
bokeh 2.0.1 py37hc8dfbb8_0 conda-forge
boltons 20.1.0 pypi_0 pypi
ca-certificates 2020.4.5.1 hecc5488_0 conda-forge
certifi 2020.4.5.1 py37hc8dfbb8_0 conda-forge
click 7.1.2 pyh9f0ad1d_0 conda-forge
cloudpickle 1.4.1 py_0 conda-forge
cytoolz 0.10.1 py37h516909a_0 conda-forge
dask 1.0.0 py_1 conda-forge
dask-core 1.0.0 py_0 conda-forge
decorator 4.4.2 pypi_0 pypi
dill 0.3.1.1 pypi_0 pypi
distributed 1.28.1 py37_0 conda-forge
freetype 2.10.2 he06d7ca_0 conda-forge
frozendict 1.2 pypi_0 pypi
h5py 2.10.0 pypi_0 pypi
heapdict 1.0.1 py_0 conda-forge
interlap 0.2.6 pypi_0 pypi
jinja2 2.11.2 pyh9f0ad1d_0 conda-forge
joblib 0.15.1 pypi_0 pypi
jpeg 9d h516909a_0 conda-forge
ld_impl_linux-64 2.34 h53a641e_4 conda-forge
libblas 3.8.0 16_openblas conda-forge
libcblas 3.8.0 16_openblas conda-forge
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc-ng 9.2.0 h24d8f2e_2 conda-forge
libgfortran-ng 7.5.0 hdf63c60_6 conda-forge
liblapack 3.8.0 16_openblas conda-forge
libopenblas 0.3.9 h5ec1e0e_0 conda-forge
libpng 1.6.37 hed695b0_1 conda-forge
libstdcxx-ng 9.2.0 hdf63c60_2 conda-forge
libtiff 4.1.0 hc7e4089_6 conda-forge
libwebp-base 1.1.0 h516909a_3 conda-forge
llvm-openmp 10.0.0 hc9558a2_0 conda-forge
llvmlite 0.32.1 pypi_0 pypi
locket 0.2.0 py_2 conda-forge
loompy 3.0.6 pypi_0 pypi
lz4-c 1.9.2 he1b5a44_1 conda-forge
markupsafe 1.1.1 py37h8f50634_1 conda-forge
msgpack-python 0.6.2 py37hc9558a2_0 conda-forge
multiprocessing-on-dill 3.5.0a4 pypi_0 pypi
ncurses 6.1 hf484d3e_1002 conda-forge
networkx 2.4 pypi_0 pypi
numba 0.49.1 pypi_0 pypi
numpy 1.18.4 py37h8960a57_0 conda-forge
numpy-groupies 0+unknown pypi_0 pypi
olefile 0.46 py_0 conda-forge
openssl 1.1.1g h516909a_0 conda-forge
packaging 20.4 pyh9f0ad1d_0 conda-forge
pandas 0.25.3 py37hb3f55d8_0 conda-forge
partd 1.1.0 py_0 conda-forge
pillow 7.1.2 py37h718be6c_0 conda-forge
pip 20.1.1 py_1 conda-forge
psutil 5.7.0 py37h8f50634_1 conda-forge
pyarrow 0.16.0 pypi_0 pypi
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyscenic 0.10.1 pypi_0 pypi
python 3.7.6 cpython_h8356626_6 conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python_abi 3.7 1_cp37m conda-forge
pytz 2020.1 pyh9f0ad1d_0 conda-forge
pyyaml 5.3.1 py37h8f50634_0 conda-forge
readline 8.0 hf8c457e_0 conda-forge
scikit-learn 0.23.1 pypi_0 pypi
scipy 1.4.1 pypi_0 pypi
setuptools 47.1.1 py37hc8dfbb8_0 conda-forge
six 1.15.0 pyh9f0ad1d_0 conda-forge
sortedcontainers 2.1.0 py_0 conda-forge
sqlite 3.30.1 hcee41ef_0 conda-forge
tbb 2020.0.133 pypi_0 pypi
tblib 1.6.0 py_0 conda-forge
threadpoolctl 2.1.0 pypi_0 pypi
tk 8.6.10 hed695b0_0 conda-forge
toolz 0.10.0 py_0 conda-forge
tornado 6.0.4 py37h8f50634_1 conda-forge
tqdm 4.46.1 pypi_0 pypi
typing_extensions 3.7.4.2 py_0 conda-forge
umap-learn 0.4.3 pypi_0 pypi
wheel 0.34.2 py_1 conda-forge
xz 5.2.5 h516909a_0 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
zict 2.0.0 py_0 conda-forge
zlib 1.2.11 h516909a_1006 conda-forge
zstd 1.4.4 h6597ccf_3 conda-forge
What is this issue related to?
Best, Lucy
Hi @lc822 ,
First, this:
File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.7/site-packages/pyscenic/cli/pyscenic.py", line 80, in find_adjacencies_command
extension = PurePath(fname).suffixes
NameError: name 'fname' is not defined
is a bug in pyscenic grn, which has now been fixed in release 0.10.2.
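For context, the failing line tries to read the output file's extension. The sketch below shows what `PurePath(...).suffixes` returns and how referring to a name that was never assigned produces the same `NameError`. The function names and example paths are assumptions for illustration only; this is not the change made in pySCENIC 0.10.2.

```python
from pathlib import PurePath


def output_suffixes(path):
    """Return the list of file extensions for an output path."""
    return PurePath(path).suffixes


def broken_lookup():
    """Mimic the bug: `fname` is never assigned, so the lookup fails at call time."""
    return PurePath(fname).suffixes  # noqa: F821 (deliberately undefined name)


if __name__ == "__main__":
    print(output_suffixes("results/adjacencies.csv"))
    print(output_suffixes("results/adjacencies.csv.gz"))
    try:
        broken_lookup()
    except NameError as error:
        print("NameError:", error)
```

Because Python resolves names only when the line executes, the error appears after the network inference has finished, which matches the log above where "finished" is printed before the traceback.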
Second, the output you pasted was not from the arboreto_with_multiprocessing.py script. I think you'll have the most luck with this method, so if you can share the command you're running and the error you're getting, it would help a lot.
Thank you - I will update to 0.10.2.
This is my command for arboreto_with_multiprocessing.py - can you see anything wrong with this?
python arboreto_with_multiprocessing.py data/merged_all_analysed.loom resources/tfs_list/lambert2018.txt --output results/adjacencies.csv --num_workers 20
Looks good!
Would you mind describing how to get the arboreto_with_multiprocessing script?
How do I download it and where should it be stored on my laptop? I have version 0.10.2, but the script doesn't look like it's in my pyscenic package. Also, once I have it, can I import it in a Jupyter notebook?
Hi @Annika18,
I couldn't find the script within the package either, so I just copied it from the GitHub page. It should then be possible to run the script from your Jupyter Notebook.
Best, Lucy
Hi @Annika18 ,
You can download the script with wget:
wget https://raw.githubusercontent.com/aertslab/pySCENIC/master/scripts/arboreto_with_multiprocessing.py
wget https://raw.githubusercontent.com/aertslab/pySCENIC/master/src/pyscenic/cli/arboreto_with_multiprocessing.py
You can store it anywhere, but you'll need to have pySCENIC installed to use it. I would recommend running it directly from the command line, then importing the output into a notebook.
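As a rough sketch of the "import the output into a notebook" step, the snippet below reads an adjacencies file and lists the strongest edges. It assumes a comma-separated file with the columns `TF`, `target` and `importance`; those column names, the separator and the `top_edges` helper are assumptions that should be checked against the file the script actually wrote.

```python
import csv


def top_edges(path, n=10):
    """Return the n highest-importance (TF, target, importance) rows from a CSV file."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    # Sort by importance, strongest regulatory link first.
    rows.sort(key=lambda row: float(row["importance"]), reverse=True)
    return [
        (row["TF"], row["target"], float(row["importance"]))
        for row in rows[:n]
    ]


if __name__ == "__main__":
    # Path taken from the command earlier in this thread.
    for tf, target, importance in top_edges("results/adjacencies.csv", n=5):
        print(tf, target, importance)
```

In a notebook with pandas available, `pandas.read_csv` on the same file would serve the same purpose; the standard-library version is used here only to keep the sketch dependency-free.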
Thank you. Do you have any advice on picking num_workers? My computer has 4 cores, should I use 4? Why is 20 used in the example in the FAQ?
Hi @Annika18,
This is likely because most people will be running pySCENIC on a high performance computing cluster. If you only have 4 cores available, then you should go with that.
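One way to avoid hard-coding the worker count is to cap the requested value at the number of cores the machine reports. This is a small sketch under that assumption; `pick_num_workers` is a hypothetical helper, not part of pySCENIC, and on a shared cluster the scheduler's allocation may be lower than what `os.cpu_count()` returns.

```python
import os


def pick_num_workers(requested, available=None):
    """Cap the requested worker count at the number of available cores (minimum 1)."""
    if available is None:
        # os.cpu_count() can return None when the count is undetermined.
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


if __name__ == "__main__":
    # With 4 cores, a request for 20 workers is reduced to 4.
    print(pick_num_workers(20, available=4))
    # Without an explicit core count, the local machine is queried.
    print(pick_num_workers(20))
```

The result can then be passed as the value of `--num_workers` in the command shown earlier in the thread.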
Hi,
wget https://raw.githubusercontent.com/aertslab/pySCENIC/master/scripts/arboreto_with_multiprocessing.py
I see that this script is no longer available in the scripts folder.
Hello,
Why isn't arboreto_with_multiprocessing.py available anymore? I had exactly the same issue on 3 different machines (seriously guys, Dask sucks).
Hi @davisidarta @saeedfc, I reuploaded arboreto_with_multiprocessing.py here, in case you would like to download it from there: https://github.com/Annika18/arboreto-multi-reupload/blob/master/arboreto_with_multiprocessing.py All credit goes to aertslab for developing the code; I just want to help out because I also couldn't use the dask implementation.
Hey @saeedfc , @davisidarta ,
Sorry about that, I moved the script to a different folder because it's now built into the pySCENIC CLI. So if you have pySCENIC 0.10.3 or higher, you don't need to download it anymore; it's available on the path to run directly. The new location is here: https://raw.githubusercontent.com/aertslab/pySCENIC/master/src/pyscenic/cli/arboreto_with_multiprocessing.py and I'll edit the post above.
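A quick way to confirm whether the script is reachable in the active environment is to look it up on the `PATH`. The sketch below uses `shutil.which` for that; the assumption that the installed entry point keeps the name `arboreto_with_multiprocessing.py` follows from the comment above and should be verified for the installed version.

```python
import shutil


def find_on_path(command="arboreto_with_multiprocessing.py"):
    """Return the full path of `command` if it is on PATH, otherwise None."""
    return shutil.which(command)


if __name__ == "__main__":
    location = find_on_path()
    if location is None:
        print("Script not found on PATH; check the pySCENIC version or download it manually.")
    else:
        print("Script found at:", location)
```

Run this with the same Python interpreter and Conda environment that will be used for the analysis, since the `PATH` differs between environments.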
Hi @Annika18 @cflerin !
Thank you. The arboreto_with_multiprocessing.py script worked perfectly, although I had to perform downstream analysis with the CLI and deal with the loom file in scanpy (for some reason the notebooks didn't work on my local machine). Still, I was able to export the results to SCope. You guys rock!
Describe the bug: Error when running pyscenic grn. I am using an older version of dask (1.0.0) as previously suggested.
Steps to reproduce the behavior
Command run when the error occurred:
Error encountered:
Expected behavior: Expected pyscenic grn to produce the output file: filtered_adjacencies.csv
Please complete the following information: