Error: from bayestme import deconvolution

Jihua-Liu commented 1 year ago

Hi, I installed the package using SuiteSparse. When I try to run the example you provide, I cannot import deconvolution. I'm currently using numpy version 1.22.0. I attached the error message below. Could you please take a look? Thanks!

jeffquinn-msk commented 1 year ago

Hi Jihua-Liu,

Thanks for taking the time to file an issue.

Unfortunately, one of the packages we rely on, scikit-sparse (https://github.com/scikit-sparse/scikit-sparse), is difficult to install on OSX. The error message you are showing me indicates that it did not get linked to the correct binaries during installation. You'll probably need to use conda to install it. I have a sample conda environment file checked into the repo here, which is what I personally use for local development: https://github.com/tansey-lab/bayestme/blob/main/bayestme.conda.yml

This is why we recommend (in our docs) running the pipeline inside our docker container, which is hosted here: https://hub.docker.com/repository/docker/jeffquinnmsk/bayestme/general

This way you will be guaranteed to have everything working cleanly.

Jihua-Liu commented 1 year ago

Thanks for your reply! Since I'm using pyenv, can I use conda to reinstall scikit-sparse?

I tried to install bayestme using the docker container. But because I have never used docker before, I don't know how to import bayestme from docker in an .ipynb file. Or do I have to upload my data to docker to perform the analysis?

Thanks!

jeffquinn-msk commented 1 year ago

Hello,

Sorry, I'm not familiar with pyenv, so I cannot comment on that.

I would strongly recommend not trying to install this package unless you want to create an environment for developing (aka contributing to the project). Using the docker image will be much easier.

Here's an example of what you might run in your terminal (after you install docker desktop on your machine):

docker run -v <path to your data>:/data -v <path to output directory>:/output jeffquinnmsk/bayestme:latest <bayestme command>

Replace <path to your data> and <path to output directory> with the absolute paths to where your data is on your filesystem and where you want the output to go.

Replace <bayestme command> with the appropriate commands and flags from the example CLI workflow: https://bayestme.readthedocs.io/en/latest/example_workflow.html

The -v flag allows the process running in the container to access the specified directories on your local filesystem. Inside the container they appear as /data and /output.
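To make the mount mapping concrete, here is a minimal sketch of how a host path translates into the path the container sees for a single -v HOST_DIR:MOUNT_POINT pair. The helper name to_container_path is hypothetical and is not part of bayestme or docker; it only does string manipulation.

```shell
# Hypothetical helper (not part of bayestme or docker): translate a host path
# into the path the container sees, given one -v HOST_DIR:MOUNT_POINT pair.
to_container_path() {
    host_dir=${1%/}      # host side of the -v flag, trailing slash removed
    mount_point=${2%/}   # container side of the -v flag
    host_path=$3         # file or directory on the host
    case "$host_path" in
        "$host_dir")
            printf '%s\n' "$mount_point" ;;
        "$host_dir"/*)
            suffix=${host_path#"$host_dir"}   # part of the path below the mounted directory
            printf '%s\n' "$mount_point$suffix" ;;
        *)
            echo "error: $host_path is not under $host_dir" >&2
            return 1 ;;
    esac
}

# Example with the paths used in this thread; prints /data/sample_1
to_container_path /Users/jeff/spacranger_data /data /Users/jeff/spacranger_data/sample_1
```

Anything outside the mounted directory is simply not visible to the container, which is why the helper reports an error for such paths.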

For example, with variables filled in, I might run the following steps:

docker run -v /Users/jeff/spacranger_data:/data -v /Users/jeff/bayestme_results:/output \
    jeffquinnmsk/bayestme:latest \
    load_spaceranger \
    --input /data/sample_1 \
    --output /output/dataset.h5ad
docker run -v /Users/jeff/spacranger_data:/data -v /Users/jeff/bayestme_results:/output \
    jeffquinnmsk/bayestme:latest filter_genes \
    --adata /output/dataset.h5ad \
    --filter-ribosomal-genes \
    --n-top-by-standard-deviation 1000 \
    --output /output/dataset_filtered.h5ad
etc.

There are many guides for using docker available online.
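Since every step repeats the same docker run prefix, a small wrapper function can cut down on typing. This is only a sketch: bayestme_step, DATA_DIR, OUTPUT_DIR, and DOCKER are hypothetical names introduced here, not something bayestme provides. The image name and the mount points come from the commands above.

```shell
# Hypothetical wrapper (not part of bayestme): run one bayestme step in the container.
# DATA_DIR and OUTPUT_DIR are assumed to hold absolute host paths.
# DOCKER can be set to another command (e.g. echo) for a dry run.
bayestme_step() {
    "${DOCKER:-docker}" run \
        -v "$DATA_DIR":/data \
        -v "$OUTPUT_DIR":/output \
        jeffquinnmsk/bayestme:latest "$@"
}

# Usage, mirroring the first step above:
#   DATA_DIR=/Users/jeff/spacranger_data
#   OUTPUT_DIR=/Users/jeff/bayestme_results
#   bayestme_step load_spaceranger --input /data/sample_1 --output /output/dataset.h5ad
```

Setting DOCKER=echo prints the full command instead of running it, which is a cheap way to check the mounts before starting a long step.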

Now even EASIER than the above is to use our new nextflow (https://www.nextflow.io/docs/latest/channel.html) workflow. It will run all the steps of bayestme in one go, in docker, as opposed to having to execute multiple commands in sequence.

nextflow run https://github.com/tansey-lab/bayestme -r main -params-file '<path to params yaml>'

The params you need to define in your params yaml file are documented here: https://github.com/tansey-lab/bayestme/blob/main/nextflow/nextflow.config

I hope one of these options is suitable for you, let me know how it goes.

Jihua-Liu commented 1 year ago

Thank you so much for your detailed instruction! This is really helpful and I'll test it out.

Would you be able to give any guidance about using docker with a Jupyter notebook? Like the Google Colab example you provided, I want to use bayestme in a Jupyter notebook rather than in the terminal. This requires me to import bayestme.

I have already installed Docker Desktop and pulled bayestme there. Is there any way I can use this from Python rather than through the command-line interface?

Thank you!

Jihua-Liu commented 1 year ago

Sorry for another comment. I tried to use the absolute path of my input file (10X visium data) and this error message pops up. Could you please take a look? Thank you so much!

jeffquinn-msk commented 1 year ago

This error happens because it can't read a file it expected to find in that folder, which in turn happens because you supplied a folder path that does not exist inside the container.

Assuming a folder named A1_318/outs is inside /Users/jihua/Desktop/Methodolist/projects/ST/data, your --input argument should be something like /data/A1_318/outs (it should definitely start with /data, since that's the path you mounted your input directory to).
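One way to catch this kind of mistake before starting the container is a quick pre-flight check on the host. The sketch below assumes the input directory is mounted at /data, as in the commands earlier in this thread. The function name container_input is hypothetical and not part of bayestme.

```shell
# Hypothetical pre-flight check (not part of bayestme): confirm the folder exists
# on the host, then print the container-side value to pass as --input.
container_input() {
    host_dir=${1%/}    # directory mounted with -v "$host_dir":/data
    subdir=$2          # folder inside it, e.g. A1_318/outs
    if [ ! -d "$host_dir/$subdir" ]; then
        echo "error: $host_dir/$subdir does not exist on the host" >&2
        return 1
    fi
    printf '/data/%s\n' "$subdir"
}

# Example (host path taken from this thread; adjust to your own layout):
#   container_input /Users/jihua/Desktop/Methodolist/projects/ST/data A1_318/outs
```

If the folder is missing, the function fails on the host with a readable message instead of a file-not-found error from inside the container.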

> Would you be able to give any guidance about using docker with a Jupyter notebook? Like the Google Colab example you provided, I want to use bayestme in a Jupyter notebook rather than in the terminal. This requires me to import bayestme.

It's very simple (and there are many guides for this online as well): start the container with docker port forwarding turned on for whatever port Jupyter uses. I think it's 8888, so you would add a -p 8888:8888 flag (host port:container port) in addition to the -v flags I showed you.

Your command would look like this:

docker run -p 8888:8888 -it -v /Users/jeff/spacranger_data:/data -v /Users/jeff/bayestme_results:/output \
    jeffquinnmsk/bayestme:latest /bin/bash

This will let you "SSH" into the container in a sense, as we will just run bash inside the container.

Inside this "SSH" session, run pip install jupyter, then run the jupyter-notebook command, which will start the server inside the container. You will then be able to connect to the Jupyter server from your computer at localhost:8888.

Another note: if you intend to run the entire BayesTME pipeline on your laptop, it might not be a good idea. In its current form it is a rather computationally intensive method with a high memory footprint. We are shipping a new version soon that will be less so, but it's not ready yet. I would recommend running on your institution's HPC if you have one. You can still play around with it locally, though, and maybe run some of the steps with a low sample number.
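One detail that often trips people up: by default a Jupyter server listens only on the container's own localhost, so the published port cannot reach it. The sketch below collects the standard Jupyter flags that are typically needed in this situation. It assumes the container port is published to the same host port (for example -p 8888:8888) and that the container user is root. Neither is confirmed for this image, and jupyter_container_args is a hypothetical helper name.

```shell
# Hypothetical helper: flags for a Jupyter server that should be reachable
# through a published Docker port.
jupyter_container_args() {
    port=${1:-8888}
    # --ip=0.0.0.0 : listen on all interfaces, not only the container's localhost
    # --no-browser : do not try to open a browser inside the container
    # --allow-root : only needed if the container user is root (an assumption here)
    printf '%s\n' "--ip=0.0.0.0 --port=$port --no-browser --allow-root"
}

# Inside the container's bash session one might then run:
#   pip install jupyter
#   jupyter notebook $(jupyter_container_args 8888)
```

Jupyter prints a URL containing an access token on startup. Opening that URL with localhost as the host name should connect from the machine running docker.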

jeffquinn-msk commented 1 year ago

Another note: the output anndata archive that bayestme produces contains all the results of the analysis, so you can play around with them in your own Jupyter environment after the fact. There's probably no need to do interactive analysis inside the container. Just run the methods and then explore the results in whatever environment you choose. All you need is to install the anndata library to read the resulting h5ad archive. BayesTME also produces high-quality plots for everything it does as part of the pipeline, so you can look at those as well. This was my intention with the design, anyway; let me know if it makes sense.

Jihua-Liu commented 1 year ago

It works! Thank you so much for your help :) I'm looking forward to the new version and good luck with it!

jeffquinn-msk commented 1 year ago

Awesome! No problem, let us know if you need any other help with your analysis, we're grateful to have people trying out our method on their data.

tansey-lab / bayestme

Error: from bayestme import deconvolution #107