Open gianfilippo opened 6 months ago
Dear Gianfilippo,
Thank you for your interest in MetacellAnalysisToolkit.
Using the following command lines, we obtained no error:
singularity pull docker://agabriel/matk:v1.0
singularity run --bind $(pwd) matk_v1.0.sif MATK -t SuperCell -i data/cd34_multiome_rna.h5ad -o MATK_output/SuperCell/cd34/ -n 50 -f 2000 -k 30 -g 75 -s seurat
We suspect that we are not using the same input file. Unfortunately, the link to download cd34_multiome_rna.h5ad initially provided in our README seems to be corrupted at the moment. We apologize for the inconvenience and have updated the link. Could you please download the data again and check that you obtain the following md5: 4cd8d82adfe267f54e13d8a383918fd0
by running:
md5sum data/cd34_multiome_rna.h5ad
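The checksum comparison can also be scripted so that a mismatch is reported explicitly. The following is a minimal sketch, assuming GNU md5sum is available on the host; verify_md5 is a hypothetical helper name and not part of MetacellAnalysisToolkit.

```shell
# Compare the MD5 digest of a file against an expected value.
# verify_md5 is a hypothetical helper, not part of MATK.
verify_md5() {
  file="$1"
  expected="$2"
  if [ ! -f "$file" ]; then
    echo "missing file: $file" >&2
    return 2
  fi
  # md5sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(md5sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual, expected $expected)" >&2
    return 1
  fi
}

# Usage with the checksum quoted above:
# verify_md5 data/cd34_multiome_rna.h5ad 4cd8d82adfe267f54e13d8a383918fd0
```

A non-zero return value means the file is missing or should be downloaded again before running MATK.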
We can also run a test on another dataset. After cloning/pulling the current MetacellAnalysisToolkit repository you can try the following command lines:
singularity pull docker://agabriel/matk:v1.0
singularity run --bind $(pwd) matk_v1.0.sif python get_data/get_PBMC_dataset.py
singularity run --bind $(pwd) matk_v1.0.sif Rscript get_data/get_PBMC_rds.R
singularity run --bind $(pwd) matk_v1.0.sif MATK -t SuperCell -i get_data/pbmc.h5ad -o MATK_output/SuperCell/pbmc/ -n 50 -f 2000 -k 30 -g 75 -s seurat
singularity run --bind $(pwd) matk_v1.0.sif MATK -t SuperCell -i get_data/pbmc.rds -o MATK_output/SuperCell/pbmc/ -n 50 -f 2000 -k 30 -g 75 -s seurat
Finally, for future usage, please note that matk:v1.0 is based on Seurat V4 and matk:v1.1 on Seurat V5. The command lines described above should run with both docker environments.
Best wishes.
Hi,
thanks, but I still get the same error. I think both the conda version and the Docker version look into my local Python path.
What do you suggest ?
Thanks
Hi,
Thanks for the feedback, I agree that it could be the case considering the error message. I am surprised though that this happens in the docker container.
Could you try to identify which Python is used by running:
singularity run --bind $(pwd) matk_v1.0.sif which python
In my case, I obtain the following: /opt/conda/envs/MetacellAnalysisToolkit/bin/python
I think that Singularity behaves unexpectedly and also mounts the HOME directory. Could you provide the output of the following command:
singularity run --bind $(pwd) matk_v1.0.sif Rscript -e "reticulate::py_config()"
and then do the same adding the --no-home option:
singularity run --no-home --bind $(pwd) matk_v1.0.sif Rscript -e "reticulate::py_config()"
I suspect that adding --no-home could solve your issue.
Note that I had to update the Docker containers; please pull them again using singularity before running your tests.
Let me know if this helps and I will update the README accordingly.
Best wishes, Aurélie
Hi,
thanks. I am puzzled as well.
Anyway,
1) singularity run --bind $(pwd) matk_v1.0.sif which python
/opt/conda/envs/MetacellAnalysisToolkit/bin/python
2) singularity run --bind $(pwd) matk_v1.0.sif Rscript -e "reticulate::py_config()"
python: /home/XXX/.conda/envs/r-reticulate/bin/python
libpython: /home/XXX/.conda/envs/r-reticulate/lib/libpython3.10.so
pythonhome: /home/XXX/.conda/envs/r-reticulate:/home/XXX/.conda/envs/r-reticulate
version: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
numpy: [NOT FOUND]
3) singularity run --no-home --bind $(pwd) matk_v1.0.sif Rscript -e "reticulate::py_config()"
python: /home/XXX/.conda/envs/r-reticulate/bin/python
libpython: /home/XXX/.conda/envs/r-reticulate/lib/libpython3.10.so
pythonhome: /home/XXX/.conda/envs/r-reticulate:/home/XXX/.conda/envs/r-reticulate
version: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
numpy: [NOT FOUND]
I should also mention that the singularity run command looks into my R_LIBS_USER even if I add the "--no-home" flag, and exits early in the process with a different error:
Error: package or namespace load failed for ‘Seurat’ in dyn.load(file, DLLpath = DLLpath, ...):
If I unset R_LIBS_USER, then I am back to the error I reported, even with the "--no-home" flag.
The python path seems to be correct, so I do not understand why I am getting the error.
What do you think ?
Best
Hi,
Sorry for the delay, I was unavailable during the past week.
Could you please provide the output of the following command:
singularity run --no-home --cleanenv --bind $(pwd) matk_v1.0.sif Rscript -e "reticulate::py_config(); .libPaths()"
And then:
singularity run --no-home --cleanenv --env R_LIBS_USER=/opt/conda/envs/MetacellAnalysisToolkit/lib/R/library --bind $(pwd) matk_v1.0.sif Rscript -e "reticulate::py_config(); .libPaths()"
Additionally, could you let me know the version of Singularity you are using? I would like to try to reproduce the error.
Finally, something that could help us debug would be to check the environment variables:
singularity exec --no-home --cleanenv --env R_LIBS_USER=/opt/conda/envs/MetacellAnalysisToolkit/lib/R/library --bind $(pwd) matk_v1.0.sif env
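Since the full env output is long, it can help to keep only the variables that typically decide which R library and Python interpreter are picked up. The following is a minimal sketch; filter_r_python_env is a hypothetical helper name, and the list of variables is an assumption about what is relevant here, not an exhaustive list.

```shell
# Read "NAME=value" lines on stdin and keep only variables that commonly
# influence R library paths or the Python interpreter used by reticulate.
# The variable list is an assumption; extend it as needed.
filter_r_python_env() {
  grep -E '^(PATH|HOME|R_LIBS|R_LIBS_USER|R_LIBS_SITE|RETICULATE_PYTHON|PYTHONPATH|PYTHONHOME|CONDA_PREFIX)=' | sort
}

# Usage, inside the container:
# singularity exec --no-home --cleanenv --bind "$(pwd)" matk_v1.0.sif env | filter_r_python_env
# Usage, on the host, to see what could leak in without --cleanenv:
# env | filter_r_python_env
```

Comparing the host and container outputs side by side shows which host settings survive into the container.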
Best,
Aurélie
Hi,
thanks for looking into this! The output of the first command:
python: /opt/conda/envs/MetacellAnalysisToolkit/bin/python3
libpython: /opt/conda/envs/MetacellAnalysisToolkit/lib/libpython3.9.so
pythonhome: /opt/conda/envs/MetacellAnalysisToolkit:/opt/conda/envs/MetacellAnalysisToolkit
version: 3.9.16 | packaged by conda-forge | (main, Feb 1 2023, 21:39:03) [GCC 11.3.0]
numpy: /opt/conda/envs/MetacellAnalysisToolkit/lib/python3.9/site-packages/numpy
numpy_version: 1.24.4
NOTE: Python version was forced by PATH
python versions found:
/opt/conda/envs/MetacellAnalysisToolkit/bin/python3
/opt/conda/envs/MetacellAnalysisToolkit/bin/python
[1] "/opt/conda/envs/MetacellAnalysisToolkit/lib/R/library"
I am actually using Apptainer 1.2.5-1.el8.
The output from the second command:
APPTAINER_APPNAME=
APPTAINER_BIND=/home/$USERID/scripts/MetacellAnalysisToolkit
APPTAINER_COMMAND=exec
APPTAINER_CONTAINER=/home/$USERID/scripts/MetacellAnalysisToolkit/matk_v1.0.sif
APPTAINER_ENVIRONMENT=/.singularity.d/env/91-environment.sh
APPTAINER_NAME=matk_v1.0.sif
HOME=/home/$USERID
LANG=C.UTF-8
LC_ALL=C.UTF-8
LD_LIBRARY_PATH=/.singularity.d/libs
PATH=/opt/conda/envs/MetacellAnalysisToolkit/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/MetacellAnalysisToolkit/cli/
PROMPT_COMMAND=PS1="Apptainer> "; unset PROMPT_COMMAND
PS1=Apptainer>
PWD=/gpfs/ycga/pi/coppola/SamKatz/scripts/MetacellAnalysisToolkit
R_LIBS_USER=/opt/conda/envs/MetacellAnalysisToolkit/lib/R/library
SINGULARITY_BIND=/home/$USERID/scripts/MetacellAnalysisToolkit
SINGULARITY_CONTAINER=/home/$USERID/scripts/MetacellAnalysisToolkit/matk_v1.0.sif
SINGULARITY_ENVIRONMENT=/.singularity.d/env/91-environment.sh
SINGULARITY_NAME=matk_v1.0.sif
TERM=xterm-256color
Best Gianfilippo
Hello,
Based on these outputs, the paths inside the container seem to be correct with --cleanenv. What error do you get when running the MATK command with the --cleanenv option?
singularity run --no-home --cleanenv --bind $(pwd) matk_v1.0.sif MATK -t SuperCell -i data/cd34_multiome_rna.h5ad -o MATK_output/SuperCell/cd34/ -n 50 -f 2000 -k 30 -g 75 -s seurat
or
singularity run --no-home --cleanenv --env R_LIBS_USER=/opt/conda/envs/MetacellAnalysisToolkit/lib/R/library --bind $(pwd) matk_v1.0.sif MATK -t SuperCell -i data/cd34_multiome_rna.h5ad -o MATK_output/SuperCell/cd34/ -n 50 -f 2000 -k 30 -g 75 -s seurat
I had some issues installing your version of Apptainer; I will get back to you once this is solved. Best, Aurélie
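For repeated runs, the isolation flags from the commands above can be wrapped in a small shell function so they are not forgotten. The following is a minimal sketch; run_matk_isolated, MATK_IMAGE and DRY_RUN are hypothetical names introduced for illustration, and the default image name is the one used in this thread.

```shell
# Run MATK in the container with --no-home and --cleanenv so that host
# settings (HOME, R_LIBS_USER, ...) are not passed into the container.
# run_matk_isolated, MATK_IMAGE and DRY_RUN are hypothetical names.
run_matk_isolated() {
  image="${MATK_IMAGE:-matk_v1.0.sif}"
  # Prepend the container invocation to the MATK arguments.
  set -- singularity run --no-home --cleanenv --bind "$(pwd)" "$image" MATK "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage:
# run_matk_isolated -t SuperCell -i data/cd34_multiome_rna.h5ad \
#   -o MATK_output/SuperCell/cd34/ -n 50 -f 2000 -k 30 -g 75 -s seurat
```

Setting DRY_RUN=1 prints the assembled command, which is convenient for checking the flags before submitting a job on a cluster.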
Hi,
sorry about the delay.
I just tried the last two commands and the runs completed without errors.
I did not change anything on the cluster or my account settings. So I do not know why things are working now. I will try with my data and see.
Thanks
Just want to add for people (not sure if this could be causing any of the problems in this thread) that Seurat files should be v3/4 and file name should end in .rds not .RDS (case sensitive, sometimes people use all caps for file suffix). Great tool! It's working really well for me.
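The file suffix can be checked before launching a run. The following is a minimal sketch; check_matk_input is a hypothetical helper name, and the accepted suffixes (.h5ad and lowercase .rds) are an assumption based on the inputs used in this thread and the comment above, not on the MATK source.

```shell
# Check that an input file name ends in a suffix used in this thread.
# check_matk_input is a hypothetical helper; the accepted suffixes are
# an assumption, not taken from the MATK source.
check_matk_input() {
  case "$1" in
    *.h5ad|*.rds)
      return 0 ;;
    *.RDS|*.Rds|*.H5AD)
      echo "rename '$1': the suffix must be lowercase (.rds or .h5ad)" >&2
      return 1 ;;
    *)
      echo "unexpected input suffix: $1" >&2
      return 1 ;;
  esac
}

# Usage:
# check_matk_input get_data/pbmc.rds && echo "suffix looks fine"
```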
Hi,
I tried running on my data and on the example data.
SuperCell seems to work, but if I try SEACells or MetaCell, I get the following error
File "$HOME/bin/MetacellAnalysisToolkit/cli/MetaCell2CL.py", line 312, in
I also get a warning before: WARNING: The R package "reticulate" only fixed recently an issue that caused a segfault when used with rpy2: https://github.com/rstudio/reticulate/pull/1188 Make sure that you use a version of that package that includes the fix.
I did install the latest reticulate, but the error persists
What do you think ?
Hi @gianfilippo,
Thank you for your feedback. If I understand correctly, you also get this error on the example data when using SEACells and MetaCell. Could you please provide us with the command line that led to this error?
Also, it seems that the error occurs when running readRDS(input_file). Could you give us more information on how you built the input file?
Note that for SEACells, we recently fixed an issue that arose when a Seurat object without a PCA embedding was provided as input. If you are in this configuration, please make sure to pull the latest changes from the GitHub repo and, if needed, pull the Docker containers again (container with Seurat V5: agabriel/matk:SeuratV5, and container with Seurat V4: agabriel/matk:SeuratV4).
Best,
Hi,
I tried it again, without making any changes, and it works now. I do get some warnings with MetaCell, but it seems OK. The problem with the test data was the wrong input file. The problem with my own data is unclear, as I did not change anything. I should probably just take a break :)
Thanks again for your input.
Best
Hi,
I tried both the conda and Docker (using singularity) versions. I ran the following:
MATK -t SuperCell -i data/cd34_multiome_rna.h5ad -o MATK_output/SuperCell/cd34/ -n 50 -f 2000 -k 30 -g 75 -s seurat
and
singularity run --bind $(pwd) matk_v1.0.sif MATK -t SuperCell -i data/cd34_multiome_rna.h5ad -o MATK_output/SuperCell/cd34/ -n 50 -f 2000 -k 30 -g 75 -s seurat
and I get the error below. Can you please help?
Thanks
Error in py_module_import(module, convert = convert) :
ModuleNotFoundError: No module named 'anndata'
Run reticulate::py_last_error() for details.
Calls: -> -> py_module_import
Execution halted