BodenmillerGroup / steinbock

A toolkit for processing multiplexed tissue images
https://bodenmillergroup.github.io/steinbock
MIT License

Segmentation fault in python package #192

Open ynanli opened 1 year ago

ynanli commented 1 year ago

Hi,

I am trying to use steinbock on an HPC cluster, where it is installed in a Python 3.8 conda environment. steinbock and its dependencies were installed through pip using the provided requirements.txt: https://github.com/BodenmillerGroup/steinbock/blob/main/requirements.txt

However, I am stuck at the segmentation step: no mask images are generated.

(steinbock) [rec82ces@hilbert214 rec82ces]$ steinbock segment deepcell --minmax
2023-05-27 11:52:55.095298: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/rec82ces/.local/lib/python3.8/site-packages/cv2/../../lib64:/software/conda/3//lib:/lib64:/usr/lib64:/usr/local/lib64:/usr/X11R6/lib64
2023-05-27 11:52:55.095337: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Segmentation fault
(steinbock) [rec82ces@hilbert214 rec82ces]$ ls
images.csv  img  masks  panel.csv  raw
(steinbock) [rec82ces@hilbert214 rec82ces]$ ls masks
(steinbock) [rec82ces@hilbert214 rec82ces]$ ls img
Fibro_SSc194994_20220207_Aleix_001.tiff

May I ask what might be the problem? Many thanks!!

Milad4849 commented 1 year ago

Hi ynanli,

Your error is related to TensorFlow and the configuration of GPU computation on your HPC cluster (see here). To confirm this, you can try segmentation via ilastik/CellProfiler or Cellpose in steinbock (see here); these do not require TensorFlow and should run without issue.

ynanli commented 1 year ago

Hi @Milad4849 ,

Thanks a lot for the suggestion. We have now tried to resolve the error by running on a GPU node and loading the CUDA packages. The library-loading errors no longer appear. However, the deepcell segmentation still does not run.

(steinbock) [rec82ces@hilbert300 rec82ces]$ steinbock --version
steinbock, version 0.16.1
(steinbock) [rec82ces@hilbert300 rec82ces]$ steinbock segment deepcell --minmax
Segmentation fault (core dumped)

The ilastik/CellProfiler route seems to work, as it generated a CellProfiler pipeline file, but I have not yet tried the segmentation itself.

Cellpose does not work. As soon as cellpose is installed, steinbock stops working entirely.

  Successfully installed cellpose-2.2.2
(steinbock) [rec82ces@hilbert300 rec82ces]$ steinbock segment deepcell 
Segmentation fault (core dumped)
(steinbock) [rec82ces@hilbert300 rec82ces]$ steinbock --version
Segmentation fault (core dumped)
(steinbock) [rec82ces@hilbert300 rec82ces]$ pip uninstall cellpose
  Successfully uninstalled cellpose-2.2.2
(steinbock) [rec82ces@hilbert300 rec82ces]$ steinbock --version
steinbock, version 0.16.0

Apart from segmentation, the other steinbock functions seem to work well, e.g., measuring intensities/neighbors and exporting CSV.
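Since even `steinbock --version` crashes once cellpose is installed, the fault probably occurs while some package is being imported. A minimal sketch for narrowing this down is to import each candidate in a fresh interpreter and record how that process exits. The helper name `probe_import` and the list of module names in the usage note are assumptions for illustration, not part of steinbock.

```python
import subprocess
import sys


def probe_import(module_name, timeout=300):
    """Import `module_name` in a separate interpreter and report how it exited."""
    result = subprocess.run(
        # -X faulthandler makes Python dump a traceback to stderr on a crash
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode == 0:
        return "ok"
    if result.returncode < 0:
        # On POSIX, a negative return code is the signal number (11 is SIGSEGV)
        return f"killed by signal {-result.returncode}"
    return f"failed with exit code {result.returncode}"
```

For example, `for name in ("numpy", "cv2", "torch", "tensorflow", "deepcell", "cellpose", "steinbock"): print(name, probe_import(name))` would show which import is the first to be killed by a signal. Because each import runs in its own process, a crash in one does not stop the others from being checked.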

jwindhager commented 1 year ago

Hi @ynanli,

I assume you are installing steinbock as a Python package, instead of using the steinbock Docker container. If you do so, you need to make sure that deepcell and tensorflow packages (and potentially the GPU driver/CUDA library versions and the GPU) are compatible.

It is likely that the original error you observed was because your tensorflow package was linked against CUDA (typical on cluster environments with GPU support), but you didn't load the CUDA module on the machine. This you correctly resolved by loading the CUDA module.
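Whether the CUDA runtime is visible to the dynamic loader can be checked without importing tensorflow at all. The following is a minimal sketch using only the standard library; the helper name `can_load` is hypothetical, and the library name in the usage note is the one from the warning in the first log.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is not found on the loader's search path
        return False
    return True
```

Running `can_load("libcudart.so.11.0")` before and after loading the CUDA module should flip from `False` to `True` if the module is what provides the library. A `True` result only shows that the file can be opened, not that its version is compatible with the installed tensorflow.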

The current error likely appears because of package version incompatibilities in your environment. Could you please let us know what versions of tensorflow (pip list) and CUDA are installed/loaded, and what GPU model you are using?
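As an alternative to reading through `pip list`, the relevant versions can be collected in one go. This is a sketch using `importlib.metadata` (available from Python 3.8); the helper name `installed_versions` and the distribution names in the usage note are assumptions and may need adjusting.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

For example, `installed_versions(["steinbock", "deepcell", "tensorflow", "cellpose", "torch", "numpy"])` returns a dictionary that can be pasted into the issue. It reads package metadata only and does not import the packages, so it should work even when an import would crash.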

Another way this might go wrong is that by loading the CUDA module on your cluster, you implicitly load some tensorflow module "over" the tensorflow package installed in your environment, causing incompatibilities.
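To see which copy of a package Python would actually pick up, the import location can be resolved without executing the package. This is a minimal sketch; the helper name `module_origin` is hypothetical.

```python
import importlib.util


def module_origin(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin
```

Comparing `module_origin("tensorflow")` with `sys.prefix` before and after `module load` would show whether the path moves out of the conda environment. The `LD_LIBRARY_PATH` in the first log contains a `~/.local/lib/python3.8/site-packages` entry, so it may also be worth checking whether any package resolves to the user site directory rather than the environment.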

It may be a good idea to involve your system administrator at this point. Alternatively, you could use the steinbock Docker container (GPU-enabled or not), or try running steinbock with Singularity (if your cluster does not support Docker; this is undocumented and untested, but should work).

ynanli commented 1 year ago

Hi @jwindhager,

Thanks for the information. Here are the package versions:

(steinbock) [rec82ces@hpc-storage-14k-1 rec82ces]$ pip list | grep tensorflow
tensorflow                    2.8.4
tensorflow-addons             0.16.1
tensorflow-estimator          2.8.0
tensorflow-io-gcs-filesystem  0.32.0
[rec82ces@hpc-storage-14k-1 rec82ces]$ module load CUDA/11.4.3

  CUDA Toolkit 11.4.3

The GPU model is an Nvidia GTX 1080 Ti. Together with our system administrator, I think we may have pinpointed the problem. Due to our firewall policy, the cluster is not connected to the internet, so we cannot download the DeepCell model from the Amazon cloud. However, I heard it might be possible to download the file manually. Do you, by any chance, have any experience with this?

You guessed right: Docker is not supported on our cluster. Thanks for the Singularity suggestion. Do you think it would work to install from the Singularity file and run it without internet access?

Thanks a lot!!

jwindhager commented 1 year ago

I don't think that this would explain the segfault, but you will indeed need to download the model locally if you don't have an internet connection. You can have a look at how this is solved in the steinbock Docker container (which ships with a copy of the model): https://github.com/BodenmillerGroup/steinbock/blob/bc10207a5a690254bb4e65089c06004fe3f19b05/Dockerfile#L167-L168

Then, if you use the steinbock Python package on the command line, you can specify the --modeldir parameter of the steinbock segment deepcell command (defaults to /opt/keras/models, matching above download instructions).
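On a cluster without internet access, one possible workflow is to fetch the model archive on another machine (from the location used in the Dockerfile lines linked above), copy it over, and unpack it into a directory of your choice. The sketch below assumes the model is distributed as a tar archive; the helper name `unpack_model` and all paths are hypothetical.

```python
import tarfile
from pathlib import Path


def unpack_model(archive_path, model_dir):
    """Extract a model archive into `model_dir` and return the top-level entries."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression; only unpack archives you trust
    with tarfile.open(archive_path, "r:*") as archive:
        archive.extractall(model_dir)
    return sorted(entry.name for entry in model_dir.iterdir())
```

After something like `unpack_model("model.tar.gz", "/path/to/models")`, the segmentation could then be pointed at that directory with `steinbock segment deepcell --minmax --modeldir /path/to/models`. The returned list makes it easy to confirm that the directory contains what steinbock expects to find there.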

If you use the steinbock Python package from within a Python script/notebook, you can use the optional model argument to specify the Keras model instance, for example: https://github.com/BodenmillerGroup/steinbock/blob/bc10207a5a690254bb4e65089c06004fe3f19b05/steinbock/segmentation/_cli/deepcell.py#L148-L158

In theory, using steinbock with Singularity should work, but is untested (https://github.com/BodenmillerGroup/steinbock/issues/159). Maybe the current maintainer of steinbock, @Milad4849, can comment on whether this will be tested anytime in the foreseeable future? Otherwise, if you are willing to give this a try, it would be very helpful to know if this works for you!

Milad4849 commented 8 months ago

Closing due to inactivity