Setting up Omnipose with GPU support results in kernel crashes when cellpose_omni is imported

Marco-J-K commented 11 months ago

Hello, when installing omnipose as explained in the readme, I had a problem with importing cellpose_omni. It makes the kernel crash.

In jupyter notebook, when running the cell that imports cellpose_omni or any module from it, it just crashes and immediately the kernel restarts.

In command line python, when importing cellpose_omni or any module from it, the python kernel crashes and displays this error:

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.

OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

I didn't try the "unsafe, unsupported, undocumented workaround", but I think I found the underlying issue.
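Since the error names libiomp5md.dll, one way to look for duplicate OpenMP runtimes is to list every copy of that file inside the environment. This is only a rough sketch: the function name `find_openmp_runtimes` is made up for illustration, and finding more than one copy suggests a conflict but does not prove which packages are clashing.

```python
from pathlib import Path


def find_openmp_runtimes(root, name="libiomp5md.dll"):
    """Return every file under `root` whose name matches `name`, ignoring case.

    `find_openmp_runtimes` is a hypothetical helper, not part of omnipose.
    """
    target = name.lower()
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.name.lower() == target
    )
```

Inside the activated environment, `find_openmp_runtimes(sys.prefix)` (after `import sys`) should list the copies that live under that environment's root.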

This is how I set up omnipose with GPU support on my PC according to the readme:

Set up the conda environment: conda create -n omnipose 'python==3.10.12' pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia
Activate: conda activate omnipose
Install omnipose: pip install omnipose

First of all, the omnipose install reinstalls torch and torchvision without CUDA, so even after fixing the kernel crash, there is no GPU support with this installation procedure.
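A quick way to check whether the installed torch build still has CUDA support after `pip install omnipose` is to query torch directly. This is a minimal sketch: `describe_torch_build` is a hypothetical helper name, while `torch.version.cuda` and `torch.cuda.is_available()` are standard PyTorch attributes.

```python
def describe_torch_build():
    """Summarise the installed PyTorch build, or report that torch is missing.

    Hypothetical helper for illustration; not part of omnipose.
    """
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

If `built_with_cuda` is None, a CPU-only build replaced the CUDA build, which would match the behaviour described above.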

Second, I tracked the kernel crash down to the numpy package. The problem is not the version but the numpy installation itself: uninstalling numpy and reinstalling the same version (1.26.0) fixes it. Importing cellpose_omni then fails with an error that the natsort package is missing, but after installing natsort as well, the cellpose_omni import works and no longer crashes the kernel. Installing natsort before the numpy uninstall and reinstall is not sufficient to fix the issue.
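To see whether a package such as numpy was put in place by conda or by pip, the installer recorded in its metadata can be read with the standard library. This is a sketch under an assumption: it relies on the `INSTALLER` file that pip writes (and that conda packages typically carry as well), which may be absent for some packages. `installed_by` is a hypothetical helper name.

```python
from importlib import metadata


def installed_by(package):
    """Return (version, installer) for `package`, or None if it is not installed.

    `installer` comes from the INSTALLER metadata file and is None when
    that file is missing. Hypothetical helper for illustration.
    """
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    text = dist.read_text("INSTALLER")
    return dist.version, (text.strip() if text else None)


for name in ("numpy", "torch", "torchvision", "natsort"):
    print(name, installed_by(name))
```

A mix of installers for numpy and torch would be in line with the observation that reinstalling numpy changes the behaviour, though it does not by itself explain the crash.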

When I set everything up without GPU support, there is no kernel crash: conda create -n omnipose 'python==3.10.12' pytorch. Adding GPU support afterwards also doesn't cause the kernel crash, and CUDA is installed properly. This is how I did it:

pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118

You might have to uninstall torch and torchvision before installing the CUDA version.

I'm not sure what exactly causes the problem, but it looks like the order of installation matters.

kevinjohncutler commented 9 months ago

Thanks for reporting this, @Marco-J-K. I will look into this soon - likely a Windows-specific issue.

kevinjohncutler commented 9 months ago

So in combination with #65 and #66, there might be multiple dependency issues going on. @tensorcoder, was your issue also on Windows? Something to be really careful about is the version of Python in your conda environment vs. your base. There can be crosstalk between the two, and that can really screw up package versions.
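To check which interpreter and environment are actually in use, something like the following can be run in both the conda environment and base and the output compared. This is only a sketch: `environment_report` is a hypothetical helper, and treating the user site-packages directory and PYTHONPATH as possible sources of crosstalk is an assumption rather than a confirmed cause.

```python
import os
import site
import sys


def environment_report():
    """Collect the interpreter and environment details relevant to crosstalk.

    Hypothetical helper for illustration; not part of omnipose.
    """
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "python": "%d.%d.%d" % sys.version_info[:3],
        # Set by `conda activate`; None outside an activated conda environment.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        # Extra import locations shared across environments (assumed suspects).
        "pythonpath": os.environ.get("PYTHONPATH"),
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site": site.getusersitepackages(),
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

If `prefix` and `conda_prefix` differ, the running interpreter does not belong to the activated environment.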

khuongtran78 commented 8 months ago

@kevinjohncutler I installed omnipose in the base environment and still had the same issue.

kevinjohncutler commented 8 months ago

@khuongtran78 Did you also have an older version of omnipose installed in base at any point?

kevinjohncutler / omnipose

Setting up Omnipose with GPU support results in kernel crashes when cellpose_omni is imported #63