Closed: WeilerP closed this issue 10 months ago.
I am also running into a similar issue: cellpose is using in excess of 500GB of RAM on a relatively small image.
Here is the batch script I am using:
```
#!/bin/sh
#SBATCH --qos=generic_qos
#SBATCH --gres=gpu:1
#SBATCH --partition=generic_partition
#SBATCH --account=generic_account
#SBATCH --nodes=1
#SBATCH --time=24:00:00
#SBATCH --ntasks=20
#SBATCH --mem=500G
#SBATCH --output=/path/to/log/cellpose_run.log
#SBATCH --export=ALL

### Run your command
. /path/to/anaconda3/etc/profile.d/conda.sh
conda activate cellpose

dir="/path/to/raw_images/processed"
image_path="/path/to/rotated_cropped_data/image.tif"
save_path="/path/to/trained_model_outputs/256_default"
model_path="/path/to/rotated_cropped_data/256_crops/models/model_name"
model="model_name"

python -m cellpose --dir $dir --pretrained_model $model --savedir $save_path --add_model $model_path --do_3D --no_npy --save_tif --verbose --use_gpu
```
When I am running everything through a Python script, I can successfully run the segmentation with a peak memory usage of less than 150GB.
This is interesting and suggests some kind of memory bug in the CLI version. I'll look into this
Hello, has anyone been able to solve this issue? I am running into the same issues trying to run Cellpose on our Xenium data.
Tagging for visibility: @WeilerP @Myrkgod @mrariden
If you only need the masks, you can try running cellpose through a Python script and saving the mask output of model.eval() with tifffile.imwrite(). That's what I am doing here, though I also have some extra code for tiling the image and running predictions in parallel (I haven't implemented restitching the segmentations together yet). It's essentially the same as the code @WeilerP is using, but with tifffile.imwrite() instead of io.save_masks.
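A minimal sketch of that saving route. The mask array here is a made-up stand-in for model.eval() output, and the tifffile.imwrite call is left commented so the sketch depends only on NumPy; down-casting to the smallest label dtype before writing keeps the TIFF small:

```python
import numpy as np

# Hypothetical label array standing in for the masks returned by model.eval()
masks = np.zeros((12, 1024, 1024), dtype=np.uint32)
masks[0, :10, :10] = 1  # one fake cell

# Cast down to the smallest unsigned dtype that can hold the label count
n_labels = int(masks.max())
dtype = np.uint16 if n_labels < 2**16 else np.uint32
masks_small = masks.astype(dtype)

# tifffile writes the array as-is, with no figure rendering involved:
# import tifffile
# tifffile.imwrite("masks.tif", masks_small)
```

Since labels are plain integers, the only thing lost relative to io.save_masks is the extra outputs (flows, outlines, QC plots) that you said you don't need.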
@mrariden I was also wrong about the discrepancy between CLI and GUI memory use; I'm not sure what caused the lower usage in the GUI at the time. Still, I find it strange that save_masks takes up so much memory when the entire mask should only be a few GB in size. Also, is there any downside to using tifffile to save the masks if they are all I need?
@parkjosh-broadinstitute, @WeilerP For the moment:
I've not been able to identify the cause of the OOM; it might be at the interface with bash/SLURM. My recommendation for now is to run cellpose via a Python script, since that seems to be working in most cases (mine included).
If you really want to use the CLI, then you can manually tile your images with an overlap and save the flows. The scripts I have are not general enough to share, but you can stitch the flows together, divide out the overlapping regions, and run dynamics.compute_masks() on the full-size stitched flows. This gets around the memory error by saving tiles to disk instead of holding them in memory.
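The accumulate-then-divide stitching step can be sketched in plain NumPy (the helper name and tile layout are hypothetical; in practice each tile's flows would be loaded from disk and the stitched result passed on to dynamics.compute_masks()):

```python
import numpy as np

def stitch_tiles(tiles, positions, full_shape, tile_shape):
    """Sum overlapping tiles, then divide out how many tiles covered each pixel."""
    acc = np.zeros(full_shape, dtype=np.float32)
    cnt = np.zeros(full_shape, dtype=np.float32)
    th, tw = tile_shape
    for tile, (y, x) in zip(tiles, positions):
        acc[y:y + th, x:x + tw] += tile
        cnt[y:y + th, x:x + tw] += 1
    return acc / np.maximum(cnt, 1)  # guard against pixels no tile covered

# Two 4x4 tiles overlapping by two columns on a 4x6 canvas:
stitched = stitch_tiles(
    tiles=[np.full((4, 4), 2.0), np.full((4, 4), 4.0)],
    positions=[(0, 0), (0, 2)],
    full_shape=(4, 6),
    tile_shape=(4, 4),
)
```

In the overlap region the two contributions average out, which is the "divide out the overlapping regions" step described above.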
@Myrkgod the .npy file currently holds data for the entire image itself plus the masks, flows, and outlines. In an upcoming PR, we're removing the image data to speed up saving the .npy file to disk. There's no issue with using tifffile to save masks if that solves your problem.
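A quick way to see why dropping the image data from the .npy helps is to serialize a dict with and without the image array; the shapes below are arbitrary stand-ins, and in-memory buffers are used in place of files:

```python
import io
import numpy as np

img = np.zeros((12, 512, 512), dtype=np.float32)  # stand-in image stack
masks = np.zeros(img.shape, dtype=np.uint16)      # stand-in label masks

def npy_bytes(obj):
    """Serialize an object the way np.save would and return the byte count."""
    buf = io.BytesIO()
    np.save(buf, obj, allow_pickle=True)
    return buf.tell()

with_img = npy_bytes({"img": img, "masks": masks})
masks_only = npy_bytes({"masks": masks})
# the float32 image data dominates the file size
```

For a full-resolution stack the image array dwarfs the masks, so removing it from the saved dict cuts both the file size and the time spent writing it.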
@mrariden Sorry, this is unrelated to the original issue, but do you know if this will perform the same as running the model on smaller crops? My image seems to segment better on smaller crops due to a range of cell sizes and brightnesses, so I'd like to split it and run each crop separately.
Edit: I've not tried this, but from my experience with Omnipose I am going to say yes, it will perform the same or similar.
@WeilerP can you check out the solution here?
From my research, the memory issue is related to the matplotlib figure creation, which isn't usually needed as an output. I've removed it and saw much better memory usage and run times. I think this will be how we implement the solution.
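The figure-related growth is easy to reproduce in isolation: Matplotlib keeps every figure (and its rendered buffers) alive until it is explicitly closed, so creating one figure per image in a batch run accumulates memory. This is a minimal demonstration of that behavior, not Cellpose's actual plotting code:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Create one figure per "image", as a per-file QC plot would
for _ in range(3):
    fig, ax = plt.subplots()
    ax.imshow(np.zeros((256, 256)))

n_open = len(plt.get_fignums())  # all three figures are still resident
plt.close("all")                 # closing them releases the buffers
```

Either closing figures after saving or skipping figure creation entirely (as described above) avoids the buildup.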
This should be resolved with the latest merge.
@carsen-stringer, is there a guideline on the expected memory usage? I was following this guide from 10x to process a Xenium sample, but I keep running out of memory when running cellpose with the CLI; so far I have requested up to 800GB of RAM and still hit the limit.
To extract and save the stack level of interest, I am using
which gives me an image of size (12, 10208, 7814). I then run cellpose via the CLI using
Is there something inherently wrong with this pipeline that would explain the large memory usage?
When I am running everything through a Python script, I can successfully run the segmentation with a peak memory usage of less than 150GB.
I am running everything in Python 3.9 with
All packages and versions
``` # Name Version Build Channel _libgcc_mutex 0.1 main _openmp_mutex 5.1 1_gnu anyio 3.7.0 pypi_0 pypi argon2-cffi 21.3.0 pypi_0 pypi argon2-cffi-bindings 21.2.0 pypi_0 pypi arrow 1.2.3 pypi_0 pypi asttokens 2.2.1 pypi_0 pypi async-lru 2.0.2 pypi_0 pypi attrs 23.1.0 pypi_0 pypi babel 2.12.1 pypi_0 pypi backcall 0.2.0 pypi_0 pypi beautifulsoup4 4.12.2 pypi_0 pypi bleach 6.0.0 pypi_0 pypi ca-certificates 2023.05.30 h06a4308_0 cellpose 2.2.2 pypi_0 pypi certifi 2023.5.7 pypi_0 pypi cffi 1.15.1 pypi_0 pypi charset-normalizer 3.1.0 pypi_0 pypi cmake 3.26.4 pypi_0 pypi comm 0.1.3 pypi_0 pypi debugpy 1.6.7 pypi_0 pypi decorator 5.1.1 pypi_0 pypi defusedxml 0.7.1 pypi_0 pypi exceptiongroup 1.1.1 pypi_0 pypi executing 1.2.0 pypi_0 pypi fastjsonschema 2.17.1 pypi_0 pypi fastremap 1.13.5 pypi_0 pypi filelock 3.12.2 pypi_0 pypi fqdn 1.5.1 pypi_0 pypi idna 3.4 pypi_0 pypi imagecodecs 2023.3.16 pypi_0 pypi importlib-metadata 6.7.0 pypi_0 pypi ipykernel 6.23.2 pypi_0 pypi ipython 8.14.0 pypi_0 pypi ipywidgets 8.0.6 pypi_0 pypi isoduration 20.11.0 pypi_0 pypi jedi 0.18.2 pypi_0 pypi jinja2 3.1.2 pypi_0 pypi json5 0.9.14 pypi_0 pypi jsonpointer 2.4 pypi_0 pypi jsonschema 4.17.3 pypi_0 pypi jupyter-client 8.2.0 pypi_0 pypi jupyter-core 5.3.1 pypi_0 pypi jupyter-events 0.6.3 pypi_0 pypi jupyter-lsp 2.2.0 pypi_0 pypi jupyter-server 2.6.0 pypi_0 pypi jupyter-server-terminals 0.4.4 pypi_0 pypi jupyterlab 4.0.2 pypi_0 pypi jupyterlab-pygments 0.2.2 pypi_0 pypi jupyterlab-server 2.23.0 pypi_0 pypi jupyterlab-widgets 3.0.7 pypi_0 pypi ld_impl_linux-64 2.38 h1181459_1 libffi 3.4.4 h6a678d5_0 libgcc-ng 11.2.0 h1234567_1 libgomp 11.2.0 h1234567_1 libstdcxx-ng 11.2.0 h1234567_1 lit 16.0.6 pypi_0 pypi llvmlite 0.40.1rc1 pypi_0 pypi markupsafe 2.1.3 pypi_0 pypi matplotlib-inline 0.1.6 pypi_0 pypi mistune 2.0.5 pypi_0 pypi mpmath 1.3.0 pypi_0 pypi natsort 8.3.1 pypi_0 pypi nbclient 0.8.0 pypi_0 pypi nbconvert 7.5.0 pypi_0 pypi nbformat 5.9.0 pypi_0 pypi ncurses 6.4 h6a678d5_0 nest-asyncio 1.5.6 
pypi_0 pypi networkx 3.1 pypi_0 pypi notebook-shim 0.2.3 pypi_0 pypi numba 0.57.0 pypi_0 pypi numpy 1.24.3 pypi_0 pypi nvidia-cublas-cu11 11.10.3.66 pypi_0 pypi nvidia-cuda-cupti-cu11 11.7.101 pypi_0 pypi nvidia-cuda-nvrtc-cu11 11.7.99 pypi_0 pypi nvidia-cuda-runtime-cu11 11.7.99 pypi_0 pypi nvidia-cudnn-cu11 8.5.0.96 pypi_0 pypi nvidia-cufft-cu11 10.9.0.58 pypi_0 pypi nvidia-curand-cu11 10.2.10.91 pypi_0 pypi nvidia-cusolver-cu11 11.4.0.1 pypi_0 pypi nvidia-cusparse-cu11 11.7.4.91 pypi_0 pypi nvidia-nccl-cu11 2.14.3 pypi_0 pypi nvidia-nvtx-cu11 11.7.91 pypi_0 pypi opencv-python-headless 4.7.0.72 pypi_0 pypi openssl 3.0.8 h7f8727e_0 overrides 7.3.1 pypi_0 pypi packaging 23.1 pypi_0 pypi pandocfilters 1.5.0 pypi_0 pypi parso 0.8.3 pypi_0 pypi pexpect 4.8.0 pypi_0 pypi pickleshare 0.7.5 pypi_0 pypi pip 23.1.2 py39h06a4308_0 platformdirs 3.6.0 pypi_0 pypi prometheus-client 0.17.0 pypi_0 pypi prompt-toolkit 3.0.38 pypi_0 pypi psutil 5.9.5 pypi_0 pypi ptyprocess 0.7.0 pypi_0 pypi pure-eval 0.2.2 pypi_0 pypi pycparser 2.21 pypi_0 pypi pygments 2.15.1 pypi_0 pypi pyrsistent 0.19.3 pypi_0 pypi python 3.9.16 h955ad1f_3 python-dateutil 2.8.2 pypi_0 pypi python-json-logger 2.0.7 pypi_0 pypi pyyaml 6.0 pypi_0 pypi pyzmq 25.1.0 pypi_0 pypi readline 8.2 h5eee18b_0 requests 2.31.0 pypi_0 pypi rfc3339-validator 0.1.4 pypi_0 pypi rfc3986-validator 0.1.1 pypi_0 pypi roifile 2023.5.12 pypi_0 pypi scipy 1.10.1 pypi_0 pypi send2trash 1.8.2 pypi_0 pypi setuptools 67.8.0 py39h06a4308_0 six 1.16.0 pypi_0 pypi sniffio 1.3.0 pypi_0 pypi soupsieve 2.4.1 pypi_0 pypi spalma 0.0.0 pypi_0 pypi sqlite 3.41.2 h5eee18b_0 stack-data 0.6.2 pypi_0 pypi sympy 1.12 pypi_0 pypi terminado 0.17.1 pypi_0 pypi tifffile 2023.4.12 pypi_0 pypi tinycss2 1.2.1 pypi_0 pypi tk 8.6.12 h1ccaba5_0 tomli 2.0.1 pypi_0 pypi torch 2.0.1 pypi_0 pypi tornado 6.3.2 pypi_0 pypi tqdm 4.65.0 pypi_0 pypi traitlets 5.9.0 pypi_0 pypi triton 2.0.0 pypi_0 pypi typing-extensions 4.6.3 pypi_0 pypi tzdata 2023c h04d1e81_0 uri-template 
1.2.0 pypi_0 pypi urllib3 2.0.3 pypi_0 pypi wcwidth 0.2.6 pypi_0 pypi webcolors 1.13 pypi_0 pypi webencodings 0.5.1 pypi_0 pypi websocket-client 1.6.0 pypi_0 pypi wheel 0.38.4 py39h06a4308_0 widgetsnbextension 4.0.7 pypi_0 pypi xz 5.4.2 h5eee18b_0 zipp 3.15.0 pypi_0 pypi zlib 1.2.13 h5eee18b_0 ```

Thanks in advance for your help/input and let me know if you need/want any other information!