yassinharim commented 2 months ago

I've been using Cellpose 2.2.2 to segment rather large images with tens of thousands of cells. I have now upgraded to Cellpose 3.0.7 to try out the new denoising/deblurring features, but Cellpose 3 does not seem to handle large files in the same way.

I am using a laptop with an Intel Core i7-13700H and an Nvidia RTX 4050 (6 GB) to segment, for example, an image that is about 9000x7000 pixels and contains around 75,000 cells.

In Cellpose 2, if I remember correctly, this would trigger a message on the command line along the lines of "image too large, computing masks to flows on CPU" and switch to the slower CPU processing, but it would eventually complete successfully.
In Cellpose 3, it simply fails with the error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.71 GiB. GPU 0 has a total capacty of 6.00 GiB of which 1.16 GiB is free. Of the allocated memory 3.69 GiB is allocated by PyTorch, and 90.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

If I switch off GPU processing manually, it segments the image successfully using the CPU.
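The error message points to the `PYTORCH_CUDA_ALLOC_CONF` setting. A minimal sketch of how it could be set from Python is shown here; the value `128` is an arbitrary example, not a recommendation, and since only 90.44 MiB is reported as reserved but unallocated, fragmentation may well not be the cause in this case.

```python
import os

# Assumption: this runs before torch initializes CUDA, otherwise it has no effect.
# "max_split_size_mb:128" is an arbitrary illustrative value.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

The same variable can also be set in the shell before launching the Cellpose CLI.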

Is there any way to switch back to the old behaviour in Cellpose 3? For me, this new behaviour is more inconvenient - I would like to use the GPU, if possible, and usually I use Cellpose via CLI to batch-process lots of images. Without the automatic GPU/CPU switching, I'd always need to estimate or manually check if it is small enough to run on the GPU, and then change it on a per-run basis.
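For scripted use, the kind of automatic switching described above could be approximated outside of Cellpose. The following is only a sketch: `is_gpu_oom` and `run_with_cpu_fallback` are hypothetical helper names, not part of Cellpose, and the check relies on the assumption that the CUDA out-of-memory error is a `RuntimeError` whose message contains "out of memory", as in the error quoted above.

```python
def is_gpu_oom(exc):
    """Return True if the exception looks like a CUDA out-of-memory error."""
    # Assumption: the OOM error is a RuntimeError (torch.cuda.OutOfMemoryError
    # subclasses it) and its message contains "out of memory".
    return isinstance(exc, RuntimeError) and "out of memory" in str(exc).lower()


def run_with_cpu_fallback(run_on_gpu, run_on_cpu):
    """Call run_on_gpu(); if it runs out of GPU memory, call run_on_cpu() instead.

    Returns a (result, device) tuple where device is "gpu" or "cpu".
    """
    try:
        return run_on_gpu(), "gpu"
    except RuntimeError as exc:
        if not is_gpu_oom(exc):
            raise  # unrelated errors are not swallowed
        print("GPU out of memory, retrying on CPU")
    # The retry happens outside the except block so the failed attempt's
    # traceback is no longer being handled when the CPU run starts.
    return run_on_cpu(), "cpu"


# Hypothetical usage with Cellpose (untested against 3.0.7):
#   gpu_model = models.CellposeModel(gpu=True, pretrained_model=model_path)
#   cpu_model = models.CellposeModel(gpu=False, pretrained_model=model_path)
#   result, device = run_with_cpu_fallback(
#       lambda: gpu_model.eval(img, channels=[0, 0]),
#       lambda: cpu_model.eval(img, channels=[0, 0]),
#   )
```

This does not restore the Cellpose 2 behaviour inside the CLI; it only applies when Cellpose is driven from a Python script.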

Thanks a lot!

mayishazn commented 1 month ago

Mine is doing the same thing, but for much smaller images (2048x2048). It seems it does not clear memory between segmentations. I don't have a "real" fix, but this is the notebook I'm running for now:

    import gc
    import os
    import re
    import sys
    import time
    from urllib.parse import urlparse

    import matplotlib as mpl
    import matplotlib.pyplot as plt
    import numpy as np
    import skimage.io
    from scipy.io import savemat

    from cellpose import core, denoise, io, models, plot, utils
    from cellpose.io import logger_setup

    mpl.rcParams['figure.dpi'] = 300
    logger_setup()

    diam = 50
    minarea = 200
    filetype = '.tif'
    model = models.CellposeModel(gpu=True, model_type='CPx')
    channels = [0, 0]  # IF YOU HAVE GRAYSCALE

    folders = r'D:\CellImages'

    def list_files(dir):
        """Collect all matching image paths below dir."""
        r = []
        # all_data = list()
        for root, dirs, files in os.walk(dir):
            for name in files:
                if name.endswith(filetype):
                    r0 = os.path.join(root, name)
                    if "20x" in r0 and "C3" in r0:
                        # print(r0)
                        r.append(r0)
                        # all_data.append(skimage.io.imread(r0, as_gray=True))
        return r

    namess = list_files(folders)

    def partitioncellpose(namelist):
        """Segment the images in batches of `pace` and free memory after each batch."""
        pace = 5
        part = int(np.ceil(len(namelist) / pace))
        lastindx = 0
        for index in range(part):
            # Slicing past the end is safe, so the last batch may simply be shorter.
            names = namelist[lastindx:lastindx + pace]
            lastindx = lastindx + pace
            imgs = [skimage.io.imread(impath, as_gray=True) for impath in names]

            # model = denoise.CellposeDenoiseModel(gpu=True, model_type="cyto3", restore_type="denoise_cyto3")
            # masks, flows, styles = model.eval(imgs, diameter=diam, channels=channels, flow_threshold=0.8, cellprob_threshold=-6, do_3D=False, min_size=minarea, resample=True, progress=True)
            # masks, flows, styles, imgs_dn = model.eval(imgs, diameter=diam, channels=channels, flow_threshold=0.8, cellprob_threshold=-6, do_3D=False, min_size=minarea, resample=True)
            # io.masks_flows_to_seg(imgs_dn, masks, flows, names, diam, channels)
            masks, flows, styles = model.eval(imgs, diameter=diam, channels=channels, flow_threshold=0.8, cellprob_threshold=-6, do_3D=False, min_size=minarea, resample=True)
            io.masks_flows_to_seg(imgs, masks, flows, names, diam, channels)
            # masks_flows_to_seg(images, masks, flows, file_names, diams, channels, imgs_restore, restore_type, ratio)

            # Convert each saved _seg.npy file to a .mat file.
            for file in names:
                name = re.sub(r'\.tif$', '_seg.npy', file)
                dat = np.load(name, fix_imports=True, allow_pickle=True).item()
                name = re.sub('npy$', 'mat', name)
                savemat(name, dat)

            # Drop references to this batch before loading the next one.
            del styles
            print(names)
            del names
            del imgs
            gc.collect()
        return 0

    partitioncellpose(namess)

I left the commented stuff intentionally because if you use the denoising+cyto3, it returns the denoised image as well.
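The cleanup between batches could also be made more explicit. The following is a minimal sketch, assuming PyTorch is the backend (as the error message earlier in this thread indicates). `release_memory` is a hypothetical helper name; `gc.collect()` and `torch.cuda.empty_cache()` are standard calls, but whether this actually resolves the memory growth described here is untested.

```python
import gc


def release_memory():
    """Run garbage collection and, if CUDA is in use, release cached GPU blocks.

    Returns True if the CUDA cache was emptied, False otherwise.
    """
    # Collect unreachable Python objects first so any tensors they hold are freed.
    gc.collect()
    try:
        import torch  # imported lazily so the helper also works without torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        # Hands cached, currently unused blocks back to the driver.
        # It cannot free tensors that are still referenced somewhere.
        torch.cuda.empty_cache()
        return True
    return False
```

It would be called once per batch, after the `del` statements, in place of the bare `gc.collect()`.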

mrariden commented 1 month ago

@yassinharim some of the denoising features in CP3 are achieved with a CNN. The weights for the CNN and the actual CPnet and the image data all have to be held in GPU memory while evaluating. So, I'm not surprised that large images exceed the memory capacity. To confirm, if you use the version 3 gui and then only run the cyto3 model do you get memory errors?
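To get a rough feel for the image-data part of that budget, the size of a single dense array can be worked out by hand. This is only an illustration: `array_gib` is a hypothetical helper, float32 storage is an assumption, and it does not model what Cellpose actually allocates (weights, intermediate activations, and flow fields come on top).

```python
def array_gib(shape, bytes_per_element=4):
    """Size in GiB of a dense array with the given shape (float32 by default)."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element / 1024**3


# One float32 plane of the 9000x7000 image mentioned in this thread.
print(f"{array_gib((7000, 9000)):.2f} GiB")
```

A single float32 plane of that image comes to about 0.23 GiB, so a handful of full-resolution intermediate arrays is enough to take up a large share of a 6 GiB card.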

We will look into this issue

yassinharim commented 1 month ago

> @yassinharim some of the denoising features in CP3 are achieved with a CNN. The weights for the CNN and the actual CPnet and the image data all have to be held in GPU memory while evaluating. So, I'm not surprised that large images exceed the memory capacity. To confirm, if you use the version 3 gui and then only run the cyto3 model do you get memory errors?
>
> We will look into this issue

Thanks a lot for your reply @mrariden! Actually I did not test this with any version 3 features like denoising or the cyto3 model - I was just executing Cellpose via CLI to run a model that I trained based on the nuclei model. And the thing that I'm wondering about is that it used to work on version 2 because it would automatically switch to CPU processing - but in version 3, it simply fails and stops.

Since I'm quantifying nuclei, I don't think the cyto3 model would yield good results. Do you still want me to run cyto3 in the version 3 GUI just for troubleshooting issues, or was it just to clarify the scenario where the error occurred?

MouseLand / cellpose

Cellpose 3 runs out of GPU memory where Cellpose 2 didn't #918
