Closed: remyeltorro closed this issue 2 months ago
@remyeltorro can you share a small complete example that illustrates the problem you're having? I'll be better able to help with the actual code at hand
@mrariden thank you very much, I'll try to elaborate:
The idea: pass two or more 'cyto-like' channels to Cellpose, without performing RGB merging or other preprocessing. The info on how to segment the cells is on more than one channel and I want this as the input to the underlying UNET. I want to keep everything else the same.
Say I produced a model with the following CLI:
python -m cellpose --use_gpu --train --dir /home/limozin/Desktop/segment_RICM/annotations/dataset/train/augmented/ --test_dir /home/limozin/Desktop/segment_RICM/annotations/dataset/train/test/ --learning_rate 0.001 --weight_decay 0.0001 --n_epochs 5000 --mask_filter _labelled --verbose --no_norm --batch_size 8 --all_channels
The model detects the number of channels in my images and trains smoothly. Having traced through the source code what the --all_channels flag does, I think it does what I intended.
Then if I want to load the model for prediction, it'll look something like this:
model = CellposeModel(gpu=use_gpu, pretrained_model=model_complete_path, diam_mean=30.0,
                      # model_type=None,
                      )
And I'll predict using:
Y_pred, _, _ = model.eval([f], diameter=diameter,
                          flow_threshold=flow_threshold,
                          normalize=False,
                          channels=None,
                          )
where f is a two-channel image in XYC order (say shape 1004x1002x2). It seems, from the source code, that the only way to activate the "all_channels" model is to pass channels=None, but doing this will trigger the
TypeError: object of type 'NoneType' has no len()
on the line I indicated above.
After removing the if condition, and forcing the UNET-eval function triggered by the eval function above to pass channels=None, it seems to do what I intended, which is to use both of my channels inside the UNET to predict the target mask (and not decomposed as cyto+nucleus).
I don't know if that's clearer; maybe I misunderstood how to do what I intended. Is there another way to achieve this without modifying the source code?
Thanks again,
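For readers following along, the failure can be reproduced outside Cellpose. The sketch below is not the Cellpose source: `count_channels_unsafe` and `count_channels_safe` are hypothetical names, used only to illustrate why calling `len()` on `channels=None` raises this TypeError and how an explicit None check would avoid it.

```python
def count_channels_unsafe(channels):
    # Hypothetical stand-in for a check that assumes `channels` is a list.
    return len(channels)


def count_channels_safe(channels, nchan):
    # Treat None as "use all nchan channels of the image as they are".
    if channels is None:
        return nchan
    return len(channels)


try:
    count_channels_unsafe(None)
except TypeError as err:
    message = str(err)  # the same text as the error reported above

print(message)
print(count_channels_safe(None, nchan=2))    # None falls back to nchan
print(count_channels_safe([0, 0], nchan=2))  # a channel list is measured as before
```

Whether the actual condition in the library should fall back to nchan in this way is an assumption; the point is only that the None case has to be handled before `len()` is called.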
@remyeltorro Based on the CLI command you used, it looks like you're training from the cyto model, because you did not use the --pretrained_model None flag. This must mean that your images have 2 channels, since the --all_channels flag didn't produce an error and the cyto model is 2-channel. I say this because the --all_channels flag appears to be broken, at least for my data with 3 channels. But let me know if I'm off on this.
Still, if you are content with using 2 channels to segment, then CP should be capable of that. The built-in models assume that you have a cellular stain and a nuclei stain, but there isn't any strict dependence on that. Since the image channels get passed into the net together, they could mark arbitrary components of the cell provided they are sufficient for segmentation and you have enough training data. In other words, you can pass 2 channels into CP that both represent the cytoplasm and train on your masks. As CP is written now, you'll have to call one of the channels a 'nuclei' channel, but that is based on the pretrained model nomenclature.
Going to 3+ channels isn't really (well) supported, but it is something that we are discussing implementing/improving.
You are totally right, I did not pay attention to the --pretrained_model parameter in the CLI command. I also revisited the problem with 3-channel images, to be sure the model was not falling back to a default of 1 or 2 channels. The new CLI command is as follows:
python -m cellpose --use_gpu --train --dir /home/limozin/Desktop/MCF7_nuclei_cellpose/dataset/train/augmented/ --test_dir /home/limozin/Desktop/MCF7_nuclei_cellpose/dataset/train/test/ --learning_rate 0.001 --weight_decay 0.0001 --n_epochs 5000 --mask_filter _labelled --verbose --no_norm --batch_size 8 --all_channels --pretrained_model None
The two relevant log messages I got are:
2023-08-09 10:23:35,699 [WARNING] channels is set to None, input must therefore have nchan channels (default is 2)
and right below
2023-08-09 10:23:36,000 [INFO] >>>> training network with 3 channel input <<<<
The model seems to train normally. No error. Here's the full log:
2023-08-09 10:28:30,732 [INFO] WRITING LOG OUTPUT TO /home/limozin/.cellpose/run.log
2023-08-09 10:28:30,732 [INFO]
cellpose version: 2.2.2
platform: linux
python version: 3.11.3
torch version: 2.0.0+cu118
2023-08-09 10:28:31,573 [INFO] ** TORCH CUDA version installed and working. **
2023-08-09 10:28:31,573 [INFO] >>>> using GPU
2023-08-09 10:28:31,950 [INFO] 152 / 152 images in /home/limozin/Desktop/MCF7_nuclei_cellpose/dataset/train/augmented/ folder have labels
2023-08-09 10:28:32,019 [INFO] 25 / 25 images in /home/limozin/Desktop/MCF7_nuclei_cellpose/dataset/train/test/ folder have labels
2023-08-09 10:28:32,019 [INFO] >>>> during training rescaling images to fixed diameter of 30.0 pixels
2023-08-09 10:28:34,645 [INFO] flows precomputed
2023-08-09 10:28:34,789 [INFO] flows precomputed
2023-08-09 10:28:34,822 [WARNING] 3 train images with number of masks less than min_train_masks (5), removing from train set
2023-08-09 10:28:34,822 [WARNING] channels is set to None, input must therefore have nchan channels (default is 2)
2023-08-09 10:28:35,152 [INFO] >>>> median diameter set to = 30
2023-08-09 10:28:35,152 [INFO] >>>> mean of training label mask diameters (saved to model) 42.566
2023-08-09 10:28:35,152 [INFO] >>>> training network with 3 channel input <<<<
2023-08-09 10:28:35,152 [INFO] >>>> LR: 0.00100, batch_size: 8, weight_decay: 0.00010
2023-08-09 10:28:35,152 [INFO] >>>> ntrain = 149, ntest = 25
2023-08-09 10:28:35,152 [INFO] >>>> nimg_per_epoch = 149
2023-08-09 10:28:40,522 [INFO] Epoch 0, Time 5.4s, Loss 2.1554, Loss Test 1.8972, LR 0.0000
2023-08-09 10:28:43,297 [INFO] saving network parameters to /home/limozin/Desktop/MCF7_nuclei_cellpose/dataset/train/augmented/models/cellpose_residual_on_style_on_concatenation_off_augmented_2023_08_09_10_28_34.822385
2023-08-09 10:28:54,703 [INFO] Epoch 5, Time 19.6s, Loss 1.8863, Loss Test 1.3957, LR 0.0006
2023-08-09 10:29:08,875 [INFO] Epoch 10, Time 33.7s, Loss 1.3083, Loss Test 1.0401, LR 0.0010
2023-08-09 10:29:37,171 [INFO] Epoch 20, Time 62.0s, Loss 0.9777, Loss Test 0.8064, LR 0.0010
I'll let this model train and then try the prediction step, where it was breaking earlier.
I applied this three-channel-input model to segment other three-channel images. I could load the model using:
model = CellposeModel(gpu=True, model_type=None, pretrained_model="/home/limozin/Desktop/MCF7_nuclei_cellpose/dataset/train/augmented/models/cellpose_residual_on_style_on_concatenation_off_augmented_2023_08_09_10_28_34.822385",
                      diam_mean=30.0, nchan=3)
It was important to pass nchan=3 in order not to trigger the following error:
size mismatch for downsample.down.res_down_0.proj.1.weight: copying a param with shape torch.Size([32, 3, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 2, 1, 1]).
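The shapes in this error already tell you how many channels the checkpoint expects: a convolution weight is laid out as (out_channels, in_channels, kernel_height, kernel_width), so the second entry is the input channel count. Below is a minimal sketch of reading it off. The helper `infer_nchan` is hypothetical and not part of Cellpose; the parameter name is copied from the error message, and the shape mappings are written out by hand here rather than loaded from a file.

```python
# Parameter name copied from the error message above.
FIRST_CONV_KEY = "downsample.down.res_down_0.proj.1.weight"


def infer_nchan(param_shapes, key=FIRST_CONV_KEY):
    """Return the input channel count from a {name: shape} mapping.

    A convolution weight is laid out as (out_channels, in_channels, kH, kW),
    so the second entry is the number of image channels the network expects.
    """
    shape = tuple(param_shapes[key])
    if len(shape) != 4:
        raise ValueError(f"expected a 4-D conv weight, got shape {shape}")
    return shape[1]


# Shapes as reported in the error: checkpoint vs. freshly built default model.
checkpoint_shapes = {FIRST_CONV_KEY: (32, 3, 1, 1)}
default_shapes = {FIRST_CONV_KEY: (32, 2, 1, 1)}

print(infer_nchan(checkpoint_shapes))  # 3
print(infer_nchan(default_shapes))     # 2
```

With PyTorch, such a mapping could presumably be built from a loaded state dict as `{name: tuple(t.shape) for name, t in state_dict.items()}`; that the saved model file is a plain state dict is an assumption here.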
Then I loaded and applied the model to some images that are 3-channel, and normalized:
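from glob import glob
import numpy as np
# imread: the author's TIFF reader (its import is not shown in the thread)

files = glob('/home/limozin/Desktop/MCF7_nuclei_cellpose/dataset/train/test/*.tif')
for f in files:
    # skip the flow files and label masks that sit next to the images
    if not f.endswith('_flows.tif') and not f.endswith('_labelled.tif'):
        stack = imread(f)
        stack = np.moveaxis(stack, 0, -1)  # channels-first (C, Y, X) -> channels-last (Y, X, C)
        stack = stack[:, :, [0, 1, 2]]
        print(stack.shape)  # (512, 512, 3)
        Y_pred, _, _ = model.eval([stack], diameter=model.diam_labels, flow_threshold=0.4, channels=None, normalize=False)
        Y_pred = Y_pred[0]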
I experimented with removing channels from the input data to see if the model can still take it: it breaks for 2-channel data and returns empty labels for 1-channel data. So it seems to do what was intended. Of course, all of this was done after removing the if-condition at L504 mentioned in my first comment.
If I put back the original line 504, I can still load the 3-channel model, but I can't predict: the eval method breaks at line 504 with "object of type 'NoneType' has no len()", i.e. we're trying to evaluate the length of channels=None. So it all seems to come down to this little if-condition.
I suppose the 3+ channel mode would almost work if it weren't for the if statement on channels. Thanks again for investigating this question,
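Since a 1-channel input returned empty labels instead of raising an error, it may be worth checking the channel count before calling eval. The following is a minimal sketch under the assumption that images are channels-last (Y, X, C), as in the snippet above; `check_channel_count` is a hypothetical helper, not a Cellpose function.

```python
def check_channel_count(shape, nchan):
    """Raise if an image of the given shape is not (Y, X, nchan)."""
    if len(shape) != 3:
        raise ValueError(f"expected a 3-D (Y, X, C) image, got shape {shape}")
    if shape[-1] != nchan:
        raise ValueError(
            f"image has {shape[-1]} channels, but the model expects {nchan}"
        )


check_channel_count((512, 512, 3), nchan=3)  # passes silently

try:
    check_channel_count((512, 512, 2), nchan=3)
except ValueError as err:
    mismatch = str(err)

print(mismatch)
```

In the loop above this would be called as `check_channel_count(stack.shape, nchan=3)` just before `model.eval`.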
There appears to be a bug in the channel detection logic when using models whose channel count differs from the pretrained models, as you described. It does seem to work for me if you pass in a single image rather than a list of images, i.e.
import numpy as np
from cellpose.models import CellposeModel
img_chan = 5
img = np.random.random((256, 256, img_chan))
print(img.shape)
model = CellposeModel(gpu=True, model_type=None, pretrained_model=None, diam_mean=30.0, nchan=img_chan)
masks, flowp, style = model.eval(img, diameter = 30, flow_threshold=0.4, normalize=False, channels=None)
I've checked this with many different channel counts. Hopefully this is a workaround for you.
Passing a single image instead of a list does indeed work, without triggering the L504 error. Thank you for helping me narrow down the combination of parameters that enables the 3+ multichannel feature.
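For a folder of images, the workaround amounts to calling eval once per image instead of once on a list. Here is a minimal sketch: `eval_one_by_one` is a hypothetical helper, and `DummyModel` is only a stand-in that mimics the three-value return of the eval call above so that the loop can be run without Cellpose installed.

```python
def eval_one_by_one(model, images, **eval_kwargs):
    """Call model.eval once per image and collect the masks."""
    all_masks = []
    for img in images:
        # Each call receives a single array, never a list.
        masks, flows, styles = model.eval(img, **eval_kwargs)
        all_masks.append(masks)
    return all_masks


class DummyModel:
    """Stand-in with the same three-value return as the eval call above."""

    def __init__(self):
        self.calls = 0

    def eval(self, img, **kwargs):
        self.calls += 1
        return f"masks-{self.calls}", None, None


dummy = DummyModel()
result = eval_one_by_one(dummy, ["img_a", "img_b"], channels=None, normalize=False)
print(result)
```

With a real model, the call would be `eval_one_by_one(model, stacks, diameter=30, flow_threshold=0.4, normalize=False, channels=None)`, using the same keyword arguments as in the example above.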
Hi @carsen-stringer, any update on this? It is currently broken on the CLI.
Using the --all_channels flag for training a custom model works, but specifying --all_channels when running the custom model from the CLI results in the same error. I saw you set this up properly for training here:
https://github.com/MouseLand/cellpose/blob/40a0c7d945b0cccc05c963cc8c349d9252bc0334/cellpose/__main__.py#L260
But it is not set here, even though CellposeModel gives you access to nchan:
https://github.com/MouseLand/cellpose/blob/40a0c7d945b0cccc05c963cc8c349d9252bc0334/cellpose/__main__.py#L157
So would it not be possible to set it here? https://github.com/MouseLand/cellpose/blob/40a0c7d945b0cccc05c963cc8c349d9252bc0334/cellpose/__main__.py#L163-L164
I can offer a PR if you wish
Considering that the all channels option is already available, this is a #bug and not a feature. @carsen-stringer it would be nice to get your input on this.
Thanks @lacan, I think it's been fixed, as long as I use images with the same number of channels as the model. For example, I have RGB images in this folder and I'm running a three-channel model:
python -m cellpose --dir images/ --all_channels --pretrained_model neurips_cellpose_default --verbose
Hello,
First of all, I'd like to thank you for the great work.
I'm highly interested in multimodality and in information spread across more than one channel. In that spirit, I wanted to use the --all_channels option to train a custom Cellpose model on all the channels I provide, without restricting myself to one cyto channel (and one optional nucleus channel). I passed the --all_channels flag to the CLI and everything seemed to work fine.
When I wanted to predict using this new model, however, I got stuck on an unfulfilled condition at line 504 of the cellpose/models.py file, in the eval function of CellposeModel. From what I read of the source code, I can do inference with such a model if I pass model_type=None when loading the new model and channels=None when calling the eval function.
I am not allowed to pass channels=None due to this condition, I get "TypeError: object of type 'NoneType' has no len()":
I manually erased the if condition and just passed "channels" and it seems to work. Do you think it would be possible to rework this line in a way that we can actually use Cellpose on any channels without modifying the source code?
Thank you very much for your attention,