MouseLand / cellpose

a generalist algorithm for cellular segmentation with human-in-the-loop capabilities
https://www.cellpose.org/
BSD 3-Clause "New" or "Revised" License

Details on the application of this segmentation tool #1

Closed YubinXie closed 4 years ago

YubinXie commented 4 years ago

Hi CellPose team,

Thank you for your amazing tool! I found it very promising. I have some questions about the application of this tool in practice.

  1. What kind of images work best for the pre-trained model?

     1.1 Most of the images shown in the paper are either nuclei images or RGB images with one cytoplasm and one nucleus channel. Does this tool work on multiplexed images?

     1.2 There is a 'diameter' parameter on the website, and all the images provided there seem to contain cells of similar type. Will this tool work on images with very different cell types of different sizes?

     1.3 Most of the images shown in the paper are in vitro. Have you tested on in vivo images?

  2. If we train our own model with new data, how will the answers for the previous questions be different?

I am very excited to see promising tools for cell segmentation and curious about these questions for a better understanding of the tool. Thank you so much!

marius10p commented 4 years ago

1.1 There are two models you can switch between, nuclei and cells. 'Cells' always needs a cytoplasm channel, and it can have an optional nuclear channel. You would probably need to average or max-project your multiplexed images.
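A minimal sketch of the averaging or max-projection suggested above, assuming the multiplexed image is a NumPy array shaped (channels, height, width). The function name `project_channels` and the synthetic data are illustrative and not part of Cellpose.

```python
import numpy as np

def project_channels(stack, channel_axis=0, mode="max"):
    """Collapse a multiplexed stack into one 2-D image along channel_axis."""
    stack = np.asarray(stack, dtype=np.float32)
    if mode == "max":
        return stack.max(axis=channel_axis)
    if mode == "mean":
        return stack.mean(axis=channel_axis)
    raise ValueError(f"unknown mode: {mode!r}")

# Synthetic stand-in for three membrane/cytoplasm markers (assumption).
rng = np.random.default_rng(0)
markers = rng.random((3, 64, 64), dtype=np.float32)

# The projected image could then serve as the single cytoplasm channel.
cyto = project_channels(markers, mode="max")
```

Which markers to combine is a judgment call: projecting only the markers that stain cytoplasm or membrane is likely to give a cleaner cytoplasm channel than projecting everything.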

1.2 What cell types do you have in mind? Look at the figures in the paper for a representative set of images. It is important to set the average cell size approximately right, but then it will still find cells of different sizes around that average.
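One way to reason about why the diameter only needs to be approximately right: assuming the image is rescaled so that the diameter you set maps to a fixed size the model was trained on, every cell is scaled by the same factor. The sketch below is an illustration under that assumption; the value of 30 pixels and the function names are assumptions, not taken from this thread.

```python
import numpy as np

# Assumed diameter (in pixels) that cells have after rescaling.
TRAINING_DIAMETER = 30.0

def rescale_factor(diameter, training_diameter=TRAINING_DIAMETER):
    """Factor by which the image would be resized for a given diameter setting."""
    if diameter <= 0:
        raise ValueError("diameter must be positive")
    return training_diameter / diameter

def apparent_diameters(true_diameters, diameter_setting):
    """Cell sizes as the model would see them after rescaling."""
    return np.asarray(true_diameters, dtype=float) * rescale_factor(diameter_setting)

# Two populations of 10 px and 30 px cells with the setting at their midpoint.
seen = apparent_diameters([10, 30], diameter_setting=20)
print(seen)  # [15. 45.]
```

Under this assumption, a setting between the two populations leaves both within a factor of two of the assumed training size, whereas a setting matched to one population pushes the other further away from it.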

1.3 We just took any images we could find, but you're welcome to contribute more images to the training dataset. The main in vivo images are neurons from calcium imaging experiments.

  2. You can in principle train with as many image input channels as you want, but then it will only work for images with exactly those channels. If all your images are from the same type of experiment, then the average cell size would be fixed and you don't need any resizing. We will release the notebooks for the full training pipeline, but you might have to wait a bit for that. Right now our priority is to develop the generalist model that can segment any type of image.

YubinXie commented 4 years ago

Thank you for your detailed reply.

So for tumor cells, the average diameter can be 30 pixels, while for immune cells it can be 10 pixels. When you mention that it will still find cells of different sizes around that average, will it work in this case?

marius10p commented 4 years ago

I think so, but I can't really say without seeing an image. You'd have to try it, it's real easy if you just use the website (www.cellpose.org).

YubinXie commented 4 years ago

I tried it, and it misses small cells when it finds big cells. Can this be fixed with new training samples, or is it inherent to the method?

marius10p commented 4 years ago

Can you show an example please? It's really hard to judge without seeing the results.

YubinXie commented 4 years ago

So with d=10 and d=20, we can see very different results. It seems to be very sensitive to this parameter. That is less likely to be a problem for in vitro images with one cell type, but for in vivo tissue it can be. Of course, the model is already amazing. This is about the details.

[Images: segmentation results at d=10 and d=20]

marius10p commented 4 years ago

Thanks for posting these results. To my eye, this image looks relatively ambiguous. Do you think you can confidently tell what the correct segmentation should have been? If so, it would be a good idea to contribute a manual segmentation of this to our training dataset via the GUI upload (see instructions in the readme).

The main difference I see between the segmentations is that some ROIs get split or shrink a little with the smaller diameter, so if you prefer one segmentation over the other you can just choose the best d for you.

YubinXie commented 4 years ago

The big d finds all the big cells while grouping two small cells into one. The small d misses big cells.

My main concern is that the parameter here seems to have a very strong influence and I am not sure if this can be improved by more training data.

marius10p commented 4 years ago

d is not a hyperparameter. It is meant to be set as the average cell diameter, whatever that might be. You can either use the automatic calibration, or input it yourself.

It looks to me from your image that mostly the same cells are missed in both cases. This is simply because some cells are very out of focus which makes it impossible to segment them. The algorithm is automatically excluding them for you. In fact, I cannot tell in most cases where the cell boundaries are, can you? Have you tried d=15?
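For the option of inputting the average diameter yourself, one possible way to get a number is to measure it from a small manual annotation. The sketch below assumes an integer label mask (0 = background, 1..N = cells) and uses the diameter of a circle with the same area as each cell; the function name `estimate_diameter` is illustrative, and this is not necessarily how the tool's automatic calibration works.

```python
import numpy as np

def estimate_diameter(masks):
    """Median equivalent-circle diameter (pixels) over all labeled cells."""
    masks = np.asarray(masks)
    # Pixel count per label; index 0 is background and is skipped.
    counts = np.bincount(masks.ravel())[1:]
    areas = counts[counts > 0]
    if areas.size == 0:
        return 0.0
    # A circle of area A has diameter 2 * sqrt(A / pi).
    return float(np.median(2.0 * np.sqrt(areas / np.pi)))

# Tiny synthetic mask with two 4x4 "cells" (illustrative data only).
demo = np.zeros((12, 12), dtype=int)
demo[1:5, 1:5] = 1
demo[6:10, 6:10] = 2
d_est = estimate_diameter(demo)
```

Using the median rather than the mean keeps a few very large or merged annotations from dominating the estimate, which matters for tissue with mixed cell sizes.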

YubinXie commented 4 years ago

All diameters have their own advantages and disadvantages, but I agree that the data are not perfect. I think the model is good enough. Thanks!