Closed — VasylVaskivskyi closed this issue 3 years ago
The two most important factors I saw for that segmentation model are the zoom of the image and whether or not it was deconvolved. Are your images at 10x or 20x and not something closer to 40x? If so, are you using the deconvolution preprocessing?
Thank you for the quick reply. The images are 20x, and I run deconvolution and drift compensation. Here is the config file that I use:
```yaml
date: '2021-07-26 16:50:13'
environment: {path_formats: keyence_multi_cycle_v01}
name: e3aa11ba0218456e2cc9302f6b1d9d1c
acquisition:
  axial_resolution: 1500.0
  channel_names: [DAPI-01, Blank, Blank, Blank, DAPI-02, CD31, CD8, CD45, DAPI-03,
    CD20, Ki67, CD3e, DAPI-04, Actin, Podoplan, CD68, DAPI-05, PanCK, CD21, CD4, DAPI-06,
    Empty, CD45RO, CD11c, DAPI-07, Empty, E-CAD, CD107a, DAPI-08, Empty, CD44, H3,
    DAPI-09, Blank, Blank, Blank]
  emission_wavelengths: [358, 750, 550, 650]
  lateral_resolution: 377.40384615384613
  magnification: 20
  num_cycles: 9
  num_z_planes: 1
  numerical_aperture: 0.75
  objective_type: air
  per_cycle_channel_names: [CH1, CH2, CH3, CH4]
  region_height: 10
  region_names: [1]
  region_width: 10
  tile_height: 1000
  tile_overlap_x: 200
  tile_overlap_y: 200
  tile_width: 1000
  tiling_mode: grid
analysis:
- aggregate_cytometry_statistics: {mode: best_z_plane}
operator:
- extract:
    channels: [proc_DAPI-02, proc_CD8, proc_CD20,
      proc_Ki67, proc_CD3e, proc_CD21, proc_CD4, proc_CD45RO, proc_CD11c, proc_E-CAD,
      proc_CD107a]
    name: expressions
    z: all
processor:
  args:
    gpus: [0, 1]
    memory_limit: 64G
    run_best_focus: true
    run_crop: false
    run_cytometry: true
    run_deconvolution: true
    run_drift_comp: true
    run_tile_generator: true
  best_focus: {channel: DAPI-02}
  cytometry:
    membrane_channel_name: CD45
    nuclei_channel_name: DAPI-02
    quantification_params: {cell_graph: true, nucleus_intensity: true}
    segmentation_params: {marker_dilation: 3, marker_min_size: 2, memb_gamma: 0.25,
      memb_min_dist: 8, memb_sigma: 5}
    target_shape: [1000, 1000]
  deconvolution: {n_iter: 25, scale_factor: 0.5}
  drift_compensation: {channel: DAPI-02}
  tile_generator: {raw_file_type: keyence_mixed}
```
I'd suggest removing `cytometry.segmentation_params`, and maybe `cytometry.target_shape` as well (that should be OK as-is, but there is no need to keep it around in case you change `run_crop`, in which case the image would be upsampled), and then trying again with `run_deconvolution: false`.
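Concretely, applied to the config above the change would look something like this (a sketch showing only the affected `processor` keys; the removed settings are commented out):

```yaml
processor:
  args:
    run_deconvolution: false   # was: true
  cytometry:
    membrane_channel_name: CD45
    nuclei_channel_name: DAPI-02
    quantification_params: {cell_graph: true, nucleus_intensity: true}
    # segmentation_params removed -> fall back to defaults
    # target_shape removed
```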
Honestly, I was never able to convince myself that deconvolution helped our CODEX quantifications (despite the images being visually much clearer), so I often disabled it. By this I mean I had tried segmenting the original images (which generally works better) while quantifying the deconvolved images separately, but that didn't help either, so I never added that feature to the library.
I tried running with `cytometry.segmentation_params` and `cytometry.target_shape` removed, but it didn't improve the result. However, turning off deconvolution improved results quite significantly: Cytokit segmented three times more nuclei.
Nice! Anything else I can help with then before closing this out?
Yes, are there any other ways to improve the quality of the segmentation results? Cytokit still misses around half of the nuclei in some datasets, even though their borders are clearly visible. Here is a bad region (image quality is low because it is a screenshot from a remote VM):
Hmm, can you share an image at its original resolution?
I can give you a link to GDrive with the original image. I extracted the DAPI channel and the nucleus labels, but together the images still take up 500 MB. [link removed]
I'm not immediately sure what's going on, but I'd certainly expect a lot of those missed nuclei to be getting picked up on good images like that. A few suggestions:
```python
from cytokit.cytometry.cytometer import Cytometer2D

# img_nuc is a 2D nuclear (DAPI) image; the model expects a trailing channel dimension
cytometer = Cytometer2D(img_nuc.shape + (1,), target_shape=(1344, 1344)).initialize()

# Returns objects (as labeled images), predictions from U-Net, and a binarized nuclei image
img_seg, img_pred, img_bin = cytometer.segment(
    img_nuc, return_masks=True, min_size=12, nucleus_dilation=8)
```
That last one would let you try different segmentation parameters interactively. That might help but you'd have to be willing to read the code for it and I don't have an intuition for what's worth attempting since it's been a while.
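For instance, a small helper makes it easy to compare how many nuclei each parameter combination yields. This is only a sketch: `cytometer` and `img_nuc` refer to the snippet above, and the sweep itself is commented out since it requires the model to be loaded.

```python
import numpy as np

def count_objects(labels):
    """Number of labeled objects in a segmentation, excluding background (0)."""
    return int(len(np.unique(labels)) - (0 in labels))

# Hypothetical sweep, assuming `cytometer` and `img_nuc` from the snippet above:
# for min_size in (6, 12, 24):
#     for dilation in (4, 8, 12):
#         img_seg, _, _ = cytometer.segment(
#             img_nuc, return_masks=True,
#             min_size=min_size, nucleus_dilation=dilation)
#         print(min_size, dilation, count_objects(img_seg))
```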
Thank you for your help. I will have a look at your suggestions regarding different U-Net implementations. As for the last option, I think my colleagues and I have already tested all the parameters for Cytokit that are available in the yaml config, and the ones I included in my previous comment were working well for older datasets.
Just came here because of a notification, as unet-nuclei was mentioned. These days I'd probably just use Cellpose or StarDist for nuclei segmentation; they often give very good results out of the box, or otherwise after a tiny bit of retraining. I don't know anything about Cytokit though, so I'm not sure how difficult it would be to integrate StarDist or Cellpose.
We will probably switch to DeepCell. However, it has the same problem as Cellpose: the nucleus and cytoplasm segmentation results do not match. There are lots of instances where a nucleus label is present but a cytoplasm label is not, or vice versa, or the nucleus extends beyond the borders of the cytoplasm. Apart from that, some development work would also be needed to replace Cytokit inside the already established pipeline. That's why we wanted to see if there is anything that can be done to keep things as they are.
GTK @VolkerH -- are you still using either one of those heavily yourself? I'd be curious to hear about your experience in retraining them.
> the nucleus and cytoplasm segmentation results do not match
I don't know what the state of the art is for this, but FWIW I thought the CP algorithm for this (from this old Anne Carpenter paper) was actually pretty solid. It has one regularization parameter to tune, but it's at least intuitive. The centrosome lib it's in is also nice and lightweight, and invocation only requires a labeled nucleus image, an optional (I think) mask, and the cytoplasm image (e.g. cytometer.py#L1008). I imagine it would be easy for you to integrate directly downstream from DeepCell, StarDist, etc.
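The key property of any seeded approach is that cytoplasm labels are grown outward from the nucleus labels, so every cell is anchored to exactly one nucleus by construction. As a rough illustration only (not the actual centrosome implementation, which solves a regularized shortest-path problem weighted by image intensity), here is a minimal numpy/scipy sketch with hypothetical names:

```python
import numpy as np
from scipy import ndimage as ndi

def expand_labels_into_mask(nuc_labels, cyto_mask, n_iter=50):
    """Grow nucleus labels outward into a cytoplasm mask by repeated dilation.

    Crude stand-in for seeded propagation: each masked pixel ends up with
    the label that reaches it first, so every cytoplasm label contains its
    seed nucleus (ties go to the larger label value).
    """
    labels = nuc_labels.copy()
    for _ in range(n_iter):
        grown = ndi.grey_dilation(labels, size=3)      # each pixel takes max of 3x3 neighborhood
        unassigned = (labels == 0) & cyto_mask         # only grow into unlabeled mask pixels
        if not unassigned.any():
            break
        labels[unassigned] = grown[unassigned]
    return labels
```

Unlike this sketch, the centrosome algorithm weights the propagation distance by the cytoplasm image intensities (with that single regularization parameter), which is what makes the resulting boundaries follow the actual cell edges rather than plain Euclidean distance.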
I did use both StarDist and Cellpose quite a bit in my previous role, though often by referring facility users to them and running a few of their sample images through. It often solved their particular issue without tuning, which was very convenient.
In the lab where I currently work, we use Cellpose as part of our workflow, but I don't use it much myself. My experimental colleagues, who had some cells that were not initially segmented well, obtained good results after annotating a few images and retraining. However, these colleagues are mainly interested in segmenting the whole cell using a cytoplasm model, so I cannot really comment on how well it works for segmenting nuclei and their corresponding cells together consistently, which seems to be the problem @VasylVaskivskyi encountered.
In the classical workflows you don't usually have that problem, since the detected nuclei are used as the seed points for finding the cytoplasmic area, so the association between nucleus and cytoplasm is fixed by design (it doesn't work very well for poly-nucleated phenotypes, though).
Thank you very much for the suggestions and discussion.
@eric-czech I will try `centrosome.propagate` on the nucleus segmentation results from DeepCell and Cellpose.
@VolkerH Quite likely we won't do retraining on our data. The datasets come from different tissues, organs, and providers, so it would be too much work to annotate all the combinations, and there is nobody to do it anyway.
I've noticed that Cytokit segmentation performs poorly on some CODEX datasets. Is there a way to change the threshold that determines which nuclei are allowed to pass? Or maybe there are some other parameters that can influence the quality of nucleus segmentation? I could only find options that influence the size of the nuclei masks.