Closed jwindhager closed 2 years ago
This is great, thanks! I'll review it next week.
Hey Team resp Jonas,
First look looks great - thanks a lot for your efforts!
On fist sights two points:
1) According to which rationale are image pre-processing steps put into the conversion step vs the CP pre-processing? Eg the hot pixel filtering was put back into the conversion step, but the 2x image scaling not?
2) If we have now the imcseg helper package: Could it make sense to factor out some of the more complicated cells of the notebook into higher level functions in the imcseg package? Could beneficial for testing and reuse.
Cheers!
On Sat, 12 Feb 2022, 21:46 Nils Eling, @.***> wrote:
This is great, thanks! I'll review it next week.
— Reply to this email directly, view it on GitHub https://github.com/BodenmillerGroup/ImcSegmentationPipeline/pull/85#issuecomment-1037466770, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA4R2KA7ADI3HDBI2FLP6N3U23BJ5ANCNFSM5OHO6O6Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you were mentioned.Message ID: @.***>
Hey @votti,
thanks for checking this out! 1) This comes from trying to address #72. I noticed that CellProfiler v4.1.3 and v4.0.7 doesn't seem to write valid multi-channel .tiff files (see 4496) and CellProfiler v4.2.1 doesn't write multi-channel .tiff files at all (see 4495). By moving the hot pixel filtering out of CellProfiler, we would exclude mutli-channel .tiff generation via CP. I would still leave the scaling steps in the pipelines for readability reasons. 2) I find it ok since it's still readable. But maybe @jwindhager has comments?
Cheers, Nils
@jwindhager I noticed that the slide overview is written out as a _pano.png
file instead of the _slide.png
file when generating the ometiffs
folder. But the same already happened in the previous preprocessing script.
And the export of then _pano.png
files appears quite slow (~1min per file)
Hi @votti, thanks for the quick feedback!
1) According to which rationale are image pre-processing steps put into the conversion step vs the CP pre-processing? Eg the hot pixel filtering was put back into the conversion step, but the 2x image scaling not?
Moving the hot pixel filtering out of CellProfiler was requested by @nilseling, I assume for the reasons above.
2) If we have now the imcseg helper package: Could it make sense to factor out some of the more complicated cells of the notebook into higher level functions in the imcseg package? Could beneficial for testing and reuse.
I really don't have a strong opinion/preference here, but my time resources are limited. If you or anyone else thinks it would be good to have higher-level functions, I'd be happy to take pull requests on this branch.
@jwindhager I noticed that the slide overview is written out as a
_pano.png
file instead of the_slide.png
file when generating theometiffs
folder. But the same already happened in the previous preprocessing script.
IIRC, that's because the slide overview actually IS a panorama :-) I believe this may have changed with Fluidigm software releases.
And the export of then _pano.png files appears quite slow (~1min per file)
Indeed! I think this may be due to the high default PNG compression rate: https://github.com/imageio/imageio/issues/387
Hey @votti, a design question: is the measurement of the pixel probabilities in the last pipeline important? I haven't seen anyone using this information. If not I would suggest to directly write the filtered images into analysis/cpout/images
, the masks directly into analysis/cpout/masks
and import both these folders for the final measurement pipeline. I would make it slightly cleaner but you would loose the probability measurement.
Hey @votti, a design question: is the measurement of the pixel probabilities in the last pipeline important? I haven't seen anyone using this information. If not I would suggest to directly write the filtered images into analysis/cpout/images, the masks directly into analysis/cpout/masks and import both these folders for the final measurement pipeline. I would make it slightly cleaner but you would loose the probability measurement.
Nevermind, I figured out a way to move the probabilities from pipeline 3 to pipeline 2. But do you remember where the discrepancy between the 2 sets of segmentation masks (the one written in pipeline 2 and the one written in pipeline 3) is coming from?
Hi @nilseling
@probabilities export: I think in the base pipeline this can be savely removed.
@discrepancies in masks: If I remember write the reason is that technically the masks written out in the first pipeline can have some missing object ids. This caused issues downstream (eg HistoCAT).
Thus during the loading of the masks in the measuring pipeline, the setting that objects are re-numbered is used. This guarantees a continuous numbering of objects (so if there are n objects, all object ids 1:n are used).
Generally, I always considered it to be "best practice" to store the exact object masks used together with measurements - just because the measurements are only meaningful with the corresponding masks. Matching masks exported from one pipeline with measurements from another pipeline can be error prone in case that users activate renumbering etc when loading the masks. (P.s.: I always thought the best would be to actually store the pixel coordinates of each object as an entry of the measurement table - unfortunately, at the time I have not seen any of the standardized output formats supporting this.)
Cheers!
Hi @votti
thanks for the reply! I'll keep the probabilities for now.
For the segmentation masks: good point with the continuous cell labels! I do however noticed that when relabelling the cells in the ConvertImageToObjects
call, CellProfiler introduces mislabeled pixels. Out of ~2500 objects around 15 objects are newly labelled objects of size 1-2 pixels. I was able to reproduce this issue with CP v4.2.1 and CP v4.0.7. So I would like to avoid relabelling as part of ConvertImageToObjects
using CellProfiler. I believe these are pixels which are not connected to the main object body anymore.
The FilterObjects
module reindexes objects by default. So missing objects can only come from the ResizeObjects
step (I was able to reproduce this behaviour by shrinking cells before resizing). In any case I wonder why the probabilities are not first downscaled before segmentation? This would avoid losing objects during resizing and facilitate nuclei <-> cell mapping.
I have also checked what CP does with missing object IDs and it fails when measuring objects.
I do understand and agree with the approach to export measurements and masks in the same pipeline but the reindexing in the ConvertImageToObjects
module is buggy. I have filed an issue with CellProfiler.
What do you think?
Out of ~2500 objects around 15 objects are newly labelled objects of size 1-2 pixels.
I just saw now that this can be fixed by increasing the Connectivity in the ConvertImageToObjects
module. Not sure which default setting is sensible though.
This PR has been merged into development
in favor of #87
As discussed, this pull request drops the dependency on
imctools
.Furthermore, this pull request includes some changes requested by @nilseling.
Organizational changes:
imcsegpipe
, which largely covers functionality originally implemented inimctools.converters
, but has been implemented from scratch usingreadimc
for reading MCD/TXT filesenvironment.yml
takes care of installing Python, Jupyter Lab / Jupytext, andimcsegpipe
setup.cfg
(i.e., installed from PyPI)Functional changes:
ome
folder now contains a .csv file for each image, indicating channel names ("metals") and labelsimctools.data
, no JSON serialization ofimctools
objects is createdNotes:
float32
; analysis stacks are saved asuint16
create_analysis_stacks
WIP:
documentation
v3
release, updating the README and the changelog