drivendataorg / concept-to-clinic

ALCF Concept to Clinic Challenge
https://concepttoclinic.drivendata.org/
MIT License

Lung Segmentation #120

vessemer closed 7 years ago

vessemer commented 7 years ago

Prior to stages #1, #2, and #3, CT scans must be preprocessed. Preprocessing involves steps such as clipping by value, rescaling, and magnitude normalization. That many of the top solutions include these steps in their pipelines highlights their importance.
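As a rough illustration of the clipping and rescaling steps mentioned above, here is a minimal sketch. It assumes the volume is already expressed in Hounsfield units; the function name `normalize_hu` and the window of -1000 to 400 HU are assumptions made for this example, not values taken from the project.

```python
import numpy as np


def normalize_hu(volume_hu, hu_min=-1000.0, hu_max=400.0):
    """Clip a CT volume to [hu_min, hu_max] and rescale it linearly to [0, 1].

    hu_min and hu_max are assumed window bounds chosen for illustration.
    """
    # Clip by value: everything outside the window is saturated.
    clipped = np.clip(np.asarray(volume_hu, dtype=np.float32), hu_min, hu_max)
    # Rescale: map the window linearly onto the unit interval.
    return (clipped - hu_min) / (hu_max - hu_min)


def zero_center(volume):
    """Magnitude normalization in its simplest form: subtract the mean."""
    return volume - volume.mean()
```

A call such as `zero_center(normalize_hu(scan))` would chain the steps. Whether the mean should be computed per scan or over the whole dataset is a design choice this sketch leaves open.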

There are various approaches to lung segmentation, e.g. via Hounsfield values, affine/rigid/non-rigid image registration, the watershed algorithm, etc. The tradeoff among these approaches usually comes down to the computing resources consumed versus the quality of the segmentation.
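To make the cheapest of these options concrete, here is a minimal sketch of a Hounsfield threshold-based segmentation. The function names and the -320 HU threshold are assumptions for illustration only. The sketch marks air-like voxels, then discards those connected to the array border (the air surrounding the patient), leaving enclosed air-like regions as lung candidates. The flood fill is written in plain NumPy to stay self-contained; on real volumes a connected-component routine such as `scipy.ndimage.label` would probably be much faster.

```python
import numpy as np


def border_connected(mask):
    """Return the part of a boolean mask reachable from the array border
    through face-adjacent True elements (simple iterative flood fill)."""
    reached = np.zeros_like(mask, dtype=bool)
    # Seed the fill with every element lying on a face of the array.
    for axis in range(mask.ndim):
        for edge in (0, -1):
            index = [slice(None)] * mask.ndim
            index[axis] = edge
            reached[tuple(index)] = True
    reached &= mask
    while True:
        grown = reached.copy()
        for axis in range(mask.ndim):
            lo = [slice(None)] * mask.ndim
            hi = [slice(None)] * mask.ndim
            lo[axis] = slice(None, -1)
            hi[axis] = slice(1, None)
            lo, hi = tuple(lo), tuple(hi)
            # Spread by one step in both directions along this axis.
            grown[hi] |= reached[lo]
            grown[lo] |= reached[hi]
        grown &= mask  # never leave the air-like region
        if np.array_equal(grown, reached):
            return reached
        reached = grown


def segment_lungs_by_threshold(image_hu, threshold=-320.0):
    """Rough lung-candidate mask: air-like elements not connected to the border.

    The threshold is an assumed value, not one taken from the project.
    """
    air_like = np.asarray(image_hu) < threshold
    return air_like & ~border_connected(air_like)
```

This sketch ignores the trachea, scans whose first slice cuts through the lungs, and nodules attached to the lung wall, which is where the costlier approaches tend to earn their keep.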

Expected Behavior

The adopted or newly created lung segmentation algorithm should live in src/preprocess/lungs_segmentation.py. It should also be integrated into the classes src.preprocess.preprocess_ct.PreprocessCT and src.preprocess.preprocess_ct.Params.

Current Behavior

For now PreprocessDicom doesn't contain any lung segmentation algorithm.

Acceptance criteria

reubano commented 7 years ago

Thanks for this @vessemer. The links you provided were very informative! I learned quite a bit just glancing through them. My initial thought is: how much of this should be handled by #3 vs. common functions that would be shared across #1, #2, and #3?

vessemer commented 7 years ago

@reubano, I guess you're right, it'll be convenient to place the lung segmentation method inside the segment directory. For now, the project is structured so that every algorithm's subdirectory contains trained_model.py. Perhaps it's worth restructuring the segment part so that it contains lungs, nodules, or something similar, since all stages performed prior to classification may roughly be counted as preprocessing.

reubano commented 7 years ago

Perhaps it's worth restructuring the segment part so that it contains lungs, nodules, or something similar.

Well, we can add appropriately named functions, e.g., lung_affine_segmentation (or whatever would be appropriate). If the trained_model.py file gets too large, we can always split it into two files later on.

WGierke commented 7 years ago

@reubano Why has this been closed? Should I discard PR #133 now?

reubano commented 7 years ago

@WGierke hmm, it looked to me like #132 addressed it. Was I mistaken?

WGierke commented 7 years ago

@reubano To me it looks like PR #132 is not directly connected to this issue, as that PR is more about distorting input data to artificially generate more training data. The sentence that references this issue ("The aforementioned pipeline mainly consists of Hounsfield scaling, lungs segmentation (#120), CT re-orientation and a batch of trivial, yet resource consumptive spatial operations: zoom, rotation, shear, shift, flip and combinations of them. The convenient solution for the latter process is to build a generator which will yield processed patches via affine transformations.") sounds to me like the author simply wanted to describe the context in which the data generator fits. Judging from the code, I don't see any way to segment lungs, neither in src/preprocess/lungs_segmentation.py (which doesn't exist in the PR) nor anywhere else. Or am I overlooking something, @vessemer?

reubano commented 7 years ago

@WGierke good point... it doesn't look like that one does actual segmentation.

vessemer commented 7 years ago

Apologies for the delay, I was unwell last week. Thanks for PR #133, @WGierke!
You're certainly right about PR #132: I designed it to deal with resource-consumptive 3D manipulations. That code doesn't include lung segmentation of any kind :) Regarding this issue, I appreciate the approach of Julian de Wit. However, for the long term, it will be beneficial to deal with the cons of the convenient Hounsfield scale-based lungs segmentation used by de Wit, such as instability or a tendency toward false positives. For example, the work of van Rikxoort et al. describes an automatic error detection method based on the convex hull complement to the coastline of the lungs.

Furthermore, the method of S. Hu et al. enhances the junction line and then separates the lungs, though I've found it to be unreasonably resource-intensive. The bronchi/lung separation described in the paper by T. Kitasaka et al. would, I guess, also be valuable as an additional instrument for data augmentation. All of these thoughts are obviously out of scope for this issue. So, again, thanks for the PR!

reubano commented 7 years ago

However, for the long term, it will be beneficial to deal with the cons of the convenient Hounsfield scale-based lungs segmentation...

Great observation @vessemer! Do you mind creating an issue with this info? I think it would make a great enhancement that someone can focus on in one of the later milestones.

vessemer commented 7 years ago

Cool, I've opened a new issue, #138, dedicated to the segmentation of anatomical structures, for the later milestones.