Challenge inference pipeline

qAp / sartorius_cell_instance_segmentation_kaggle

Solution for Sartorius Cell Instance Segmentation Kaggle


Challenge inference pipeline #13

Open qAp opened 2 years ago

qAp commented 2 years ago

[x] From image to instance mask in Notebook.

qAp commented 2 years ago

Should padding be with 0 or 'symmetric'?
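For reference, the two options differ only in what fills the border. A minimal sketch using NumPy's `np.pad`, assuming a single-channel image that is padded at the bottom and right so its sides become divisible by some stride. The helper name `pad_to_multiple` and the `multiple` parameter are made up for illustration and are not taken from the repo.

```python
import numpy as np

def pad_to_multiple(img, multiple=32, mode="constant"):
    """Pad a 2D image at the bottom/right so both sides are divisible by `multiple`."""
    h, w = img.shape
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    return np.pad(img, ((0, pad_h), (0, pad_w)), mode=mode)

# Toy 2x3 image padded up to 4x4.
img = np.arange(1, 7, dtype=np.float32).reshape(2, 3)
zero_padded = pad_to_multiple(img, multiple=4, mode="constant")   # border filled with 0
sym_padded = pad_to_multiple(img, multiple=4, mode="symmetric")   # border mirrors the edge pixels
```

With 0 padding the border looks like background; with 'symmetric' the border repeats mirrored image content, so the padded region has to be cropped away again after prediction.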

qAp commented 2 years ago

[x] Use the GPU.
[ ] Process batches.

qAp commented 2 years ago

Should the semg fed into the WN be just cells, or cells + touching walls? And should it be dilated?

qAp commented 2 years ago

[x] Test time augmentation.
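A minimal sketch of flip-based test time augmentation, as one possible reading of this item. It assumes the model can be wrapped as a function `predict_fn` mapping an `(H, W)` or `(H, W, C)` array to a prediction of the same spatial size; `predict_fn`, `predict_with_tta` and the choice of flips are assumptions for illustration, not the augmentations actually used here.

```python
import numpy as np

def predict_with_tta(predict_fn, img):
    """Average predictions over the identity, a horizontal flip and a vertical flip."""
    flips = [
        lambda x: x,           # identity
        lambda x: x[:, ::-1],  # horizontal flip
        lambda x: x[::-1, :],  # vertical flip
    ]
    preds = []
    for flip in flips:
        pred = predict_fn(flip(img))
        preds.append(flip(pred))  # each flip is its own inverse, so this undoes it
    return np.mean(preds, axis=0)

# Stand-in "model" so that the sketch runs end to end.
toy_model = lambda x: x * 2.0
img = np.random.default_rng(0).random((4, 6))
averaged = predict_with_tta(toy_model, img)
```

The stand-in model is flip-equivariant, so averaging changes nothing in this toy case; with a real network the flipped predictions differ slightly and the average smooths them.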

qAp commented 2 years ago

After the Unet and before the WN, several ways of processing semseg into semg have been tried. Judging at a glance from the output wngy, simply using semseg[..., 0] as semg appears to resolve more instances than the alternatives, such as semseg[..., 0] + semseg[..., 1], or the binary dilation of that sum.

This kind of makes sense, because when the overlap walls, semseg[..., 1], are added, or when things are dilated, cells that are originally separate in semseg[..., 0] may merge together, making it more difficult for the WN to separate them.
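The merging effect can be reproduced on a toy example. The sketch below assumes a boolean semseg of shape `(H, W, 2)` with channel 0 for cells and channel 1 for overlap walls, uses a logical OR in place of the `+`, and counts connected blobs with `scipy.ndimage.label`. The array contents and the 3x3 structuring element are made up for illustration.

```python
import numpy as np
from scipy import ndimage

# Toy semseg: two cells separated by a one-pixel-wide overlap wall.
semseg = np.zeros((5, 7, 2), dtype=bool)
semseg[1:4, 1:3, 0] = True   # left cell
semseg[1:4, 4:6, 0] = True   # right cell
semseg[1:4, 3, 1] = True     # overlap wall between them

cells_only = semseg[..., 0]
cells_plus_walls = semseg[..., 0] | semseg[..., 1]
dilated = ndimage.binary_dilation(cells_only, structure=np.ones((3, 3), dtype=bool))

def count_blobs(mask):
    """Number of 4-connected components in a binary mask."""
    _, n = ndimage.label(mask)
    return n

n_cells_only = count_blobs(cells_only)        # 2: the cells stay separate
n_plus_walls = count_blobs(cells_plus_walls)  # 1: the wall bridges the gap
n_dilated = count_blobs(dilated)              # 1: dilation closes the gap
```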

qAp commented 2 years ago

After some rough experimentation:

it appears that TTA helps resolve more cells
watershed cut threshold should be 1
minimum object size = 20 is likely too large; try 1 to 4.
binary dilation selem size: try 2 to 3.
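How these three knobs fit together can be sketched as below. This is only an illustration under assumptions: the energy map is taken to be an integer array where 0 is background and "cutting at level 1" means keeping pixels with energy >= 1, and `scipy.ndimage` is swapped in for whatever morphology routines the pipeline actually uses. The function name `energy_to_instances` and its parameter names are hypothetical.

```python
import numpy as np
from scipy import ndimage

def energy_to_instances(energy, cut=1, min_size=4, selem_size=2):
    """Cut an integer energy map, label the blobs, drop small ones, then grow the rest."""
    labels, n = ndimage.label(energy >= cut)
    # Remove objects with fewer than `min_size` pixels.
    sizes = np.bincount(labels.ravel(), minlength=n + 1)
    too_small = sizes < min_size
    too_small[0] = False                 # label 0 is the background
    labels[too_small[labels]] = 0
    # Grow the surviving instances. Grey dilation on the label image is a max
    # filter, so where two instances meet the larger label id wins.
    selem = np.ones((selem_size, selem_size), dtype=bool)
    return ndimage.grey_dilation(labels, footprint=selem)

# Toy energy map: one 3x3 cell and one single-pixel speck.
energy = np.zeros((8, 8), dtype=int)
energy[1:4, 1:4] = 1
energy[6, 6] = 1
instances = energy_to_instances(energy, cut=1, min_size=4, selem_size=2)
```

In the toy map the speck is removed by the size filter and the 3x3 cell grows to 4x4 under the 2x2 selem.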

qAp commented 2 years ago

[x] Actually, maybe try increasing the minimum object size; there may be too many false positives.
[x] Submit with latest Unet and WN.

qAp commented 2 years ago

Increasing minimum object size to 10 did indeed improve the score, by ~ 0.002.
Training the models more increased the score by 0.025.

qAp commented 2 years ago

[x] Train DN, WTN, and then WN as much as possible. (Unet does not seem to be improving with training anymore.)

qAp commented 2 years ago

Removing small holes doesn't improve the score, so can leave it either on or off.

qAp commented 2 years ago

Including the background in the watershed energy loss has improved the submission score from 0.211 to 0.221

qAp commented 2 years ago

Changing the selem size from 2 to 3 increases the score from 0.221 to 0.235.

qAp commented 2 years ago

Following https://github.com/qAp/sartorius_cell_instance_segmentation_kaggle/issues/2#issuecomment-1002644193, the background has been treated as another class in addition to the 'cell' class.

This improved the submission score from 0.235 to 0.237.

qAp commented 2 years ago

When the watershed energy map is cut at a higher level, more instances are obtained. This is helpful in resolving cells that are lengthy and that tend to tangle together with each other.

However, cutting at a higher level means that cells become smaller, and selecting a larger binary dilation selem might help compensate for this.

It helps to compare the number of ground truth cells with the number of predicted cells when selecting the energy level at which to cut. The amount of dilation can sort of be eyeballed. (Obviously a better way is to compute the competition metric, but at this stage, there's not enough time.)
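The count comparison can be sketched roughly as follows, assuming an integer energy map where cutting at a level keeps pixels with energy >= that level, and counting blobs with `scipy.ndimage.label`. The helper names `count_instances_per_level` and `best_cut_level` are hypothetical.

```python
import numpy as np
from scipy import ndimage

def count_instances_per_level(energy, levels):
    """Number of connected blobs left after cutting the energy map at each level."""
    return {level: ndimage.label(energy >= level)[1] for level in levels}

def best_cut_level(energy, n_true, levels):
    """Pick the level whose instance count is closest to the ground truth count."""
    counts = count_instances_per_level(energy, levels)
    return min(levels, key=lambda level: abs(counts[level] - n_true))

# Toy energy map: two touching cells that share level 1 but have separate level-2 cores.
energy = np.zeros((5, 9), dtype=int)
energy[1:4, 1:8] = 1
energy[2, 2] = 2
energy[2, 6] = 2

counts = count_instances_per_level(energy, levels=[1, 2, 3])  # {1: 1, 2: 2, 3: 0}
level = best_cut_level(energy, n_true=2, levels=[1, 2, 3])    # 2
```

Matching counts says nothing about shape quality, which is why the competition metric would be the better criterion.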

Trying this to see how the submission fares.

qAp commented 2 years ago

Cutting at level > 1 reduces the score by too much, so not going with this. Perhaps even though more instances are resolved, their shapes are deformed so much that the IOU suffers, e.g. elongated shapes become nearly square or circular.
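A toy calculation of how much the IOU can drop when an elongated cell comes out as a compact blob. The shapes below are invented for illustration and are not taken from the data.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

# Ground truth: an elongated 2x12 cell.
gt = np.zeros((10, 12), dtype=bool)
gt[4:6, :] = True

# Prediction: a 4x4 blob in the middle, as might result from a high cut plus dilation.
pred = np.zeros((10, 12), dtype=bool)
pred[3:7, 4:8] = True

score = iou(gt, pred)  # intersection 8, union 32 -> 0.25
```

So the instance is found, but with an IOU this low it would hardly count as a match.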

qAp commented 2 years ago

For the final submissions, the highest scoring submission from each workflow is selected: one from the background-inclusive workflow and one from the background-exclusive workflow.