qAp / sartorius_cell_instance_segmentation_kaggle

Solution for Sartorius Cell Instance Segmentation Kaggle

Semantic segmentation (Unet) #7

Open qAp opened 2 years ago

qAp commented 2 years ago

SMP example: https://github.com/qubvel/segmentation_models.pytorch/blob/master/examples/cars%20segmentation%20(camvid).ipynb

2018 Data Science Bowl example: https://github.com/selimsef/dsb2018_topcoders/

qAp commented 2 years ago

Training using targets generated by: https://github.com/qAp/sartorius_cell_instance_segmentation_kaggle/commit/e2987eed229d124209319843f51930b114161387

qAp commented 2 years ago

Reduced the probability of albumentations in general, following Selim.
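A minimal numpy sketch of the idea (the repo uses albumentations; the transform and probability here are illustrative, not the actual config): each transform is gated behind a reduced probability, so most samples pass through unchanged.

```python
import numpy as np

def maybe_apply(transform, p, image, mask, rng):
    """Apply `transform` to (image, mask) with probability `p`, else pass through."""
    if rng.random() < p:
        return transform(image, mask)
    return image, mask

def hflip(image, mask):
    """Horizontal flip applied identically to image and mask."""
    return image[:, ::-1], mask[:, ::-1]

rng = np.random.default_rng(0)
img = np.arange(12).reshape(3, 4)
msk = img % 2
# With a low p, the sample usually comes back unchanged.
out_img, out_msk = maybe_apply(hflip, 0.2, img, msk, rng)
```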

qAp commented 2 years ago

Training in a way that follows more closely what Selim used for the 2018 Data Science Bowl, using 2+ channels and a mix of dice and cross entropy loss.
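A sketch of what a mixed dice + cross-entropy loss can look like, written in numpy for clarity (the actual training code uses PyTorch tensors and Selim's weighting, which may differ from the 50/50 split assumed here):

```python
import numpy as np

def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft dice loss per channel, averaged. probs, targets: (C, H, W) in [0, 1]."""
    inter = (probs * targets).sum(axis=(1, 2))
    denom = probs.sum(axis=(1, 2)) + targets.sum(axis=(1, 2))
    return float(np.mean(1.0 - (2.0 * inter + eps) / (denom + eps)))

def cross_entropy_loss(probs, targets, eps=1e-6):
    """Pixelwise cross entropy against one-hot targets."""
    return float(-np.mean(np.sum(targets * np.log(probs + eps), axis=0)))

def combined_loss(probs, targets, w_dice=0.5, w_ce=0.5):
    """Weighted mix of dice and cross entropy."""
    return (w_dice * soft_dice_loss(probs, targets)
            + w_ce * cross_entropy_loss(probs, targets))
```

Dice handles the class imbalance between cell and background pixels, while cross entropy keeps gradients well-behaved early in training.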

qAp commented 2 years ago

https://wandb.ai/qap/sartorius_semseg/runs/2p31xrf8?workspace=user-qap

qAp commented 2 years ago

The metrics are better when data augmentation is off: https://wandb.ai/qap/sartorius_semseg/runs/29aw3ihi?workspace=user-qap

Maybe the overlap borders are too thin and can disappear under certain transforms?

Is heavy data augmentation really needed here, given that all the images are taken through a lens perpendicular to a horizontal glass slide under a microscope?

qAp commented 2 years ago

Multiplying the learning rate by 0.2 every 30 epochs, starting from 1e-4, doesn't train well: https://wandb.ai/qap/sartorius_semseg/runs/3ja7fqm9?workspace=user-qap

Seems like a constant 1e-4 is the best.
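The schedule being compared against the constant 1e-4 is a StepLR-style decay; a dependency-free sketch of the rule (PyTorch's `torch.optim.lr_scheduler.StepLR` implements the same formula):

```python
def step_lr(epoch, base_lr=1e-4, gamma=0.2, step_size=30):
    """StepLR-style schedule: multiply the lr by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)
```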

qAp commented 2 years ago

That switching on data augmentation harms the metric for the first 100 epochs is expected, and not necessarily a bad thing: just as turning off overfit_batches also harms the metric, the effective dataset is larger, so one shouldn't expect performance that is as good.

That stepping the learning rate down from 1e-4 harms the metric when overfitting batches suggests it will probably do the same when fitting the whole dataset.

Could try:

qAp commented 2 years ago

OneCycleLR performs slightly worse than StepLR.

It doesn't look like the cells' overlap borders disappear completely under the train augmentation transforms. Nevertheless, another set of semantic segmentation targets has been generated with wider overlap borders (square(width=5)). Training on these to see if there's a difference from before.

qAp commented 2 years ago

The SH-SY5Y cells' shapes are such that the background's shape can look very similar to theirs. Maybe by encouraging the model to identify the background as well, it will help overall performance.
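One way to let the model supervise the background explicitly is to add it as a third target channel. A sketch, with the channel ordering taken from the post-processing comment below (0 = cell, 1 = border, 2 = background); the repo's actual target-generation code may differ:

```python
import numpy as np

def make_targets(cell, border):
    """Stack (cell, border, background) channels from binary masks.

    Every pixel belongs to exactly one channel, so the result is a valid
    one-hot target for softmax training.
    """
    cell = cell.astype(bool)
    border = border.astype(bool)
    background = ~(cell | border)
    # Border takes precedence over cell where the two overlap.
    return np.stack([cell & ~border, border, background]).astype(np.float32)
```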

qAp commented 2 years ago

It's sometimes hard to spot certain cells with the naked eye in augmented samples, but in most cases it's fine, so maybe it's ok to include the current augmentation.

qAp commented 2 years ago

Training with softmax is much better than training with sigmoid, although the IoU increases only slowly with epoch. https://wandb.ai/qap/sartorius_semseg/runs/1101flut?workspace=user-qap

The resulting model performs better in general, but it evidently still struggles with the SH-SY5Y type.

qAp commented 2 years ago

When the softmax Unet is used for inference, what post-processing can be used? Channel 0 is cell, channel 1 is border, and channel 2 is background. The following are options:

  1. Sum channel 0 and channel 1 to get final semantic segmentation.
  2. Ignore channel 1, because the model can predict an overlap border where there really isn't one. Dilate the cell segments and use that as the final semantic segmentation.
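Both options can be sketched in a few lines of numpy (thresholds and the dilation width here are illustrative assumptions, not tuned values):

```python
import numpy as np

def _dilate(mask, width=3):
    """Binary dilation with a width x width square footprint."""
    r = width // 2
    padded = np.pad(mask, r)
    out = np.zeros_like(mask)
    H, W = mask.shape
    for dy in range(width):
        for dx in range(width):
            out |= padded[dy : dy + H, dx : dx + W]
    return out

def postprocess(probs, option=1, dilate_width=3):
    """probs: (3, H, W) softmax output; ch0=cell, ch1=border, ch2=background."""
    if option == 1:
        # Option 1: cells and borders together form the foreground.
        return (probs[0] + probs[1]) > 0.5
    # Option 2: trust only the cell channel, then dilate to recover pixels
    # the model may have assigned to (possibly spurious) borders.
    return _dilate(probs[0] > 0.5, dilate_width)
```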