Ground/Flood masks TO DO List

melisandeteng commented 4 years ago

Notes: ✅ means finished, ❌ means canceled

Deeplab models comparison

[x] Compare Deeplabv3+ and Deeplabv2 for ground segmentation
[x] Compare Deeplabv3+ and Deeplabv2 when adding hand rule postprocessing
[ ] Finetune Deeplabv2 for flood segmentation

Mask generator

[x] Training on Watchdogs (WIP)
[x] Add resume training option
[x] Training on Watchdogs + Real
[x] Put depth input as an array, not the image of the inverse log depth
[x] Fix image log

melisandeteng commented 4 years ago

Deeplab versions

We currently use Deeplab v2 pretrained on Cityscapes for ground segmentation in images. The only available pretrained Deeplab v3 models were trained on Pascal VOC (not useful for us, since it has no ground or water classes, just persons, cars, ...), except for one Deeplabv3+-MobileNet model pretrained on Cityscapes. We will compare this model with the current Deeplabv2 on the images of Canadian landmarks. We will use the same class merging as here and merge road, sidewalk and terrain.
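A minimal sketch of the road/sidewalk/terrain merge, assuming the model outputs a per-pixel map of standard Cityscapes train IDs (road = 0, sidewalk = 1, terrain = 9). The function name `ground_mask` and the sample array are hypothetical and only illustrate the idea; the linked merging code may differ.

```python
import numpy as np

# Cityscapes train IDs for the classes merged into "ground" (assumed label convention).
ROAD, SIDEWALK, TERRAIN = 0, 1, 9
GROUND_IDS = (ROAD, SIDEWALK, TERRAIN)


def ground_mask(label_map: np.ndarray) -> np.ndarray:
    """Return a boolean mask that is True wherever the predicted class is ground."""
    return np.isin(label_map, GROUND_IDS)


# Toy 3x3 prediction: sky (10), vegetation (8), car (13) are not ground.
pred = np.array([[10, 10, 8],
                 [0, 1, 9],
                 [0, 0, 13]])
mask = ground_mask(pred)
```

Here `mask` is True on the second row and on the first two pixels of the third row, and False elsewhere.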

melisandeteng commented 4 years ago

Deeplab v2 vs Deeplab v3+MobileNet

Code in the Deeplabv3 repo.

We tested a Deeplabv3+MobileNet pretrained on Cityscapes (see this repo) on 30 images of Canadian landmarks.

DeeplabV3+MobileNet seems to perform better overall than v2. It is better at going around thin objects (poles, trees) but identifies ground in the sky and in bushes (for which we should try some postprocessing). The overlays of masks + images comparing the versions (top is v2, bottom is v3 in each image) are here.

Here are some samples (in each pair, top is v2 computed on images at 256 px resolution, bottom is v3):

vict0rsch commented 4 years ago

Definitely better :)

vict0rsch commented 4 years ago

@melisandeteng can you share fail-cases?

melisandeteng commented 4 years ago

The last one is sort of a failure case.

melisandeteng commented 4 years ago

We actually need to compare masks generated by the Deeplab versions on images of the same resolution. We compute Deeplabv3 on images resized to 513 px and compare it with Deeplabv2 computed on images of resolution 600 px. Overlays of masks + images are here. V3 seems to leave less room around objects it should contour (especially thin objects), which is good, but sometimes adds more misdetections in the sky. Here are some samples (v2 on top, v3 at the bottom).

Success case:

Failure cases:

melisandeteng commented 4 years ago

We compare Deeplabv2 and Deeplabv3 on 20 images that we labeled with Labelbox. We infer on the images resized so that the longest side is 600 px.
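As a rough illustration of "resized so that the longest side is 600 px", the target dimensions can be computed as below. The helper name `resize_longest_side` is hypothetical, and rounding to the nearest pixel is an assumption; the actual preprocessing code may round differently.

```python
from typing import Tuple


def resize_longest_side(width: int, height: int, target: int = 600) -> Tuple[int, int]:
    """Scale (width, height) so the longest side equals `target`, keeping aspect ratio."""
    longest = max(width, height)
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    return new_w, new_h


landscape = resize_longest_side(1200, 800)   # (600, 400)
portrait = resize_longest_side(800, 1200)    # (400, 600)
```

The resulting size can then be passed to whatever image library does the actual resampling.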

Pixel accuracy: v2: 0.978 v3: 0.973

IOU: v2: 0.917 v3: 0.914
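For reference, a sketch of how the two metrics can be computed for a single binary ground mask. It assumes IoU is taken over the ground class only; the thread does not say whether the reported numbers were averaged per image or pooled over all pixels, so this is not necessarily the exact procedure used. The function names and toy masks are illustrative.

```python
import numpy as np


def pixel_accuracy(pred, gt) -> float:
    """Fraction of pixels where the predicted mask agrees with the ground truth."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    return float((pred == gt).mean())


def iou(pred, gt) -> float:
    """Intersection over union of the positive (ground) class."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement (a convention)
    return float(np.logical_and(pred, gt).sum() / union)


gt = [[0, 0, 0, 0],
      [1, 1, 1, 1]]
pred = [[0, 0, 0, 1],
        [1, 1, 1, 0]]
acc = pixel_accuracy(pred, gt)  # 6 of 8 pixels agree
score = iou(pred, gt)           # intersection 3, union 5
```

Because most pixels in a mostly-ground or mostly-background image agree trivially, pixel accuracy tends to sit close to 1, which is one reason small visual differences around thin objects barely move these numbers.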

Although the metrics are slightly better with v2, masks generated with v3 visually have fewer obvious holes than those from v2 and go around thin objects better. See here for the comparison on all the images: in each image, original (ground truth), v2, and v3 are displayed.

Here are some examples:

Where both models fail:

@sashavor @vict0rsch @51N84D what do you think?

sashavor commented 4 years ago

Is there another metric we can use, like the one from that paper I found? I'll send it again to ML-core.



vict0rsch commented 4 years ago

The Objective Evaluation of Image Object Segmentation Quality
