mapbox / robosat

Semantic segmentation on aerial and satellite imagery. Extracts features such as: buildings, parking lots, roads, water, clouds
MIT License
2.01k stars, 382 forks

Uses EfficientNetB0 as segmentation model encoder backbone #178

Open daniel-j-h opened 4 years ago

daniel-j-h commented 4 years ago

For https://github.com/mapbox/robosat/issues/172 (see that issue for context) - this implements the EfficientNetB0 model as the encoder in our encoder-decoder architecture.

I'm currently training my EfficientNet model family (no h-swish, no squeeze-and-excitation) in

https://github.com/daniel-j-h/efficientnet

on ImageNet and want to see how the models behave as a backbone encoder in robosat. I'm inlining the EfficientNet implementation here without the h-swish, scSE, or implant code.

The encoder blocks are tiny compared to the previous ResNet50 blocks. If the EfficientNetB0 features are strong enough, we might want to spend some of the resources we gained on the decoder blocks, e.g. res-blocks or PixelShuffle+ICNR init for learned upsampling, scSE blocks, or simply more features.
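For readers unfamiliar with the PixelShuffle+ICNR idea mentioned above, here is a minimal pure-Python sketch of it on nested lists. It is not code from this changeset. The function names (`pixel_shuffle`, `nearest_upsample`, `icnr_like_features`) are hypothetical, and the channel ordering is assumed to follow the convention of PyTorch's `nn.PixelShuffle`. The point it illustrates: if every group of r*r channels holds identical values, which is what an ICNR-initialised convolution would produce at the start of training, then the pixel shuffle output equals plain nearest-neighbour upsampling.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumed channel ordering (as in PyTorch's nn.PixelShuffle):
    out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    assert channels % (r * r) == 0, "channel count must be divisible by r*r"
    out_channels = channels // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]
    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                for i in range(r):
                    for j in range(r):
                        out[c][h * r + i][w * r + j] = x[c * r * r + i * r + j][h][w]
    return out


def nearest_upsample(x, r):
    """Nearest-neighbour upsampling of a (C, H, W) nested list by factor r."""
    return [
        [[row[w // r] for w in range(len(row) * r)]
         for row in channel for _ in range(r)]
        for channel in x
    ]


def icnr_like_features(x, r):
    """Hypothetical stand-in for the output of an ICNR-initialised conv:
    each of the C channels is repeated r*r times, so every group of
    r*r consecutive channels is identical."""
    return [channel for channel in x for _ in range(r * r)]


# One 2x2 feature map, upsampled by a factor of 2.
features = [[[1, 2],
             [3, 4]]]
shuffled = pixel_shuffle(icnr_like_features(features, 2), 2)
upsampled = nearest_upsample(features, 2)
```

Here `shuffled` and `upsampled` are the same 4x4 map. That is the motivation usually given for ICNR: the learned upsampling starts out as nearest-neighbour upsampling rather than producing checkerboard patterns from randomly initialised sub-kernels.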

Needs thorough evaluation before merging; mainly opening this for visibility.

cc @ocourtin

daniel-j-h commented 4 years ago

Here's the core of this changeset:

https://github.com/mapbox/robosat/blob/4a3c1237bb2a0be99da7e17a4e220078ede86233/robosat/unet.py#L122-L157

https://github.com/mapbox/robosat/blob/4a3c1237bb2a0be99da7e17a4e220078ede86233/robosat/efficientnet.py#L91-L103