mapbox / robosat

Semantic segmentation on aerial and satellite imagery. Extracts features such as: buildings, parking lots, roads, water, clouds
MIT License

Adds feature pyramid attention (FPA) module, resolves #167 #168

Open daniel-j-h opened 5 years ago

daniel-j-h commented 5 years ago

For #167.

Adds Feature Pyramid Attention (FPA) module :boom: :rocket:

Pyramid Attention Network for Semantic Segmentation https://arxiv.org/abs/1805.10180

[Image: fpa-0] from https://arxiv.org/abs/1805.10180, Figure 2

[Image: fpa-1] from https://arxiv.org/abs/1805.10180, Figure 3
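For readers without the figures at hand, here is a minimal PyTorch sketch of the FPA structure as described in the paper: a 1x1 main branch, a 7x7 / 5x5 / 3x3 pyramid that produces a pixel-wise attention map, and a global pooling branch added on top. It is not the code from this pull request. The class name `FeaturePyramidAttention`, the helpers `conv_bn_relu` and `resize_like`, and the choice to keep the same channel count in every branch are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_relu(cin, cout, kernel_size, stride=1):
    # Padding of kernel_size // 2 means only the stride changes the spatial size.
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size, stride=stride,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )


def resize_like(x, ref):
    # Bilinear upsampling to the spatial size of `ref`.
    return F.interpolate(x, size=ref.shape[-2:], mode="bilinear", align_corners=False)


class FeaturePyramidAttention(nn.Module):
    """Hypothetical FPA sketch; the channel widths are an assumption."""

    def __init__(self, channels):
        super().__init__()
        # Global pooling branch: image-level context, broadcast back on addition.
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        # Main branch: plain 1x1 convolution on the input features.
        self.main = conv_bn_relu(channels, channels, 1)
        # Pyramid branch: downsample with 7x7, 5x5, 3x3 kernels (U-shape).
        self.down1 = conv_bn_relu(channels, channels, 7, stride=2)
        self.down2 = conv_bn_relu(channels, channels, 5, stride=2)
        self.down3 = conv_bn_relu(channels, channels, 3, stride=2)
        self.conv1 = conv_bn_relu(channels, channels, 7)
        self.conv2 = conv_bn_relu(channels, channels, 5)
        self.conv3 = conv_bn_relu(channels, channels, 3)

    def forward(self, x):
        d1 = self.down1(x)   # 1/2 resolution
        d2 = self.down2(d1)  # 1/4 resolution
        d3 = self.down3(d2)  # 1/8 resolution

        # Merge the pyramid levels from coarse to fine.
        p = self.conv3(d3)
        p = self.conv2(d2) + resize_like(p, d2)
        p = self.conv1(d1) + resize_like(p, d1)
        attention = resize_like(p, x)

        # Pixel-wise attention on the main branch, plus global context.
        return self.main(x) * attention + self.global_branch(x)


if __name__ == "__main__":
    fpa = FeaturePyramidAttention(channels=8)
    print(fpa(torch.randn(2, 8, 32, 32)).shape)
```

The module keeps the input shape, so it could sit between an encoder's last feature block and the decoder without changes to either side.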

Tasks

@ocourtin maybe this is interesting to you :)

daniel-j-h commented 4 years ago

By now we have https://arxiv.org/abs/1904.11492, which not only compares various attention mechanisms but also comes up with a framework for visual attention and proposes a new global context block within that framework.

I've implemented

for my 3d video models in https://github.com/moabitcoin/ig65m-pytorch/blob/706c9e737e42d98086b3af24548fb2bb6a7dc409/ig65m/attention.py#L9-L103

For the 2D segmentation case here, we can adapt the 3D code and then, e.g., use a couple of global context blocks on top of the last (high-level) ResNet feature blocks.
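A minimal sketch of what a 2D global context block could look like, following the structure in https://arxiv.org/abs/1904.11492: softmax attention pooling over all positions, a bottleneck transform with layer normalization, and a broadcast addition back onto the input. This is not the linked 3D implementation; the class name `GlobalContext2d` and the `reduction` parameter are assumptions for illustration.

```python
import torch
import torch.nn as nn


class GlobalContext2d(nn.Module):
    """Hypothetical 2D global context block; names and defaults are assumptions."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Context modeling: one attention logit per spatial position.
        self.attend = nn.Conv2d(channels, 1, kernel_size=1)
        # Transform: bottleneck with layer normalization on the pooled vector.
        self.transform = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.LayerNorm([hidden, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),
        )

    def forward(self, x):
        n, c, h, w = x.shape
        # Softmax over all h * w positions gives the pooling weights.
        weights = torch.softmax(self.attend(x).flatten(2), dim=-1)  # (n, 1, h*w)
        feats = x.flatten(2)                                        # (n, c, h*w)
        # Weighted sum of features: one global context vector per sample.
        context = torch.bmm(feats, weights.transpose(1, 2))         # (n, c, 1)
        context = context.view(n, c, 1, 1)
        # Fusion: the transformed context is broadcast-added to every position.
        return x + self.transform(context)


if __name__ == "__main__":
    # Stand-in for high-level ResNet features; the shape is only an example.
    features = torch.randn(2, 2048, 16, 16)
    blocks = nn.Sequential(GlobalContext2d(2048), GlobalContext2d(2048))
    print(blocks(features).shape)
```

Because the block preserves the input shape and is residual, a couple of them can be stacked after the last encoder stage without touching the decoder.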


[Image: attention] from https://arxiv.org/abs/1904.11492