microsoft / Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
https://arxiv.org/abs/2103.14030
MIT License

Is multi-label classification for image classification supported? #265

Open Terry-Kusunoki-Martin opened 1 year ago

Terry-Kusunoki-Martin commented 1 year ago

I am trying to train an image classifier where image ground truth contains multiple classes. Is it possible to train a model that outputs multiple classes?

ancientmooner commented 1 year ago

Yes, it should not be hard to adapt the code to support multiple labels. You need to replace the original loss with either a sigmoid binary cross-entropy loss or a softmax-based soft cross-entropy loss.
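To make the two options concrete, here is a minimal plain-Python sketch of what each loss computes for a single sample. It is illustrative only and not taken from the repository. The function names are hypothetical, and normalising the multi-hot target so that it sums to 1 in the softmax variant is an assumption about how "soft" targets would be built. In PyTorch, the sigmoid variant corresponds to `torch.nn.BCEWithLogitsLoss` applied to multi-hot float targets.

```python
import math


def bce_with_logits(logits, targets):
    """Mean sigmoid binary cross-entropy over the classes of one sample.

    `targets` is a multi-hot list (1.0 for every label present, else 0.0).
    """
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def soft_cross_entropy(logits, targets):
    """Softmax cross-entropy against a soft target distribution.

    Assumption: the multi-hot target is normalised to sum to 1, so an image
    with two labels gets probability 0.5 on each of them.
    """
    norm = sum(targets)
    target_probs = [t / norm for t in targets]
    # log-sum-exp with the max subtracted for numerical stability
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(p * (z - log_z) for p, z in zip(target_probs, logits))


# One image carrying labels 0 and 2 out of 4 classes.
target = [1.0, 0.0, 1.0, 0.0]
good_logits = [4.0, -4.0, 4.0, -4.0]
bad_logits = [-4.0, 4.0, -4.0, 4.0]
print(bce_with_logits(good_logits, target), bce_with_logits(bad_logits, target))
print(soft_cross_entropy(good_logits, target), soft_cross_entropy(bad_logits, target))
```

The sigmoid variant scores every class independently, which is usually the more natural fit when the number of labels per image varies. The softmax variant makes the classes compete for probability mass.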

Terry-Kusunoki-Martin commented 1 year ago

@ancientmooner appreciate the reply!

So I would be modifying this block of code here to add a BCE loss function? https://github.com/microsoft/Swin-Transformer/blob/afeb877fba1139dfbc186276983af2abb02c2196/main.py#L109

I would also need to change the output layer to match the number of classes, right? Where would I make that modification?
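On the output-layer question: with a sigmoid loss, the classifier head would produce one logit per possible label, and each target would be a multi-hot vector of the same length. The sketch below shows that shape relationship in plain Python. `NUM_CLASSES`, `multi_hot`, and `predict_labels` are hypothetical names and do not come from the repository. The head size is assumed to be whatever value is passed as the model's `num_classes` argument, so it is worth checking the repository's config for the matching setting.

```python
import math

NUM_CLASSES = 5  # hypothetical: total number of distinct labels in the dataset


def multi_hot(label_ids, num_classes=NUM_CLASSES):
    """Encode the label indices present in one image as a multi-hot target."""
    target = [0.0] * num_classes
    for i in label_ids:
        target[i] = 1.0
    return target


def predict_labels(logits, threshold=0.5):
    """Return every class index whose sigmoid probability reaches `threshold`.

    sigmoid(z) >= t is equivalent to z >= log(t / (1 - t)), which avoids
    overflow for very negative logits. Requires 0 < threshold < 1.
    """
    cutoff = math.log(threshold / (1.0 - threshold))
    return [i for i, z in enumerate(logits) if z >= cutoff]


# The head emits NUM_CLASSES logits; the target has the same length.
target = multi_hot([0, 3])
logits = [2.0, -1.0, 0.5, -3.0, 0.0]
assert len(logits) == len(target) == NUM_CLASSES
print(target, predict_labels(logits))
```

At inference time there is no argmax: every class above the threshold is reported, so a single image can return zero, one, or several labels.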