bumble-tech / private-detector

Bumble's Private Detector - a pretrained model for detecting lewd images
https://medium.com/bumble-tech/bumble-inc-open-sources-private-detector-and-makes-another-step-towards-a-safer-internet-for-women-8e6cdb111d81
Apache License 2.0

Multi-Label classification #16

Open MartinDawson opened 2 months ago

MartinDawson commented 2 months ago

Hi, is it possible to extend this to do multi-label classification, to detect what type of nudity is shown? Or is it just not designed for that?

Thanks.

Steeeephen commented 2 months ago

Hmm, well, internally we use a different model for multilabel NSFW image detection, so we've never really needed to use the Private Detector for anything other than a simple yes/no.

I imagine you could use the weights from this model as your base model and finetune a new model on a multilabel dataset; you may just need to swap out the classification head after loading the checkpoint. The code should in theory also work for multilabel out of the box, though, so if you just want an EfficientNet base finetuned on multilabel NSFW, you may just need to swap in your own dataset and run the training script.
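To make the "swap out the classification head" point concrete, here is a rough, framework-free sketch of how the two kinds of head differ at prediction time. It is an illustration only, not the Private Detector's code: the function names, the example logits, and the 0.5 threshold are all assumptions. A single-label head usually applies a softmax and picks one class, while a multi-label head usually applies an independent sigmoid per label and keeps every label above a threshold.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (single-label head)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Squash one raw score into an independent probability (multi-label head)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label_prediction(logits):
    """Return the index of the single most likely class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def multi_label_prediction(logits, threshold=0.5):
    """Return the indices of every label whose probability reaches the threshold.

    The 0.5 threshold is an assumed default; in practice it is tuned per label.
    """
    return [i for i, x in enumerate(logits) if sigmoid(x) >= threshold]


# Hypothetical raw scores for three labels.
example_logits = [2.0, 1.5, -3.0]
print(single_label_prediction(example_logits))  # one class index
print(multi_label_prediction(example_logits))   # zero or more label indices
```

With a real model, the same distinction usually shows up as the choice of final activation and loss when the new head is attached to the pretrained base.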

The input dataset should look something like:

{
    "label 1": {
        "path": "/home/sofarrell/private_detector/label_1.txt",
        "label": 0
    },
    "label 2": {
        "path": "/home/sofarrell/private_detector/label_2.txt",
        "label": 1
    },
    "label 3": {
        "path": "/home/sofarrell/private_detector/label_3.txt",
        "label": 2
    },
    ... etc
}
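
If you have a lot of labels, a small script can generate a file in the shape shown above rather than writing it by hand. This is a minimal sketch, not part of the repo: the helper name `build_dataset_config`, the `/data/...` paths, and the label names are all made up for illustration, and integer labels are simply assigned in list order starting from 0.

```python
import json


def build_dataset_config(label_files):
    """Build the label -> {"path", "label"} mapping from ordered (name, path) pairs.

    Each name gets an integer label equal to its position in the list.
    """
    return {
        name: {"path": str(path), "label": index}
        for index, (name, path) in enumerate(label_files)
    }


# Hypothetical label names and list-file paths; substitute your own.
config = build_dataset_config([
    ("label 1", "/data/label_1.txt"),
    ("label 2", "/data/label_2.txt"),
    ("label 3", "/data/label_3.txt"),
])

# Serialise to JSON text; write this string to whatever file the training script reads.
config_json = json.dumps(config, indent=4)
print(config_json)
```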