Babies often trigger false positives

wingman-jr-addon commented 4 years ago

Need to add more babies into the training data!

TechnikEmpire commented 4 years ago

That won't work. You need to train an object detector rather than full-context binary classification.

wingman-jr-addon commented 4 years ago

I've been sticking with the full-context approach because an object detector would be unlikely to catch some of the types of NSFW content I'm looking for, which depend almost entirely on relative location; this is similar to the issues around random crop. For example, whether a photo of a volleyball player is SFW or NSFW may come down primarily to the framing of the photo. What might work someday is an object detector with an additional layer that looks at the relative contexts of the boxes. For now, though, I'm running a MobileNetV2-based network.

Babies do have certain interesting features that are rather unlike adults or even children. Their facial structures are different, their skin is more rounded, and their typical photo contexts often resemble unsafe situations if you squint hard enough. I'll see how far I can get with the full-context approach.

TechnikEmpire commented 4 years ago

I'll let you know what kind of precision I get when I can.

wingman-jr-addon commented 4 years ago

Ok, with the recent update to model SQRX62 I'm planning to close this. To resolve it, I added several thousand baby, pregnancy, and family themed photos as hard negatives. Additionally, I added a significant number of pregnancy themed NSFW images into the mix. Pregnant bellies and breastfeeding photos are treated on more of a case-by-case basis, ranging from fine to NSFW. Overall, the results show a significant reduction in baby and family photos triggering false positives, with more mixed results on pregnant belly photos.

TechnikEmpire commented 4 years ago

Are you doing transfer learning? I'm trying to train a binary classifier from scratch (MobileNet V2), and despite having millions of images that are all correctly labeled, I can't get the model to converge. In fact, after several hours of training the loss skyrockets. Thanks for any input you have. I am waiting on fine-tuning right now to see if it works, but I thought I'd ask.

wingman-jr-addon commented 4 years ago

Yes, my released model uses transfer learning, starting from TF 2.0's ImageNet-trained 224x224 MobileNet V2. However, I have also been working on training from scratch as my dataset has grown, and it's getting closer to accuracy parity. (I'd like to get to the point of training from scratch so that I can use custom sizes, e.g. 300x300.) Training from scratch has generally converged, but it suffered from more problems when my dataset was smaller and in particular when my LR was too low. Training from scratch seems to work better with an Adam learning rate of e.g. 0.005, instead of something in the ballpark of 0.00003 for fine-tuning. I still haven't done much training from scratch; I only have a TX1 for training, so my experimentation is limited.
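For anyone comparing the two setups, here is a minimal sketch of how they might be wired up in Keras. Only the learning rates, the 224x224 input, and the ImageNet starting point come from this thread; the names `pick_learning_rate` and `build_model`, the pooling choice, the single dense head, and the stock cross-entropy loss are assumptions for illustration, not the released model's actual code.

```python
# Learning rates quoted in the thread; everything else here is illustrative.
FINETUNE_LR = 0.00003   # ballpark used for fine-tuning ImageNet weights
SCRATCH_LR = 0.005      # worked better when training from scratch
NUM_CLASSES = 4         # safe, questionable, racy, explicit


def pick_learning_rate(from_scratch):
    """Return the Adam learning rate for the chosen training mode."""
    return SCRATCH_LR if from_scratch else FINETUNE_LR


def build_model(from_scratch=False, image_size=224):
    """Build a MobileNetV2 classifier (hypothetical helper, not the project's code)."""
    # Imported lazily so the helper above can be used without TensorFlow installed.
    import tensorflow as tf

    base = tf.keras.applications.MobileNetV2(
        input_shape=(image_size, image_size, 3),
        include_top=False,                           # drop the 1000-class ImageNet head
        weights=None if from_scratch else "imagenet",
        pooling="avg",                               # assumption: global average pooling
    )
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
    model = tf.keras.Model(base.input, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            learning_rate=pick_learning_rate(from_scratch)
        ),
        # Assumption: plain cross-entropy stands in for the custom loss.
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Something like `build_model(from_scratch=True, image_size=300)` would correspond to the custom-size case, since pretrained weights are tied to the input sizes they were published for.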

In both methods, the absolute accuracy is still poorer than I would like, but the generalization is better than accuracy alone would indicate because I use 4 classes instead of a binary classifier (safe, questionable, racy, explicit) along with a custom loss function that penalizes in an asymmetric, gradated way. That may be helping to train to convergence?
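The custom loss itself isn't shown in this thread, so the following is only one plausible reading of "asymmetric, gradated": plain cross-entropy plus an expected-cost term whose penalty grows with the distance between the true and predicted class and is weighted differently in each direction. The function names, the weights `under` and `over`, and the choice of which direction is penalized more heavily are all assumptions.

```python
import math

CLASSES = ["safe", "questionable", "racy", "explicit"]  # ordered by severity


def penalty_matrix(n=4, under=2.0, over=1.0):
    """Build cost[true][pred].

    The cost grows linearly with class distance (gradated). Predictions below
    the true class are scaled by `under`, predictions above it by `over`
    (asymmetric). Both weights are hypothetical values.
    """
    return [
        [(under if p < t else over) * abs(p - t) for p in range(n)]
        for t in range(n)
    ]


def asymmetric_loss(probs, true_idx, cost=None, eps=1e-12):
    """Cross-entropy plus the expected penalty under the predicted distribution."""
    if cost is None:
        cost = penalty_matrix(len(probs))
    cross_entropy = -math.log(max(probs[true_idx], eps))
    expected_cost = sum(c * p for c, p in zip(cost[true_idx], probs))
    return cross_entropy + expected_cost


# An explicit image mostly scored as safe costs more than the reverse mistake.
miss = asymmetric_loss([0.7, 0.1, 0.1, 0.1], CLASSES.index("explicit"))
false_alarm = asymmetric_loss([0.1, 0.1, 0.1, 0.7], CLASSES.index("safe"))
```

With a loss shaped like this, a near miss (safe vs. questionable) is penalized less than a far miss (safe vs. explicit), which gives the network a smoother signal than a hard binary label.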

wingman-jr-addon / wingman_jr

Babies often trigger false positives #22