GantMan / nsfw_model

Keras model of NSFW detector

Migrate to Tensorflow 2.1 #51

Closed TechnikEmpire closed 4 years ago

TechnikEmpire commented 4 years ago

In addition, with a simple argument (included in the scripts in this repo) you can easily convert the emitted model to TFJS using the official converter/optimizer.
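For anyone following along, here is a minimal sketch of what that conversion step looks like through the official tensorflowjs Python API, assuming the trained model was saved as a Keras .h5 file; the file names and output directory are placeholders, not the repo's actual script arguments.

```python
import tensorflow as tf
import tensorflowjs as tfjs

# Load the trained Keras model (placeholder path).
model = tf.keras.models.load_model("nsfw_mobilenet_v2_140_224.h5")

# Write a TFJS Layers-format model: model.json plus binary weight shards
# that tfjs can load in the browser.
tfjs.converters.save_keras_model(model, "web_model/")
```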

I have the fully trained mobilenet model described above temporarily here:

https://technikempire.com/nsfw_model_mobilenet_v2_140_224.zip

Please attach it to a release so everyone can have at it here on the repo.

The ReadMe and the fancy confusion matrix graphics will need to be updated. The prediction code also probably needs a little work to make it an importable module, if you still want that as it was with the TF1 version.

GantMan commented 4 years ago

I was looking at the size of the TFJS model. Is it much larger because it's MobileNet V2?

Should we have a quantized version? Look how small the quantized TFJS model was https://github.com/infinitered/nsfwjs/tree/master/example/nsfw_demo/public/quant_nsfw_mobilenet

I'll wait to hear back before I do the release.

TechnikEmpire commented 4 years ago

Yeah, I noticed the size as well. I think it's probably because of all the extra nodes (if you look at the frozen graph in Netron). V2 seems to use a much more complex structure than V1. Definitely worth having the quantized version or exploring additional optimization methods. I had to struggle like mad just to get V2 to export a proper frozen model that would function outside of TF, so I'm not sure what more can be done beyond quantization for the JS version.

If you can convert the layers to FP16 you can halve the model size. That's what I did for my purposes, but that was for an Intel OpenVINO model.
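For reference, one way to get a roughly half-size artifact without going through OpenVINO is post-training float16 quantization via the TFLite converter. This is only a sketch under the assumption that the trained Keras model is available as an .h5 file, and it produces a .tflite file rather than a TFJS model.

```python
import tensorflow as tf

# Placeholder path to the trained Keras model.
model = tf.keras.models.load_model("nsfw_mobilenet_v2_140_224.h5")

# Post-training float16 quantization: weights are stored as float16,
# roughly halving on-disk size with minimal accuracy loss.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_fp16_model = converter.convert()

with open("nsfw_mobilenet_v2_fp16.tflite", "wb") as f:
    f.write(tflite_fp16_model)
```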

mycaule commented 4 years ago

Very instructive Pull Request, great that you migrated the code!

You can also change the depth of the MobileNet V2 network, as it comes in various sizes (with already-quantized pre-trained models); see this query on TF Hub.

Mobilenets come in various sizes controlled by a multiplier for the depth (number of features) in the convolutional layers. They can also be trained for various sizes of input images to control inference speed. This TF Hub model uses the TF-Slim implementation of mobilenet_v2 with a depth multiplier of 1.3 and an input size of 224x224 pixels. This implementation of MobileNet V2 rounds feature depths to multiples of 8 (an optimization not described in the paper). Depth multipliers less than 1.0 are not applied to the last convolutional layer (from which the module takes the image feature vector).

You were using imagenet/mobilenet_v2_130_224 (20 MB, with a depth multiplier of 1.30 and an expected image size of 224x224). To meet @GantMan's file-size target, you could for example change the first value from 1.30 to 0.50 (imagenet/mobilenet_v2_050_224, 7 MB); that might reduce the TFJS model size by a factor of about 3.
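As an illustration, swapping in the smaller backbone is roughly a one-line change if the classifier head is built on a TF Hub feature-vector module. The hub handle's version suffix, the frozen backbone, and the five-class softmax head below are assumptions for the sketch, not the repo's actual training code.

```python
import tensorflow as tf
import tensorflow_hub as hub

# Smaller MobileNet V2 backbone (depth multiplier 0.50, 224x224 input).
# The module version suffix ("/4") is an assumption.
backbone = hub.KerasLayer(
    "https://tfhub.dev/google/imagenet/mobilenet_v2_050_224/feature_vector/4",
    trainable=False,
)

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
    backbone,
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(5, activation="softmax"),  # assuming 5 NSFW classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```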

From what I have learned of the "TensorFlow philosophy", it also looks easier to just reuse quantized pre-trained models than to quantize your own trained model (which requires more advanced knowledge).

It could also be interesting to have a Google Colab notebook for easy retraining: since you said it now requires less than an hour of compute time on a mainstream CPU, it should be even faster with their free GPU or TPU. The only problem is that the notebook would need a wget command pointing at the sensitive datasets. Ideally, users would expect to see the accuracy of every model served in this repo, since accuracy decreases with conversion and quantization of the original SavedModel.

TechnikEmpire commented 4 years ago

@mycaule Yeah, you make good points. I forgot I used the large model. For clarity, the ~1 hour training time is still on a GPU; I was using an RTX 2060. I found that the major bottleneck in training was the CPU having to constantly resize input images down to 224x224 as it fed the training process.
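To make that bottleneck concrete, here is a hedged sketch of a tf.data input pipeline where the per-image decode and resize to 224x224 runs on the CPU; parallelizing the map call (or resizing the dataset offline once) is what takes that work off the critical path. The file paths, labels, and batch size are placeholders, not the repo's actual pipeline.

```python
import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE  # tf.data.AUTOTUNE in newer TF releases

# Placeholder file list and labels; the real training script builds these
# from the dataset directories.
paths = tf.constant(["data/neutral/img_0001.jpg", "data/porn/img_0002.jpg"])
labels = tf.constant([2, 3])

def load_and_resize(path, label):
    # This decode + resize is the per-image CPU work that dominated
    # training time when feeding full-size images.
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [224, 224]) / 255.0
    return img, label

ds = (tf.data.Dataset.from_tensor_slices((paths, labels))
      .map(load_and_resize, num_parallel_calls=AUTOTUNE)
      .batch(32)
      .prefetch(AUTOTUNE))
```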