wingman-jr-addon / model

The model for filtering NSFW images backing the Wingman Jr. plugin: https://github.com/wingman-jr-addon/wingman_jr
Creative Commons Zero v1.0 Universal

Instructions for generating custom models #5

Closed wingman-jr-addon closed 8 months ago

wingman-jr-addon commented 1 year ago

Over on the Wingman Jr. repository someone asked:

Are there instructions for replacing the model with a different trained model? Something like open-sourcing the tools to generate the model but without the training content?

I requested that the model half of that get split to here.

wingman-jr-addon commented 1 year ago

@anaisbetts Let's continue the discussion here on generating the model.

wingman-jr-addon commented 1 year ago

So to generate the model, there are a few different components as of the latest SQRXR 112. There are no full instructions, and the source is not released for many of these parts, but I can briefly describe them and then we can discuss what might be useful. Among other things, there is a suite of hacky .NET tools for working with the dataset.

  1. The dataset. I cannot release the images themselves for copyright reasons; unfortunately, even the URLs may contain sensitive information, as a number of them are obtained directly via browsing.
  2. A basic Winforms app for grading input images and creating the graded dataset.
  3. A special branch of Wingman Jr. (https://github.com/wingman-jr-addon/wingman_jr/tree/submit-image) that submits images from a browsing session to a local webserver tied in with the .NET tool. (See https://github.com/wingman-jr-addon/wingman_jr/blob/df0f00ef67c360e1833a872740e880cbafd7bd42/background.js#L219) The webserver ingests the image and stores it to be graded by the Winforms tool.
  4. Once the dataset is graded, the images and a QA CSV file are pushed over to the GPU server for training. For me this is an old Supermicro with a K80 in my basement. An important script at this step is a simple one that opens each image using TensorFlow and blacklists the image if it can't be parsed. Otherwise you get nasty exceptions that can halt training partway through on longer runs.
  5. The primary classification training script itself. Note that this is the piece that has changed most heavily; SQRXR 112 refers to this being the 112th experiment in the SQRXR family of experiments. These are all obviously not checked in, but the script is quite a hodge-podge of things. The output of this is the first stage that produces the SQRX classifier part of the model.
  6. A secondary training script that produces the "R" regression part of the model by optimizing for the ROC AUC, using the last feature layer of the main classifier script as an input. Another important output of this script is a CSV that contains the threshold tradeoff for generating the ROC_VALUES in roc.js (https://github.com/wingman-jr-addon/wingman_jr/blob/master/roc.js). Note that there is a .NET tool that does the basic transformation into JSON and also decimates the number of outputs.
  7. A script that fuses together the SQRX and the R part of the models into the final SQRXR model.
  8. At this point the .h5 Keras model is transformed into the graph-optimized TF.js model using the basic command line usage of the public TF.js converter tool, e.g. tensorflowjs_converter --input_format keras --output_format tfjs_graph_model sqrxr_plus_best.h5 sqrxr_109b_graphopt
  9. The same web server that provides the image ingestion in step 3 also has a mode to view a subset of the dataset along with the true ratings. This can be used in conjunction with the addon to verify that the new model and the true ratings are in agreement.
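
To make the pre-training check in step 4 a bit more concrete, here is a minimal sketch of what such a validation pass could look like. This is not the actual script, which is not released; the function names (`tf_decode`, `find_unparseable`, `write_blacklist`) and the one-path-per-row blacklist format are assumptions made for illustration. It assumes TensorFlow 2.x running eagerly, and uses `tf.io.read_file` plus `tf.io.decode_image` as one plausible way to "open" an image.

```python
import csv
from typing import Callable, Iterable, List


def tf_decode(path: str) -> None:
    """Try to decode one image with TensorFlow; raises if it cannot be parsed."""
    # Imported lazily so the helpers below can be exercised without TensorFlow.
    import tensorflow as tf

    raw = tf.io.read_file(path)
    # expand_animations=False keeps GIFs as a single 3-D frame.
    tf.io.decode_image(raw, channels=3, expand_animations=False)


def find_unparseable(
    paths: Iterable[str],
    decode: Callable[[str], None] = tf_decode,
) -> List[str]:
    """Return the paths for which `decode` raised, in input order."""
    bad = []
    for path in paths:
        try:
            decode(path)
        except Exception:
            # Any failure (missing file, corrupt data, unsupported format)
            # is treated as a reason to exclude the image from training.
            bad.append(path)
    return bad


def write_blacklist(bad_paths: Iterable[str], out_file: str) -> None:
    """Write one bad path per row (hypothetical blacklist format)."""
    with open(out_file, "w", newline="") as f:
        writer = csv.writer(f)
        for path in bad_paths:
            writer.writerow([path])
```

A training script could then load the blacklist and skip those paths up front, so that a single corrupt file does not halt a long run partway through.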

There are a bunch of other more specialized tools but these are probably the ones of primary interest. I also did some work on CAM but had some difficulties.
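
As an illustration of the CSV-to-JSON transformation and decimation mentioned in step 6, here is a rough sketch in Python rather than .NET. The real tool and the actual layout of the CSV and of ROC_VALUES are not shown in this thread, so the column names (`threshold`, `tpr`, `fpr`), the function names, and the "keep every Nth row" strategy are all assumptions.

```python
import csv
import io
import json
from typing import List


def decimate(rows: List, keep_every: int) -> List:
    """Keep every `keep_every`-th row, always retaining the final row."""
    if keep_every < 1:
        raise ValueError("keep_every must be >= 1")
    kept = rows[::keep_every]
    # Make sure the end of the curve is not dropped by the stride.
    if rows and (len(rows) - 1) % keep_every != 0:
        kept.append(rows[-1])
    return kept


def roc_csv_to_json(csv_text: str, keep_every: int = 10) -> str:
    """Parse a CSV with a header row of numeric columns into decimated JSON."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{key: float(value) for key, value in row.items()} for row in reader]
    return json.dumps(decimate(rows, keep_every))


# Hypothetical input: three threshold rows, keeping every second one.
example_csv = "threshold,tpr,fpr\n0.1,0.9,0.5\n0.2,0.8,0.4\n0.3,0.7,0.3\n"
example_json = roc_csv_to_json(example_csv, keep_every=2)
```

The resulting JSON string would still need to be pasted or templated into roc.js in whatever shape the addon expects.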

Are there any of these parts that are of interest?