(Closed: vladmandic closed this issue 3 years ago)
Hi Vlad, sorry for the late reply.
> model works in `tfjs` in nodejs and browser using `webgl` like a charm using `tfjs 2.6.0`!
This is awesome!
> checkpoint is the training version and references python variables used in model definition
The default checkpoint and tensorflow saved models (provided in the latest versions) are the inference checkpoints. If possible, can you expand on this?
> any chance you can also do a compiled version?
Can you explain a little bit what a compiled version is (or share any links/docs)? I might be able to help.
> - model is very picky about input image resolution
> - performance is pretty low compared to any other object detection model out there, by 2-5x? any thoughts?
> - model is very memory hungry - it can easily eat up 2gb of gpu memory to process an image with 1k resolution
The provided model is trained on images with min_side=800 and max_side=1333 (side can be width or height). Since my focus was more on generating the data part, I used the defaults of keras-retinanet.
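That min_side/max_side rule can be sketched as follows (a paraphrase of keras-retinanet's resize logic, not the library's actual code):

```python
# Paraphrase of keras-retinanet's resize rule (illustrative, not the
# library's code): scale the shorter side up to min_side, but cap the
# longer side at max_side.
def compute_resize_scale(height, width, min_side=800, max_side=1333):
    smallest, largest = min(height, width), max(height, width)
    scale = min_side / smallest
    if largest * scale > max_side:  # long side would exceed the cap
        scale = max_side / largest
    return scale

# 500x1000: 800/500 = 1.6 would push the long side to 1600 > 1333,
# so the cap wins and the scale becomes 1333/1000 = 1.333
print(compute_resize_scale(500, 1000))  # 1.333
```

This explains the resolution sensitivity: images far outside this range end up scaled well away from what the anchors were tuned for.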
Training a much smaller and faster object detector (yolov4 tiny, ultralytics yolov5 small) on this data is on my to do list (may be in 5-6 weeks depending on my other responsibilities).
thanks for the comments
> Can you explain a little bit what a compiled version is (or share any links/docs)? I might be able to help.
since you're loading the model using `tf.contrib.predictor`, i assume it's created using the `estimator` class? and i have zero experience using `estimator` as it's before my time (the entire `contrib` namespace is obsolete in tensorflow v2 and i've only been using tf for the past few months). but from what i see, your saved model is just definitions, with all the trained data in the checkpoint, stored in variables (inside `variables/variables.data-00000-of-00001`).
goal is to get to a static `saved_model.pb` as a single file that contains all pretrained weights as constants. no clue how.
maybe this? https://www.tensorflow.org/api_docs/python/tf/saved_model/load has a chapter on estimators.
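For reference, loading and inspecting a SavedModel's serving signatures from TF2 looks roughly like this (a tiny `tf.Module` stands in for the real estimator export; directory and names are illustrative):

```python
# Sketch: inspect a SavedModel's serving signatures from TF2.
# A tiny tf.Module stands in for the real estimator export.
import tempfile

import tensorflow as tf

class Demo(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None, 3], tf.float32)])
    def serve(self, x):
        return {"out": x * 2.0}

demo = Demo()
export_dir = tempfile.mkdtemp()
tf.saved_model.save(demo, export_dir,
                    signatures={"serving_default": demo.serve})

loaded = tf.saved_model.load(export_dir)
infer = loaded.signatures["serving_default"]
print(sorted(infer.structured_outputs))  # ['out']
```

Estimator exports typically expose a `serving_default` signature the same way, so this is a quick way to see what inputs/outputs the saved graph expects.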
again, i'm just guessing since i've never worked with `estimator` or `predictor`. all i know is that the resulting checkpoints contain variable references - that is good for training, but less than ideal for running inference in production.
> The provided model is trained on images with min_side=800 and max_side=1333 (side can be width or height).
that explains my findings :)
> Training a much smaller and faster object detector (yolov4 tiny, ultralytics yolov5 small) on this data is on my to do list (may be in 5-6 weeks depending on my other responsibilities).
nice!
perhaps you'd want to take a look at CenterNet?
It's not as small as YoloV4-Tiny, but it's damn fast (by far the fastest of all non-trivial models) and very flexible
it's becoming my go-to for any kind of object detection tasks
btw, i've created a simple gist that uses `tfjs-node` to showcase the nudenet model (both `saved_model` and `graph_model`, as well as quick blurring of nude parts):
https://gist.github.com/vladmandic/f79c80f83a35d01d9e2df072cf426254
> since you're loading the model using `tf.contrib.predictor`, i assume it's created using the `estimator` class?
The saved model was created from the keras checkpoint at https://github.com/notAI-tech/NudeNet/releases/download/v0/detector_v2_default_checkpoint
Although I haven't tried it out, this repo shows how to convert the checkpoint to tfjs format: https://github.com/faustomorales/retinanetjs
You might also be able to export the keras checkpoint to a single saved_model.pb.
> perhaps you'd want to take a look at CenterNet?
This is interesting. There also seems to be a CenterNet implementation that will work with my existing training scripts with minimal changes (https://github.com/xuannianz/keras-CenterNet). Are there any other implementations of CenterNet you recommend?
> The saved model was created from the keras checkpoint at https://github.com/notAI-tech/NudeNet/releases/download/v0/detector_v2_default_checkpoint
> Although I haven't tried it out, this repo shows how to convert the checkpoint to tfjs format: https://github.com/faustomorales/retinanetjs
that procedure creates a layers model with a fixed input size - good for classification models, not so good for detection models
maybe section "converting a training model to inference model" from https://github.com/fizyr/keras-retinanet can be used?
that script works with the keras model format (h5), but it should be ok to switch to the saved_model format (pb)
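For context, keras-retinanet's training-to-inference conversion mainly appends box decoding and non-maximum suppression (NMS) to the network. A minimal pure-Python NMS, just to illustrate the kind of post-processing the inference graph adds (not the library's implementation):

```python
# Minimal NMS sketch: keep the highest-scoring boxes, dropping any box
# that overlaps an already-kept box too much. Boxes are (x1, y1, x2, y2).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] - the overlapping lower-score box is dropped
```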
> This is interesting. There also seems to be a CenterNet implementation that will work with my existing training scripts with minimal changes (https://github.com/xuannianz/keras-CenterNet).
> Are there any other implementations of CenterNet you recommend?
i've been using the tensorflow-ported version https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
specifically centernet on a resnet50v2 backbone - it's almost as good as the resnet101v1 backbone, but smaller and faster
key difference is that it's tpu optimized, although there is still one compatibility issue https://github.com/tensorflow/tfjs/issues/4133 and it requires tfjs 2.6.0 due to a variable-shape matmul implementation not present in earlier versions
> maybe section "converting a training model to inference model" from https://github.com/fizyr/keras-retinanet can be used?
The checkpoint at https://github.com/notAI-tech/NudeNet/releases/download/v0/detector_v2_default_checkpoint is exported to the inference format.
Thanks! I will use this.
hmm, i don't understand, i'll dig more.
what i'm talking about is the output of the converter: when converting your `saved_model` to a tfjs `graph_model`, it lists this:
```
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable_1:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable_2:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable_3:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable_4:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable_1:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'Variable_2:0' shape=(9, 4) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
...
```
(ignore the incorrect variable names - it's an open issue with the converter that it mangles them, as well as node names)
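The fix those warnings suggest can be sketched as follows; only the toggle itself is shown, since the actual re-export would depend on the original estimator's tags and paths:

```python
# Sketch of the warnings' suggested fix: enable TF2-style resource
# variables before rebuilding/re-exporting the SavedModel, so the graph
# no longer contains old-style reference variables. The re-export of the
# original estimator SavedModel is omitted (tags/paths depend on the
# original export).
import tensorflow.compat.v1 as tf

tf.enable_resource_variables()  # must run before the graph is (re)built
print(tf.resource_variables_enabled())  # True
```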
re: training vs inference model - it might be as simple as running a `freeze` before saving the model?
this is useful: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md and https://towardsdatascience.com/freezing-a-keras-model-c2e26cb84a38
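A minimal sketch of that freezing step, using TF2's `convert_variables_to_constants_v2` on a tiny stand-in model (the real detector would be loaded from its SavedModel instead; all names here are illustrative):

```python
# Sketch: "freezing" folds a model's variables into graph constants,
# producing a single self-contained graph.
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# tiny stand-in model; the real detector would be loaded from its SavedModel
model = tf.keras.Sequential([tf.keras.layers.Dense(4)])

func = tf.function(lambda x: model(x)).get_concrete_function(
    tf.TensorSpec([None, 8], tf.float32))
frozen = convert_variables_to_constants_v2(func)

# after freezing, no variable ops remain; weights are graph constants
var_ops = [n.op for n in frozen.graph.as_graph_def().node
           if n.op in ("VarHandleOp", "ReadVariableOp")]
print(var_ops)  # []
```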
fyi, with a few quick questions:
i've downloaded the checkpoint as noted in `detector.py` and converted it to tfjs `graph_model` format using `tensorflowjs_converter --strip_debug_ops=* --control_flow_v2=* --quantize_float16=* saved/ f16/` (quantized to float16 to reduce size by half)
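The size reduction from `--quantize_float16` is just the 4-byte to 2-byte storage change per weight, at a small precision cost; for example:

```python
# Why float16 quantization halves the download: each stored weight drops
# from 4 bytes (float32) to 2 bytes (float16).
import numpy as np

weights32 = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
weights16 = weights32.astype(np.float16)  # what the quantized model stores

print(weights32.nbytes, weights16.nbytes)  # 4000 2000
# round-trip error is tiny relative to typical weight magnitudes
print(float(np.abs(weights32 - weights16.astype(np.float32)).max()) < 0.01)  # True
```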
model works in `tfjs` in nodejs and browser using `webgl` like a charm using `tfjs 2.6.0`!

few comments:
any chance you can also do a compiled version? it should significantly help with size and speed. i can probably do it as well, but i'd think you'd want to release a compiled version for usage and only use the dev version for training. any thoughts on that?

seems like i get best results if i resize the image before inference to a range around 800-1000px. anything smaller than 700px and it misses things badly, and anything bigger than 1100px gets a lot of false positives and unfortunately quickly leads to out-of-memory situations due to the generally bad behavior of browser garbage collection of `webgl` objects.

this is by far the most advanced nsfw model i've seen - if it weren't for a few issues (performance, memory, resolution sensitivity), it would be perfect!
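One way to implement the resize-before-inference heuristic described above (pure bookkeeping; the actual resampling would be done by whatever image library is in use, and the 800-1000px window is just the sweet spot reported here):

```python
# Sketch of the pre-inference resize heuristic: scale the image so its
# longer side lands inside a target window (~800-1000px), leaving
# already-in-range images untouched.
def target_size(width, height, lo=800, hi=1000):
    longest = max(width, height)
    if lo <= longest <= hi:
        return width, height  # already in the sweet spot
    scale = (hi if longest > hi else lo) / longest
    return round(width * scale), round(height * scale)

print(target_size(3000, 2000))  # (1000, 667) - shrunk to the window
print(target_size(640, 480))    # (800, 600)  - upscaled to the window
```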