maxbbraun / thermal-face

Fast face detection in thermal images
MIT License

Need help with the outputs of the uncompiled model #6

Closed IvanJuraga closed 4 years ago

IvanJuraga commented 4 years ago

Greetings,

I have downloaded the uncompiled model and tried to run it with the script in this link, but for some reason I am getting a strange output.

There are 4 outputs with the shapes ([1, 500, 4], [1, 500], [1, 500], [1, 500]), and I don't understand why, since there is just one person in the picture.

(attached images: probica, L-1420)

These are the two pictures I tried the model on. If you can provide a testing script for TFLite I would be grateful, or you could just explain how the output shapes are constructed.

Thank you very much for your time.

maxbbraun commented 4 years ago

This code might be instructive.

maxbbraun commented 4 years ago

I also added the model metadata, which includes the output tensor to representation mapping:
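For anyone else hitting this: the four tensors match the usual TFLite detection-postprocess layout (bounding boxes, class indices, scores; confirm the exact order against the metadata above). All 500 slots are always populated, and most hold low-confidence candidates, so you filter by score. A minimal sketch, assuming the boxes/classes/scores ordering and a hypothetical 0.5 threshold:

```python
import numpy as np

def parse_detections(boxes, classes, scores, score_threshold=0.5):
    """Filter the fixed-size (500-slot) detection outputs down to
    confident detections. The tensor ordering is an assumption based
    on the common TFLite detection-postprocess layout; check the
    model metadata for the actual mapping."""
    boxes = np.asarray(boxes)[0]      # (500, 4): [ymin, xmin, ymax, xmax]
    classes = np.asarray(classes)[0]  # (500,)
    scores = np.asarray(scores)[0]    # (500,)
    keep = scores >= score_threshold
    return boxes[keep], classes[keep], scores[keep]

# Dummy data: one confident face among the 500 slots.
boxes = np.zeros((1, 500, 4), dtype=np.float32)
classes = np.zeros((1, 500), dtype=np.float32)
scores = np.zeros((1, 500), dtype=np.float32)
boxes[0, 0] = [0.1, 0.2, 0.5, 0.6]
scores[0, 0] = 0.9
kept_boxes, kept_classes, kept_scores = parse_detections(boxes, classes, scores)
print(len(kept_boxes))  # -> 1
```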

IvanJuraga commented 4 years ago

Hello Max,

First of all, I wanted to thank you for the time you invested in trying to solve my problem. Unfortunately, I am still struggling to get a good prediction. I load the image using cv2's imread, convert it to RGB, resize it to (192, 192, 3), and then apply `rgb_img = (rgb_img - 127.5) * 0.0078125`. After that, I apply the std and mean from the interpreter's quantization info. I also load the uncompiled model with the interpreter and feed it the tensor to get the prediction, but every time I get the maximum number of outputs (500), and when I try to display them they don't work for me.

I already described how I load the model, but if necessary I can do it again. Any help would be great.

maxbbraun commented 4 years ago

Hm, have you tried just feeding in the image directly, like this?
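(One common pitfall with quantized TFLite models, which may be what's happening here: a uint8 input tensor wants raw pixel values, while only a float input wants the (x - 127.5) * 0.0078125 scaling; applying both double-processes the image. A minimal sketch of dtype-driven preprocessing; the helper name is hypothetical:)

```python
import numpy as np

def preprocess_for_input(img, input_dtype):
    """Prepare an image for a TFLite interpreter input tensor.
    A quantized (uint8) input takes raw pixel values; only a float
    input wants the (x - 127.5) * 0.0078125 normalization. Doing
    both double-processes the image."""
    img = np.asarray(img)
    if np.issubdtype(np.dtype(input_dtype), np.floating):
        return (img.astype(np.float32) - 127.5) * 0.0078125
    return img.astype(np.uint8)

# The input dtype comes from interpreter.get_input_details()[0]['dtype'].
pixels = np.full((192, 192, 3), 255, dtype=np.uint8)
as_float = preprocess_for_input(pixels, np.float32)  # scaled to ~[-1, 1]
as_uint8 = preprocess_for_input(pixels, np.uint8)    # raw pixels, untouched
```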

Also, if an image is encoded with a color gradient like your first example above, then you'll probably have to reverse that encoding first and turn it into one channel.
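Reversing such an encoding requires knowing the colormap that produced it. One approach (a sketch only; the blue-to-red ramp below is a stand-in for whatever colormap the camera software actually applied) is a nearest-neighbor lookup against the colormap's LUT:

```python
import numpy as np

def make_ramp_lut(levels=256):
    """Stand-in LUT for the (unknown) colormap the thermal software
    applied: a simple blue -> red ramp. Replace with the real one."""
    t = np.linspace(0.0, 1.0, levels)
    return np.stack([t, np.zeros_like(t), 1.0 - t], axis=1)  # (levels, 3)

def invert_colormap(rgb_image, lut):
    """Map each RGB pixel back to the index of its nearest LUT entry,
    recovering an approximate single-channel intensity image."""
    pixels = rgb_image.reshape(-1, 3).astype(np.float32)
    if pixels.max() > 1.0:
        pixels = pixels / 255.0
    # Squared distance from every pixel to every LUT entry.
    dists = ((pixels[:, None, :] - lut[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1).reshape(rgb_image.shape[:2]).astype(np.uint8)

# Round trip: encode a gray gradient with the LUT, then invert it.
lut = make_ramp_lut()
gray = np.tile(np.arange(256, dtype=np.uint8), (4, 1))   # (4, 256)
encoded = np.rint(lut[gray] * 255).astype(np.uint8)      # (4, 256, 3)
recovered = invert_colormap(encoded, lut)
```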

IvanJuraga commented 4 years ago

But in your example the pictures are loaded in 'RGB' format, so I didn't know I had to convert them to a single channel. Also, to maybe make my life a bit easier: is there any way you could share a short script that loads the `thermal_face_automl_edge_fast.tflite` model and runs it on a sample picture (which you would share as well), so I could try to replicate the result? That way I would understand it much faster and more easily.

As always, thank you for your time.

Ivan Juraga

maxbbraun commented 4 years ago

The face detector is based on a generic RGB detection model, so it takes three color channels as input. The training set consists of color images mixed with monochrome thermal images. This means that for inference on thermal images you should expect best results when feeding the model with monochrome images converted to RGB by repeating the channel three times. That's what the .convert('RGB') call does in the inference script.
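In NumPy terms, that conversion is just repeating the single channel three times (a sketch of what `.convert('RGB')` does for a grayscale image):

```python
import numpy as np

def gray_to_rgb(gray):
    """Replicate one thermal channel into three identical channels,
    matching what PIL's Image.convert('RGB') does for 'L'-mode input."""
    gray = np.asarray(gray)
    return np.repeat(gray[:, :, None], 3, axis=2)

thermal = np.random.randint(0, 256, size=(192, 192), dtype=np.uint8)
rgb = gray_to_rgb(thermal)
print(rgb.shape)  # (192, 192, 3)
```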

IvanJuraga commented 4 years ago

Hello Max, I managed to run the model, but in the end I needed to run the uncompiled model on the Edge TPU to get a good output, so I would advise you to check whether the models were switched by accident. In the end I got my hands on a Coral and ran the code snippet from the README using the uncompiled model.

Thank you once again for your time and the advice you gave me.

maxbbraun commented 4 years ago

Sounds good! And be sure to use the latest version of the compiled model. You may be running into issue #2, which has since been fixed.