Syn-McJ / TFClassify-Unity-Barracuda

An example of using TensorFlow and ONNX models with the Unity Barracuda inference engine for image classification and object detection.

Input shape mismatch when converting another Keras model to ONNX #4

Closed · artiransaygin closed this issue 4 years ago

artiransaygin commented 4 years ago

Hi, first of all, great project, really useful. I am trying to do face detection specifically instead of object detection, so I need an ONNX tiny YOLO model trained for this task. I was able to find a pre-trained Keras model, so it's in .h5 format. I saw that you converted the tiny YOLO model to ONNX with a tool called OnnxMLTools. I am using keras2onnx, but the input's shape turns out to be (1, 416, 3, 416) instead of (1, 416, 416, 3), and I believe this creates a problem. Can you please explain how to use OnnxMLTools? I wasn't able to figure it out after reading the readme file.
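
(For reference, a minimal conversion sketch; onnxmltools is assumed to be installed, and `face_yolo.h5` is a hypothetical stand-in for the pre-trained file. onnxmltools appears to delegate Keras conversion to keras2onnx internally, so both tools may well produce the same input layout.)

```python
# Minimal sketch: convert a Keras .h5 model to ONNX with onnxmltools.
# "face_yolo.h5" / "face_yolo.onnx" are hypothetical file names.
import onnxmltools
from keras.models import load_model

keras_model = load_model("face_yolo.h5")

# onnxmltools hands Keras models to keras2onnx under the hood,
# so any layout quirk from keras2onnx would likely show up here too.
onnx_model = onnxmltools.convert_keras(keras_model, target_opset=9)

# Inspect the declared input shape before dropping the file into Unity.
print(onnx_model.graph.input)

onnxmltools.utils.save_model(onnx_model, "face_yolo.onnx")
```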

Syn-McJ commented 4 years ago

@artiransaygin I don't think I was using OnnxMLTools myself; I found an ONNX model here that was seemingly converted that way.

I myself use darkflow for training YOLO models; it seems to work fine.

As for your input shape (1, 416, 3, 416) - that's a weird shape, but all it means is that you'll need a different order of pixels in the data you provide. So if in my TransformInput method I go row by row and order pixels like this: rgbrgbrgb..., then you'll probably need to do rrrr...gggg...bbbb for each row, as sketched below. However, this is an error-prone process and you'll likely spend some time trying to make it work, so I would instead try to train a model with the normal input shape.
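
(In NumPy terms, for illustration - the actual TransformInput method is C#, but the index math is the same - the (1, 416, 3, 416) layout is N, H, C, W, i.e. a swap of the last two NHWC axes:)

```python
import numpy as np

# Hypothetical NHWC frame, laid out rgbrgbrgb... within each row.
frame_nhwc = np.random.rand(1, 416, 416, 3).astype(np.float32)

# (1, 416, 3, 416) is N, H, C, W: within each row, all R values come first,
# then all G, then all B (rrrr...gggg...bbbb) - a swap of the last two axes.
frame_nhcw = frame_nhwc.transpose(0, 1, 3, 2)
assert frame_nhcw.shape == (1, 416, 3, 416)
```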

artiransaygin commented 4 years ago

@Syn-McJ Thanks for the explanation. I agree that trying to adjust the code to make it compatible with that input shape would be really tricky. At the moment I am training the model on the WIDER FACE training dataset. I might ask for your help again if needed. Thanks a lot.

artiransaygin commented 4 years ago

@Syn-McJ I trained the model and I'm finally getting some detections in Unity. The problem is that the bounding boxes look a bit shifted and the confidences are too low, usually around 0.25. I am using my laptop's webcam rather than an Android phone; do you think that might be a reason for the shifts? Also, how did you pick your anchors, and should I try changing them? Finally, can I use a full YOLO model rather than a tiny YOLO model with your project? Huge thanks!
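
(For context on the anchors: in the standard tiny YOLO v2 decoding, sketched below, the anchors directly scale the predicted box width and height, so anchors that don't match the training setup will shift and resize boxes. The names and the grid-cell anchor units here are assumptions, not the project's exact code.)

```python
import numpy as np

def decode_box(t, cell_x, cell_y, anchor_w, anchor_h, grid=13):
    """Decode one raw YOLOv2 prediction t = (tx, ty, tw, th) into a
    normalized box. Shifted boxes usually mean the cell offsets or the
    anchor scaling here don't match how the model was trained."""
    tx, ty, tw, th = t
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    bx = (cell_x + sigmoid(tx)) / grid   # box center x in 0..1
    by = (cell_y + sigmoid(ty)) / grid   # box center y in 0..1
    bw = anchor_w * np.exp(tw) / grid    # width, scaled by the anchor
    bh = anchor_h * np.exp(th) / grid    # height, scaled by the anchor
    return bx, by, bw, bh
```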

Syn-McJ commented 4 years ago

@artiransaygin Glad you made it work.

The full YOLO model should work just as well, but there might be problems with high memory usage on low-RAM devices.

I haven't tested the example with a laptop webcam, only on devices, and the boundaries were fine on my Android and iOS devices. If you have problems with the boundaries, you could tap into the raw percentage results that the model returns and see why the mismatch is happening.
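
(One way to tap into the raw results outside Unity is to run the same .onnx file through onnxruntime and compare its output with what the in-Unity decoding produces. A sketch, assuming onnxruntime is installed; "face_yolo.onnx" and the example output shape are placeholders:)

```python
import numpy as np
import onnxruntime as ort

# Run the same .onnx file outside Unity to inspect the raw model output.
# "face_yolo.onnx" is a hypothetical file name.
sess = ort.InferenceSession("face_yolo.onnx")
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)  # confirm the declared input layout first

dummy = np.random.rand(1, 416, 416, 3).astype(np.float32)  # match inp.shape
raw = sess.run(None, {inp.name: dummy})[0]
print(raw.shape)            # e.g. (1, 13, 13, 30) for tiny YOLO with one class
print(raw.min(), raw.max())
```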

Confidence was generally low, yes; I'm not sure of the exact reason. It will require more investigation, which I don't have enough time for at the moment. Besides, the example doesn't even work with the latest plugin version, which might point to a bigger problem either in my code or in Barracuda.