Yolo Output grid (1,13,13,30) instead of (-1,-1,-1,-1)

Syn-McJ / TFClassify-Unity-Barracuda

An example of using TensorFlow and ONNX models with the Unity Barracuda inference engine for image classification and object detection.

MIT License


Yolo Output grid (1,13,13,30) instead of (-1,-1,-1,-1) #3

Closed Pouyan97 closed 4 years ago

Pouyan97 commented 4 years ago

Hey @Syn-McJ, I'm just looking for some advice regarding my model. I tried making my own yolov2-tiny model (with 1 class) to deploy in Unity. I am not an expert on YOLO models. The output grid shape is different from the description on your website, and also different from the graph you have in your repository. I was wondering how you did the calculations, based on the graph you had, to get results. I looked at the Microsoft website to get more familiar with their procedure; however, the calculations done in Unity are a bit different.
As of now, the confidence values and results I am getting are NaN, which is most likely due to miscalculations. My whole model is different, but it works when I run the darknet detector source code on an image. Do you have any ideas or resources that could help me?

Syn-McJ commented 4 years ago

@Pouyan97, if your output is (1,13,13,30) then it seems OK. The YOLO output depth depends on the number of labels you have. You can check this post for some details on how it works. With 1 class in your model, the last dimension should be 5 x (5 + 1), which is 30: 5 anchor boxes, each with 4 box coordinates, 1 objectness score, and 1 class score.
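The arithmetic behind that last dimension can be sketched in Python with NumPy. This is only an illustration, not the repository's code: the function names are hypothetical, and it assumes the usual YOLOv2 layout of 5 anchors in NHWC order, with each anchor's values stored contiguously as (x, y, w, h, objectness, class scores). A converted model may order its channels differently. The tanh-based sigmoid is included because a naive exp-based sigmoid can overflow on large logits, which is one possible (not confirmed) source of NaN values in post-processing.

```python
import numpy as np


def yolo_v2_channels(num_classes, num_anchors=5):
    """Depth of a YOLOv2 output cell: anchors x (4 box + 1 objectness + classes)."""
    return num_anchors * (4 + 1 + num_classes)


def split_predictions(output, num_classes, num_anchors=5):
    """Split an NHWC output tensor into box, objectness and class parts.

    Assumption: channels are anchor-major, i.e. all values of anchor 0
    come first, then all values of anchor 1, and so on.
    """
    batch, rows, cols, channels = output.shape
    expected = yolo_v2_channels(num_classes, num_anchors)
    if channels != expected:
        raise ValueError(f"expected {expected} channels, got {channels}")
    preds = output.reshape(batch, rows, cols, num_anchors, 5 + num_classes)
    box_xywh = preds[..., 0:4]     # raw box offsets and sizes
    objectness = preds[..., 4]     # raw objectness logit
    class_scores = preds[..., 5:]  # raw class logits
    return box_xywh, objectness, class_scores


def sigmoid(x):
    """Sigmoid written with tanh so large inputs cannot overflow exp()."""
    return 0.5 * (1.0 + np.tanh(0.5 * np.asarray(x, dtype=np.float64)))


if __name__ == "__main__":
    print(yolo_v2_channels(num_classes=1))
    dummy = np.zeros((1, 13, 13, 30), dtype=np.float32)
    boxes, obj, cls = split_predictions(dummy, num_classes=1)
    print(boxes.shape, obj.shape, cls.shape)
```

With one class the depth is 30; with 20 classes (as in Pascal VOC) the same formula gives 125.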

(-1,-1,-1,-1) isn't a typical output shape, but some models report it. That doesn't necessarily mean they won't work; the real output shape will be different anyway.

I'm not an expert on YOLO myself, but I would recommend first running your model in Python with some well-tested inference code to make sure it's 100% correct. Also, I prefer to use darkflow to train tiny-yolo2 TensorFlow models.

Pouyan97 commented 4 years ago

@Syn-McJ, thank you. Right after posting the question, I realized that the 30 is the number of filters I set myself :)). The links are very helpful.

I have tested my model's weights; however, converting the model to ONNX might have caused some problems. I can try training with Darkflow for comparison. Thanks again.

artiransaygin commented 4 years ago

@Pouyan97 Have you solved the problem with your model? I am currently trying to use a tiny YOLO model trained for face detection, but I am not getting any results. Unity has also been giving two warnings: "adjusted_input8: Unknown type encountered while parsing layer adjusted_input8 of type Transpose. We replace by an identity layer." and "transpose_output: Unknown type encountered while parsing layer transpose_output of type Transpose. We replace by an identity layer.". I think these warnings might be the reason for my current problems. Any thoughts, or have you by any chance encountered this kind of warning? By the way, my input shape is (1, 416, 3, 416), so that might be an issue as well. I would really appreciate any thoughts or comments.
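A small NumPy sketch can help when reasoning about a shape like (1, 416, 3, 416). It is not a diagnosis of the model above, and the helper names are hypothetical. It shows the two standard image layouts, NHWC (1, 416, 416, 3) and NCHW (1, 3, 416, 416), and one way a (1, 416, 3, 416) shape can arise: applying the NCHW-to-NHWC permutation to a tensor that is already NHWC. Whether that is what happened here is an assumption that would need to be checked against the exported graph.

```python
import numpy as np


def nhwc_to_nchw(x):
    """(N, H, W, C) -> (N, C, H, W)."""
    return np.transpose(x, (0, 3, 1, 2))


def nchw_to_nhwc(x):
    """(N, C, H, W) -> (N, H, W, C)."""
    return np.transpose(x, (0, 2, 3, 1))


def describe_layout(shape, channels=3):
    """Guess the layout of a 4-D image tensor from where the channel axis sits."""
    if len(shape) != 4:
        return "not 4-D"
    if shape[1] == channels:
        return "NCHW"
    if shape[3] == channels:
        return "NHWC"
    return "unknown"


if __name__ == "__main__":
    image_nhwc = np.zeros((1, 416, 416, 3), dtype=np.float32)

    # The correct conversion for an NHWC image.
    print(nhwc_to_nchw(image_nhwc).shape)

    # The wrong permutation applied to the same tensor: the channel axis
    # ends up in the middle, which matches neither standard layout.
    wrong = nchw_to_nhwc(image_nhwc)
    print(wrong.shape, describe_layout(wrong.shape))
```

If a Transpose layer that the model relies on is replaced by an identity layer, the tensor reaching the next layer keeps its original axis order, so comparing the shapes before and after each Transpose in the original graph is a reasonable first check.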