Closed: vvkv closed this issue 6 years ago
I'm not sure where this goes wrong or why, but an easy fix is to flip the x coordinate of the bounding boxes as well: box.x = screenWidth - box.x.
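In rough Swift terms (a sketch; `screenWidth` and these helper names are placeholders, not names from the repo), and note that a CGRect whose origin is the left edge also needs the box width subtracted:

```swift
import CoreGraphics

// Sketch: mirror a detection horizontally.
// For a center-x coordinate (YOLO's raw output), reflecting across the
// width is enough:
func mirroredCenterX(_ centerX: CGFloat, width: CGFloat) -> CGFloat {
    return width - centerX
}

// For a CGRect whose origin.x is the left edge, the box width matters too:
func mirrored(_ rect: CGRect, width: CGFloat) -> CGRect {
    var r = rect
    r.origin.x = width - rect.origin.x - rect.width
    return r
}
```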
That sounds like a neat solution; it would be great if you could advise on where this code modification should be made. I went through most of the .swift files but wasn't able to find a "box.x" variable. Thank you!
On this line, https://github.com/hollance/YOLO-CoreML-MPSNNGraph/blob/master/TinyYOLO-CoreML/TinyYOLO-CoreML/YOLO.swift#L94
change it to:
let x = blockSize - ((Float(cx) + sigmoid(tx)) * blockSize)
This happens because the image from the front camera is mirrored. There are other solutions to this, but mirroring the x coordinate of the bounding box is probably easiest (and fastest).
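One of those other solutions, sketched here as an assumption rather than code from this repo: mirror the frames at capture time on the video data output's connection, so the model sees the same image as the (mirrored) preview. `videoOutput` stands for the `AVCaptureVideoDataOutput` set up in VideoCapture.swift:

```swift
// Sketch: mirror the captured buffers to match the front-camera preview.
// Not every configuration supports mirroring, so check first.
if let connection = videoOutput.connection(with: .video),
   connection.isVideoMirroringSupported {
    connection.automaticallyAdjustsVideoMirroring = false
    connection.isVideoMirrored = true
}
```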
I think you need to subtract from the full input width (416), not from the block size, in order to mirror. This line did it for me:
let x = 416 - ((Float(cx) + sigmoid(tx)) * blockSize)
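For context, this is how it lands in the decode loop of YOLO.swift (a sketch of the surrounding code: `cx`/`cy` walk the 13x13 grid and `blockSize` is 32, so 416 is the full network input width):

```swift
// Decode loop (sketch): centers land in 416x416 input coordinates.
let xUnmirrored = (Float(cx) + sigmoid(tx)) * blockSize  // the original line
let x = 416 - xUnmirrored        // reflect across the full input width
let y = (Float(cy) + sigmoid(ty)) * blockSize            // y is unaffected
```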
Thank you for the great work here. I am constantly experimenting with variations of your work for my own purposes, and for one of my applications I am trying to use the object detection through the front camera on my iPad, so that I can see what is being detected while I move in front of the screen. I have made the following change in VideoCapture.swift to capture the video stream through the front camera:
Original code chunk:
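Roughly the stock device selection in the repo's VideoCapture.swift (a sketch, not a verbatim quote):

```swift
// Stock setup (sketch): ask for the back wide-angle camera.
guard let captureDevice = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                  for: .video,
                                                  position: .back) else {
    print("Error: no video devices available")
    return false
}
```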
My changed chunk:
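The same call with the camera position flipped to the front (again a sketch):

```swift
// Changed setup (sketch): ask for the front camera instead.
guard let captureDevice = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                  for: .video,
                                                  position: .front) else {
    print("Error: no video devices available")
    return false
}
```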
All works well and good except that the bounding boxes are mirrored: if I am holding an apple in my left hand and a banana in my right hand, the boxes are swapped around the objects. I understand that this might have something to do with the front camera mirroring the video display, but I am not sure how to tackle this issue. Any guidance will be very useful. Thank you!