hollance / YOLO-CoreML-MPSNNGraph

Tiny YOLO for iOS implemented using CoreML but also using the new MPS graph API.
MIT License
935 stars, 252 forks

Showing only one bounding box per frame (coreml) #49

Open 1omarsaid opened 5 years ago

1omarsaid commented 5 years ago

I am trying to find a way to show only one bounding box per frame, meaning that I don't want it to detect three "persons" in one frame. In the view controller I changed maxInflightBuffers to 1, but I am not sure whether that is the way to go. The reason is that I play an audio file when a certain bounding box appears, but if there is more than one, it won't play.

hollance commented 5 years ago

That is unrelated to the maxInflightBuffers. You need to change the nonMaxSuppression code to sort the bounding boxes by confidence and then pick the one with the highest confidence.

1omarsaid commented 5 years ago

How can I do that? So far I only have it showing the bounding boxes with a prediction score above 50%.

1omarsaid commented 5 years ago

@hollance I would really appreciate it if you could give me some guidance on implementing this. Thank you in advance.

hollance commented 5 years ago

There’s not much to it: when you get the bounding boxes, sort them by confidence score and keep the one with the highest score.
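
A minimal sketch of that idea follows. The `Prediction` struct and the `bestPrediction(in:minimumScore:)` helper are hypothetical stand-ins made up for illustration, not the project's actual types, and the 0.5 cutoff simply mirrors the 50% threshold mentioned above.

```swift
// Hypothetical stand-in for the project's prediction type (an assumption).
struct Prediction {
  let classIndex: Int
  let score: Float  // confidence of the detection
}

/// Returns the single most confident prediction at or above `minimumScore`,
/// or nil when no prediction qualifies.
func bestPrediction(in predictions: [Prediction],
                    minimumScore: Float = 0.5) -> Prediction? {
  return predictions
    .filter { $0.score >= minimumScore }
    .max(by: { $0.score < $1.score })  // largest score wins
}
```

Since only one box is needed, taking the maximum gives the same result as sorting and keeping the first element. The caller then shows that one box (if any) and hides all the others.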

1omarsaid commented 5 years ago

What if I just do something like `return nonMaxSuppression(boxes: predictions, limit: 1, threshold: iouThreshold)`? Would this be wrong?

hollance commented 5 years ago

That's fine but a little bit slower than my previous suggestion.
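
To see why `limit: 1` ends up returning the top-scoring box, here is a rough sketch of greedy non-maximum suppression. It assumes the usual greedy scheme and is not a copy of the project's `nonMaxSuppression`; `ScoredBox`, `iou`, and `greedyNMS` are hypothetical names used only for this example.

```swift
// Hypothetical box type; the project's own prediction type may differ.
struct ScoredBox {
  let x: Float, y: Float, width: Float, height: Float
  let score: Float
}

/// Intersection-over-union of two axis-aligned boxes.
func iou(_ a: ScoredBox, _ b: ScoredBox) -> Float {
  let interW = max(0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
  let interH = max(0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
  let inter = interW * interH
  let union = a.width * a.height + b.width * b.height - inter
  return union > 0 ? inter / union : 0
}

/// Greedy NMS sketch: visit boxes in descending score order, keep a box only
/// if it overlaps no already-kept box by more than `threshold`, and stop once
/// `limit` boxes have been kept.
func greedyNMS(_ boxes: [ScoredBox], limit: Int, threshold: Float) -> [ScoredBox] {
  var kept: [ScoredBox] = []
  for box in boxes.sorted(by: { $0.score > $1.score }) {
    if kept.count >= limit { break }
    if kept.allSatisfy({ iou($0, box) <= threshold }) {
      kept.append(box)
    }
  }
  return kept
}
```

In this sketch, a limit of 1 means the first box in sorted order is kept and the loop stops, so the overlap threshold never comes into play.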

1omarsaid commented 5 years ago

How would I be able to reduce the number of bounding boxes (per second) without reducing the "Desired frame rate"?

hollance commented 5 years ago

You would have to change the model for that. It always predicts the same number of bounding boxes in every frame.

1omarsaid commented 5 years ago

Sorry, what I meant was: how can I have a tiny delay between each bounding box? If I want to associate a bounding box with a half-second audio clip, the clip shouldn't get cut off because a new bounding box appears and also triggers the audio file.

1omarsaid commented 5 years ago

Do you know if there is a way to do that? I tried adding delays before the "show" function is triggered, but it doesn't seem to work.

hollance commented 5 years ago

Every new video frame gives completely new predictions. There is no way to track an object across multiple frames with YOLO. So I’m not really sure what you’re trying to accomplish but it sounds like it needs more than just bounding box detection.
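
If the goal is only to stop a half-second clip from being retriggered, one possible approach that needs no tracking is to rate-limit the trigger itself. The sketch below is an assumption about what might fit this use case; `TriggerGate` and `tryFire(at:)` are hypothetical names, and the timestamps are plain seconds supplied by the caller.

```swift
/// Allows an action to fire at most once per `cooldown` seconds.
struct TriggerGate {
  let cooldown: Double
  private var lastFired: Double? = nil

  init(cooldown: Double) {
    self.cooldown = cooldown
  }

  /// `now` is a timestamp in seconds from any steadily increasing clock.
  /// Returns true if the action may run, and records the time it fired.
  mutating func tryFire(at now: Double) -> Bool {
    if let last = lastFired, now - last < cooldown {
      return false  // still inside the cooldown window
    }
    lastFired = now
    return true
  }
}
```

With a gate created as `TriggerGate(cooldown: 0.5)`, the audio would be started only when `tryFire(at:)` returns true, for example with `CACurrentMediaTime()` or the frame's timestamp as the clock. Detections arriving during the cooldown are still drawn but do not restart the sound.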

1omarsaid commented 5 years ago

I see, so there isn't a way to get predictions only on every other frame?

hollance commented 5 years ago

If you want predictions every other frame, then simply keep a counter and only do a prediction when the counter is even.
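
The counter idea can be sketched as below, generalized from "every other frame" to every n-th frame. `FrameSkipper` and `shouldPredict()` are hypothetical names for illustration. Where the counter lives is also an assumption: it could be a property of whatever object receives the video frames, such as the view controller.

```swift
/// Lets every `interval`-th frame through and skips the rest.
struct FrameSkipper {
  let interval: Int
  private var frameCount = 0

  init(interval: Int) {
    precondition(interval > 0, "interval must be positive")
    self.interval = interval
  }

  /// Call once per captured frame. Returns true when this frame should be
  /// passed to the model, false when it should be skipped.
  mutating func shouldPredict() -> Bool {
    let run = (frameCount == 0)
    frameCount = (frameCount + 1) % interval  // wrap so the count never grows
    return run
  }
}
```

In the callback that receives each frame, a line such as `guard skipper.shouldPredict() else { return }` placed before the prediction call would skip the unwanted frames. An interval of 2 matches the "counter is even" case, and an interval of 1 predicts on every frame.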

1omarsaid commented 5 years ago

Would the counter be in the view controller? Could you give me some tips on how to implement that? And what if I want it to make a prediction every n frames?

1omarsaid commented 5 years ago

Which function does the actual prediction? I could wrap it in an if statement.

1omarsaid commented 5 years ago

I created an if statement around

```swift
      } else {
        boundingBoxes[i].hide()
      }
```

It seemed to do the job.