Open lzzhangar opened 7 years ago
I spoke with Frank while I was at WWDC. Frank was the awesome German dude who gave the WWDC talk on Vision. He gave me some tips, mostly that the number of distinctive visual features in an object really matters for tracking, and that the rectangle around the object is important. In this demo, I just draw a square. In reality, you would want the user to be able to trace a rectangle around the object, as that would give the vision system a much better clue about what to follow.
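For anyone who wants to try the traced-rectangle idea, here is a minimal sketch of turning a rectangle drawn in view coordinates into the normalized rectangle Vision expects (values in 0...1, origin at the lower left). The function name `normalizedVisionRect` is hypothetical, and the sketch assumes the camera image fills the view exactly, with the same aspect ratio and orientation. If the preview is scaled or cropped, the conversion needs to account for that as well.

```swift
import Foundation

/// Converts a rectangle traced in view coordinates (origin top-left, points)
/// into a normalized rectangle with a lower-left origin, as Vision uses.
/// Assumption: the image fills the view exactly (no aspect-fill cropping).
/// Returns nil if the view is empty or the rectangle lies outside the view.
func normalizedVisionRect(for viewRect: CGRect, in viewSize: CGSize) -> CGRect? {
    guard viewSize.width > 0, viewSize.height > 0 else { return nil }

    // Clamp the traced rectangle to the bounds of the view.
    let minX = max(0, min(viewRect.minX, viewSize.width))
    let maxX = max(0, min(viewRect.maxX, viewSize.width))
    let minY = max(0, min(viewRect.minY, viewSize.height))
    let maxY = max(0, min(viewRect.maxY, viewSize.height))
    guard maxX > minX, maxY > minY else { return nil }

    // Normalize to 0...1 and flip the y-axis: the bottom edge of the
    // traced rectangle becomes the origin of the normalized rectangle.
    return CGRect(x: minX / viewSize.width,
                  y: 1 - maxY / viewSize.height,
                  width: (maxX - minX) / viewSize.width,
                  height: (maxY - minY) / viewSize.height)
}
```

The result could then be used as the initial bounding box for tracking instead of a fixed square around the tap point.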
I found the vision system really works best with product packaging on a relatively plain background. Like in the video on this post. The product packaging has lots of features and colors and the vision system does a good job tracking it.
My overall assessment is that it's not as good or as useful as I was hoping. What I really wanted was to be able to find all the objects in a scene and then use CoreML to find out what kind of object each one was. The vision system can't detect objects; it can only follow them once they have been pointed out to it. I also tried running CoreML object recognition on the same rectangle that is being tracked, and it just didn't work well.
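To make the "follow it once it has been pointed out" behavior concrete, here is a rough sketch of tracking with `VNTrackObjectRequest` and `VNSequenceRequestHandler`. The `ObjectTracker` class and the `minimumConfidence` cutoff of 0.3 are assumptions made for illustration, not code from this repo. The request is created once from a seed rectangle and its `inputObservation` is updated after each frame.

```swift
import Vision
import CoreVideo

/// Hypothetical wrapper: follows one object starting from a seed rectangle.
final class ObjectTracker {
    private let handler = VNSequenceRequestHandler()
    private let request: VNTrackObjectRequest
    /// Assumed cutoff; below this the object is treated as lost.
    let minimumConfidence: VNConfidence

    /// `initialBoundingBox` is in Vision's normalized, lower-left-origin space.
    init(initialBoundingBox: CGRect, minimumConfidence: VNConfidence = 0.3) {
        let seed = VNDetectedObjectObservation(boundingBox: initialBoundingBox)
        request = VNTrackObjectRequest(detectedObjectObservation: seed)
        request.trackingLevel = .accurate
        self.minimumConfidence = minimumConfidence
    }

    /// The most recent rectangle the tracker is following.
    var currentBoundingBox: CGRect { request.inputObservation.boundingBox }

    /// Runs the tracker on the next frame. Returns the updated rectangle,
    /// or nil if there is no result or its confidence is too low.
    func track(in pixelBuffer: CVPixelBuffer) throws -> CGRect? {
        try handler.perform([request], on: pixelBuffer)
        guard let observation = request.results?.first as? VNDetectedObjectObservation,
              observation.confidence >= minimumConfidence else {
            return nil
        }
        // Feed the new observation back in so the next frame continues from it.
        request.inputObservation = observation
        return observation.boundingBox
    }
}
```

Note that nothing here detects objects. The seed rectangle has to come from somewhere else, such as a user tap or a separate detection step.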
Hopefully I'll have more time to play with it in the future. Or someone way smarter than me can get it to do something really useful :)
Thanks, Jeff. Same as you, I am trying to use the Vision framework + CoreML, not only to detect objects but to track them as well. I will try more and post here if I find something interesting or get something working well.
@lzzhangar if you clone this repo: https://github.com/jeffreybergier/Blog-Combining-Vision-with-CoreML it has my attempt. The code is a mess and it doesn't work very well. But it might give you a sense of how I approached it.
@lzzhangar @jeffreybergier Agreed on your points here; I'm also trying similar things. Have you made any progress on multiple object detection and tracking?
@bmgdev unfortunately, I have not had (and probably won't have) a chance to look at it again :-/. I'm traveling for the foreseeable future and it doesn't leave much time for programming. I'd be curious to know if you find a solution online that does object detection and tracking, or if you come up with one yourself.
I found an example from Apple, https://developer.apple.com/documentation/vision/tracking_multiple_objects_or_rectangles_in_video, that works nicely for object tracking in video. It would be nice to find some way to replace the video player in this example with an AVCapture feed from the camera.
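As a starting point for the camera idea, here is a minimal sketch of an `AVCaptureSession` that delivers frames as `CVPixelBuffer`s, which is the input type a Vision request handler can work on. The `CameraFrameSource` class, its `onFrame` callback, and the queue label are hypothetical names, and this is not taken from Apple's sample. Camera permission handling and `session.startRunning()` are left out.

```swift
import AVFoundation

/// Hypothetical camera source that forwards each video frame to a callback.
final class CameraFrameSource: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    enum SetupError: Error { case noCamera }

    let session = AVCaptureSession()
    private let frameQueue = DispatchQueue(label: "camera.frames")
    /// Called on `frameQueue` for every frame; hand the buffer to Vision here.
    var onFrame: ((CVPixelBuffer) -> Void)?

    /// Wires the default video camera to a video data output.
    func configure() throws {
        guard let camera = AVCaptureDevice.default(for: .video) else {
            throw SetupError.noCamera
        }
        let input = try AVCaptureDeviceInput(device: camera)
        let output = AVCaptureVideoDataOutput()
        // Drop frames rather than queueing them if processing falls behind.
        output.alwaysDiscardsLateVideoFrames = true
        output.setSampleBufferDelegate(self, queue: frameQueue)

        session.beginConfiguration()
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(output) { session.addOutput(output) }
        session.commitConfiguration()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Extract the pixel buffer from the sample buffer and pass it on.
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        onFrame?(pixelBuffer)
    }
}
```

In the sample's structure, `onFrame` would presumably take the place of the loop that reads frames from the video file, but that part is untested.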
@konsdor wow, this really looks like an excellent resource. I won't have time, but if you're so inclined, you should write up a full tutorial that follows Apple's approach, as a GitHub repo like this one. I'll definitely put a link to it on this tutorial.
This is a great demo; thanks, Jeff. Unfortunately, I couldn't get it working the same way as in your video. When I tapped to start tracking, the red bounding box just shrank, getting smaller and smaller. Sometimes the bounding box disappeared and I had to tap to start tracking again. Is there a known limitation in this demo or in the Vision framework?