True counting of objects through video

nikoststs commented 4 years ago

🚀 Feature

Hello, I was wondering if it would be possible to count the total number of objects in a video (across multiple frames) without double counting an object that has already been counted.

Motivation

For a traffic counting application, for example, new cars constantly enter the frame while others leave. It would be very helpful to count the total number of cars that have appeared in the video over a given time.

Pitch

I would like to see an update where a detected car is counted only once. So if it re-appears after some frames or disappears completely, the count is not lost. This would require labelling each car and keeping information about some of its characteristics so that it is not counted again.

Alternatives

Additional context

github-actions[bot] commented 4 years ago

Hello @nikoststs, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Google Colab Notebook, Docker Image, and GCP Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

glenn-jocher commented 4 years ago

@nikoststs yes, several people have asked for this, but it is quite complicated. Each new detection would need to be compared to all current track IDs using some sort of similarity metric (possibly using its latest feature vector), and then either assigned to an existing track or given a new track ID. Kalman filters are often used for this.

There are actually entire architectures dedicated to this, so it is not a simple change. Deepsort comes to mind.
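
For illustration only, the matching step described above can be sketched with plain bounding-box overlap (IoU) as the similarity metric. This is a minimal sketch, not Deepsort and not part of this repo: it uses no feature vectors and no Kalman filter, and the names `IouTracker`, `iou_threshold` and `max_missed` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class IouTracker:
    """Greedy IoU matcher (hypothetical helper, assumed thresholds)."""

    def __init__(self, iou_threshold: float = 0.3, max_missed: int = 5):
        self.iou_threshold = iou_threshold
        self.max_missed = max_missed  # frames a track may go unmatched before it is dropped
        self.tracks: Dict[int, list] = {}  # track_id -> [last_box, missed_frames]
        self.next_id = 0

    def update(self, detections: List[Box]) -> List[int]:
        """Return one track ID per detection for the current frame."""
        # Score every (track, detection) pair, best overlap first.
        pairs = sorted(
            ((iou(box, det), tid, j)
             for tid, (box, _) in self.tracks.items()
             for j, det in enumerate(detections)),
            reverse=True,
        )
        assigned: List[Optional[int]] = [None] * len(detections)
        used = set()
        for score, tid, j in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to match
            if tid in used or assigned[j] is not None:
                continue
            assigned[j] = tid
            used.add(tid)
        # Age unmatched tracks and drop the ones missing for too long.
        for tid in list(self.tracks):
            if tid not in used:
                self.tracks[tid][1] += 1
                if self.tracks[tid][1] > self.max_missed:
                    del self.tracks[tid]
        # Unmatched detections start new tracks; matched ones refresh their track.
        for j, det in enumerate(detections):
            if assigned[j] is None:
                assigned[j] = self.next_id
                self.next_id += 1
            self.tracks[assigned[j]] = [det, 0]
        return assigned

    @property
    def total_count(self) -> int:
        """Number of distinct track IDs created so far."""
        return self.next_id


frames = [
    [(0, 0, 10, 10)],
    [(1, 0, 11, 10)],
    [(2, 0, 12, 10), (50, 50, 60, 60)],
    [],                      # both objects missed for one frame
    [(3, 0, 13, 10)],
]
tracker = IouTracker()
ids_per_frame = [tracker.update(f) for f in frames]
print(ids_per_frame, tracker.total_count)
```

An object that stays out of view for more than `max_missed` frames would get a new ID here and be counted again, which is the kind of case appearance features are meant to handle.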

nikoststs commented 4 years ago

@glenn-jocher Thank you for your fast reply. Could you maybe give me some directions as to where I should start in order to implement such feature?

glenn-jocher commented 4 years ago

@nikoststs well, that depends on whether you want to create your own implementation, which is probably a good few months of research and development, or use an off-the-shelf solution (e.g. deepsort).

The tracking part is mostly independent of the object detection part. There exist independent trackers like KLT that can operate on bounding box contents. CoreML also offers separate detect and track functionality.

nikoststs commented 4 years ago

@glenn-jocher I would probably go with an existing solution. So integration of Deepsort is a good approach?

glenn-jocher commented 4 years ago

@nikoststs sure, go for it!

jgladch commented 4 years ago

@nikoststs I am interested in doing the same thing. Can I follow your progress somewhere?

nikoststs commented 4 years ago

@jgladch to be honest, I have dropped the idea for the time being as it was too complicated..

jgladch commented 4 years ago

@nikoststs lol yeah. fwiw, I'm going to try using this: https://github.com/theAIGuysCode/yolov3_deepsort

nikoststs commented 4 years ago

@jgladch thanks for your suggestion, I am gonna give it a try I guess. Have you tried it?

jgladch commented 4 years ago

@nikoststs I'm working on it. Having trouble getting OpenCV to build in a Conda environment within a Docker Container 😅

glenn-jocher commented 4 years ago

@jgladch try docker below

Reproduce Our Environment

To access an up-to-date working environment (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled), consider a:

GCP Deep Learning VM with $300 free credit offer: GCP Quickstart Guide
Google Colab Notebook with 12 hours of free GPU time: Google Colab Notebook
Docker Image from https://hub.docker.com/r/ultralytics/yolov3. See Docker Quickstart Guide

jgladch commented 4 years ago

@glenn-jocher Thanks for the link. I had no problem getting the Ultralytics docker version running and detecting objects. However the Ultralytics repo does not support object tracking from frame to frame, which I need for my use case (counting birds at birdfeeder)

glenn-jocher commented 4 years ago

@jgladch ah no, I just meant you could reuse the image for your own development, without using the repo, as it has opencv and pytorch working, which is what you mentioned you were having issues with.

glenn-jocher commented 4 years ago

@jgladch by the way, accurate bird counting/tracking seems doable only over a very short timespan (i.e. a few seconds), because what happens if the same 10 birds periodically revisit the same feeder throughout the day? I'm assuming each revisit will be counted as an independent bird, artificially inflating the day's tally, no? Or in your case do you only care about the number of visits, not how many separate birds there are?

jgladch commented 4 years ago

@glenn-jocher Agreed, I will be counting "bird feeder visits" rather than "individual birds"

I am able to get the ultralytics/yolov3 repo to produce a series of detection results for a video file. I figure the worst case scenario would be needing to roll my own tracking scripts to run on that output, but I'm still hoping to find someone else's work I can re-use since this appears to be a fairly straightforward use case.
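
As a rough sketch of what such a script might look like for "visits": assuming I already have one detection count per frame, I could lightly smooth the counts to suppress single-frame flicker and then count every increase as new arrivals. `smooth_counts`, `count_visits` and the `window` parameter are hypothetical names of my own, not something from the repo.

```python
from statistics import median
from typing import List


def smooth_counts(counts: List[int], window: int = 3) -> List[int]:
    """Sliding-window median to suppress single-frame detection flicker."""
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo, hi = max(0, i - half), min(len(counts), i + half + 1)
        smoothed.append(int(median(counts[lo:hi])))
    return smoothed


def count_visits(counts: List[int], window: int = 3) -> int:
    """Count arrivals: every increase in the smoothed per-frame count."""
    visits = 0
    previous = 0
    for current in smooth_counts(counts, window):
        if current > previous:
            visits += current - previous
        previous = current
    return visits


# One missed detection at index 4 would otherwise look like a new visit.
per_frame_counts = [0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 2, 2, 2, 0]
print(count_visits(per_frame_counts))            # smoothed
print(count_visits(per_frame_counts, window=1))  # raw counts, no smoothing
```

This only sees the count per frame, so one bird leaving while another arrives in the same frame would go unnoticed.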

glenn-jocher commented 4 years ago

Well, yes, worst case you could do what our iOS app does, and simply count the objects in each video frame, i.e. '6 items' below. We can actually save this data to show, for example, a histogram of pedestrians on a street over several hours, or an actual time-series plot of foot traffic over the day.
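
A minimal sketch of that kind of aggregation, assuming the per-frame counts are already available as a list and the frame rate is known. The function name `counts_per_interval` and its parameters are hypothetical, and this is not how the iOS app is implemented.

```python
from typing import List, Tuple


def counts_per_interval(
    frame_counts: List[int], fps: float, interval_s: float
) -> List[Tuple[float, float]]:
    """Average object count per time bin as (bin_start_seconds, mean_count)."""
    frames_per_bin = max(1, int(round(fps * interval_s)))
    series = []
    for start in range(0, len(frame_counts), frames_per_bin):
        chunk = frame_counts[start:start + frames_per_bin]
        series.append((start / fps, sum(chunk) / len(chunk)))
    return series


# Six frames at 2 fps, binned into 1-second intervals.
series = counts_per_interval([2, 2, 4, 4, 6, 6], fps=2, interval_s=1)
print(series)
```

The resulting pairs can be passed to any plotting library as a bar chart or a time-series line.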

jgladch commented 4 years ago

@glenn-jocher Thanks so much for the example. I would be happy with the result of your (awesome) iOS app for this use case. When I first looked at your repo, when running detections I was able to create a .txt output of an image, or what appeared to be a series of detections for a video file.

The series of detections, though, seemed to be a simple concatenated list of the detections from each individual frame, i.e. there was no way to distinguish between frames. If there were a way to see a per-frame output for a video file, I think I could easily build what I'm looking to do. Do you have any suggestions on how to go about accomplishing that?

Yeah, now that I think about it... a simple histogram would be great for this purpose.

Thanks a lot for your input here

glenn-jocher commented 4 years ago

@jgladch this section writes the results to text files. You should be able to adapt it to your needs, for example by modifying the filename per frame: https://github.com/ultralytics/yolov3/blob/5d42cc1b9a90e26b0b9bffba61fae93f5d1691b9/detect.py#L123-L128
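
One way the per-frame filename idea might look, as a standalone sketch rather than the actual `detect.py` code: the helper names, the zero-padded suffix and the row layout `(class, confidence, x1, y1, x2, y2)` are all assumptions for illustration.

```python
from pathlib import Path
from typing import Iterable, Tuple

# Hypothetical row layout: (class_id, confidence, x1, y1, x2, y2)
Detection = Tuple[int, float, float, float, float, float]


def frame_label_path(save_path: str, frame_index: int) -> Path:
    """Derive a per-frame .txt path next to the output video path."""
    p = Path(save_path)
    return p.with_name(f"{p.stem}_{frame_index:06d}.txt")


def write_frame_detections(
    save_path: str, frame_index: int, detections: Iterable[Detection]
) -> Path:
    """Write one line per detection into the file for this frame."""
    out = frame_label_path(save_path, frame_index)
    out.parent.mkdir(parents=True, exist_ok=True)
    with open(out, "w") as f:
        for cls, conf, x1, y1, x2, y2 in detections:
            f.write(f"{cls} {conf:.4f} {x1:g} {y1:g} {x2:g} {y2:g}\n")
    return out


print(frame_label_path("output/video.mp4", 7))
```

With one file per frame, the number of lines in each file is the per-frame object count.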

jgladch commented 4 years ago

Thanks @glenn-jocher I really appreciate your help

glenn-jocher commented 4 years ago

@jgladch no problem!

There is definitely interest in tracking beyond simple detection. It would be nice to incorporate it in a minimal way somehow, so we could simply pass a flag to enable it on videos/webcams/rtsp streams. It would definitely add a layer of value over what we have now.

jgladch commented 4 years ago

If I can find a way to contribute I will! Your suggestion has totally unblocked me, so thanks again.

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove Stale label or comment or this will be closed in 5 days.

glenn-jocher commented 11 months ago

@jgladch glad to hear that! Excited to see what you come up with. Thank you for your interest in contributing!
