Cartucho / OpenLabeling

Label images and video for Computer Vision applications
Apache License 2.0

Adding video object tracking with OpenCV trackers #27

Closed rcabg closed 5 years ago

rcabg commented 6 years ago

Hey!

I'm working on adding the video tracking feature as I mentioned in #3.

I added a new module, "box_predictor.py", which tries to be as independent as possible (I'm sure it could be more so). It uses the KCF tracker from OpenCV to predict the same box in the next images. It depends on a "tmp/" folder that is created at runtime. Also, the images must be sorted beforehand.
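A minimal sketch of the idea described above, not the actual "box_predictor.py": a tracker is initialized on the labeled frame and then updated on each following frame. The names `propagate_box`, `make_kcf_tracker` and the `make_tracker` parameter are hypothetical. `cv2.TrackerKCF_create` is assumed to be available, which usually requires an OpenCV build that includes the tracking module (for example opencv-contrib-python).

```python
def make_kcf_tracker():
    """Create an OpenCV KCF tracker (assumes a build exposing TrackerKCF_create)."""
    import cv2  # imported lazily so the rest of the sketch works without OpenCV
    return cv2.TrackerKCF_create()


def propagate_box(frames, init_box, make_tracker=make_kcf_tracker):
    """Track init_box (x, y, w, h) from frames[0] through the remaining frames.

    Returns one entry per following frame: an (x, y, w, h) tuple, or None
    when the tracker reports that it lost the object.
    """
    tracker = make_tracker()
    tracker.init(frames[0], init_box)
    predictions = []
    for frame in frames[1:]:
        ok, box = tracker.update(frame)
        predictions.append(tuple(int(v) for v in box) if ok else None)
    return predictions
```

Here `frames` would be the sorted images loaded with `cv2.imread`. Passing the tracker factory as an argument keeps the loop independent of the specific tracker, so KCF could be swapped for another OpenCV tracker without touching it.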

To try it, put some video frames in the images folder. Then start run.py, select the boxes as usual and press the "p" key to predict the next N frames. N can be selected beforehand with the newly introduced bar.

So far the predicted boxes aren't saved in their main .txt files. I'm looking for a way to improve this whole process. I also think we need a way to somehow "validate" the predictions before including them in the main .txt.

This PR is just a starting point for sharing impressions and ideas. Feel free to discuss!

Cartucho commented 5 years ago

Hello!

So the first thing we need to decide is whether we track in a single direction only or in both. For example, we could ask the person to:

  1. label the 1st frame and then use the tracker for the next N frames.
  2. label the (N/2)'th frame and then use the tracker for both the previous N/2 and the following N/2 frames.

Although option (2) would most likely lead to better results, it would also make things less intuitive. So for now I think we should go with (1). What do you think?
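To make the two options concrete, here is a small sketch of which frame indices the tracker would visit in each case. The function name `frames_to_track` and its parameters are hypothetical and only serve as an illustration; in option (2) the tracker would be re-initialized on the labeled frame before the backward run.

```python
def frames_to_track(labeled_index, n, total_frames, both_directions=False):
    """Return the frame indices a tracker would visit, in visiting order.

    Option 1 (both_directions=False): the n frames after the labeled one.
    Option 2 (both_directions=True): n // 2 frames forward, then n // 2
    frames backward, each run starting from the labeled frame.
    Indices are clipped to the valid range [0, total_frames - 1].
    """
    if not both_directions:
        last = min(labeled_index + n, total_frames - 1)
        return list(range(labeled_index + 1, last + 1))
    half = n // 2
    last = min(labeled_index + half, total_frames - 1)
    first = max(labeled_index - half, 0)
    forward = list(range(labeled_index + 1, last + 1))
    backward = list(range(labeled_index - 1, first - 1, -1))
    return forward + backward


print(frames_to_track(0, 4, 10))                        # option 1
print(frames_to_track(5, 4, 10, both_directions=True))  # option 2
```

Option (1) needs a single forward pass, which is part of why it is the simpler choice.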

It would also be great if using OpenCV we could read the video directly (convert it into frames) without losing image quality.

rcabg commented 5 years ago

Hello!

I think we should stay with option 1. As you say, option 2 gives more flexibility but also increases complexity. I don't know if it's really worth it.

Staying with 1, do you find it useful to let the user select how many following frames to "track"? If not, we could let the program do its best with the entire video.

I completely agree with reading the video directly. OpenCV allows saving frames from a video file in different formats and compressions (png, jpg...) through its imwrite function.

I think we could use the "img_dir" argument or create a new one. If the arg points to a folder, let's stay "normal". If it points to a video file, let's extract and save the frames and then proceed normally. What do you think?
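A rough sketch of that dispatch, assuming the argument is a plain path string. All function names, the extension list and the "_frames" output folder are hypothetical and not taken from the project. Frames are written as PNG, which is a lossless format, so saving them should not add compression loss on top of whatever the video codec already introduced.

```python
import os

# Assumed set of video extensions; not taken from the project.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def is_video_file(path):
    """True when path is an existing file with a known video extension."""
    ext = os.path.splitext(path)[1].lower()
    return os.path.isfile(path) and ext in VIDEO_EXTENSIONS


def frame_filename(index, digits=6):
    """Zero-padded name so that alphabetical order matches frame order."""
    return "frame_{:0{}d}.png".format(index, digits)


def extract_frames(video_path, out_dir):
    """Save every frame of video_path into out_dir as PNG; return the count."""
    import cv2  # imported lazily; only needed when a video is given
    os.makedirs(out_dir, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of video or read error
            break
        cv2.imwrite(os.path.join(out_dir, frame_filename(count)), frame)
        count += 1
    capture.release()
    return count


def resolve_img_dir(path):
    """Return a folder of images, extracting frames first if path is a video."""
    if os.path.isdir(path):
        return path
    if is_video_file(path):
        out_dir = os.path.splitext(path)[0] + "_frames"
        extract_frames(path, out_dir)
        return out_dir
    raise ValueError("expected an image folder or a video file: " + path)
```

The zero-padded file names keep the extracted frames sorted, which the tracking step relies on.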

Cartucho commented 5 years ago

Sorry for the late response. I have been very busy studying for the GRE.

Great comments. I would say:

rcabg commented 5 years ago

Hello there,

Don't worry! Obligations always come first ;). I agree with everything, so let's get to work.

I will be pushing some commits in the coming days and we can discuss the progress.

Cartucho commented 5 years ago

Hello Rafael!

It took me a long time, but I have just released the video tracker feature! Since I had made so many changes to the code's structure, I had to write this one from scratch.

I added a modified version of your class (LabelTracker) to the code, with a special thanks to Rafael Caballero Gonzalez.