Closed by rcabg 5 years ago
Hello!
So the first thing we need to decide is whether we track in only a single direction or in both. For example, we could ask the person to:
Although option (2) would most likely lead to better results, it would also make things less intuitive. So for now I think we should go with (1). What do you think?
It would also be great if, using OpenCV, we could read the video directly (converting it into frames) without losing image quality.
Hello!
I think we should stay with option (1). As you say, option (2) gives more flexibility but also increases complexity; I don't know if it's really worth it.
Staying with (1), do you find it useful to let the user select how many subsequent frames to "track"? If not, we could let the program do its best with the entire video.
I completely agree with reading the video directly. OpenCV allows saving frames from a video file in different formats and compression levels (PNG, JPG...) via the imwrite function.
I think we could reuse the "img_dir" argument or create a new one. If the arg points to a folder, we proceed as usual. If it points to a video file, we extract and save the frames first and then proceed normally. What do you think?
Sorry for the late response. I have been very busy studying for the GRE.
Great comments. I would say:
python main.py -v /path/to/video/
only the following frames should be labeled (if we changed a frame earlier than frame x for object i, then all the following frames > x would label object i wrongly).
Hello there,
Don't worry! Obligations always come first ;). I agree with everything, so let's get to work.
As you said, only the following images will be labeled, with as many frames as possible by default.
We should use the arg param '--video' instead of '-v', following the "full-words" convention already used ('--format', '--sort', '--cross-thickness'...), shouldn't we? Also, '--img_dir' and '--video' mustn't be used at the same time. The first step would be storing the images somewhere in order to work with them; that could be a memory problem if the video is really long. Let's see how it works.
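The "mutually exclusive" constraint above maps directly onto argparse. A minimal sketch, assuming the flag names from this thread (the description string is made up):

```python
import argparse

parser = argparse.ArgumentParser(description="Labeling tool (sketch)")
# Exactly one input source: either a folder of frames or a video file.
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument("--img_dir", help="folder containing the input frames")
group.add_argument("--video", help="video file to split into frames first")

args = parser.parse_args(["--video", "/path/to/video.mp4"])
print(args.video)  # -> /path/to/video.mp4
```

If both flags are passed, argparse rejects the invocation with a usage error on its own, so we don't need a manual check.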
For opening the video, OpenCV allows saving the images in different formats (JPG, PNG...), and the compression level can be selected. In my opinion, PNG is better than JPG since it's lossless. I will try it out.
The "delete the following" feature shouldn't be a big deal, since each prediction will be associated with its frame list.
I will be doing some commits in the following days so we can discuss the progress.
Hello Rafael!
It took me a long time, but I have just released the video tracker feature! Since I implemented so many changes to the code's structure, I had to do this one from scratch.
I added a modified version of your class (LabelTracker) to the code, with a special thanks to Rafael Caballero Gonzalez.
Hey!
I'm working on adding the video tracking feature as I mentioned in #3.
I added a new module, "box_predictor.py", which tries to be as independent as possible (I'm sure it could be more). It uses the KCF tracker from OpenCV to predict the same box in the next images. It depends on a "tmp/" folder that is created at runtime. Also, images must be sorted beforehand.
In order to make it work, put some video frames in the images folder. Then start run.py, select the boxes as usual, and press the "p" key to predict the next N frames. N can be selected beforehand with the new bar introduced.
So far the predicted boxes aren't saved in the main .txt file; I'm looking for a way to improve this whole process. Also, I think we need a way to "validate" the predictions somehow before including them in the main .txt.
This PR is just a starting point to share impressions and ideas. Feel free to discuss!