cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License

Tracking functionality for bounding boxes #211

Closed: savan77 closed this issue 4 years ago

savan77 commented 5 years ago

Hi, we are adding several features to CVAT and will open-source them. We might need your advice along the way, so I wanted to know if you can help. Currently, we are trying to change the interpolation. As of now, interpolation just puts bounding box in the remaining frames at the same position as it is in the first frame. We are trying to change that and add tracking there. Since the code base is huge, I am unable to follow the exact process flow.

For now, say that instead of constant coordinates I want to shift the box to the right a little bit (e.g., 10 pixels). I guess it's a trivial task. Just need your help regarding the same, if possible. Thanks

bsekachev commented 5 years ago

@savan77 If I understood you correctly, you need the "shapes.js" file, the "BoxModel" class, and its "_interpolatePosition" method. This method computes the interpolation for a bounding box. All shape instances contain a field "this._frame", which is the start frame of an interpolated shape. The method takes an argument "frame", which is the frame to interpolate for. From these two values you can compute the right offset for each frame.
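As a rough illustration of the idea above, the per-frame offset can be derived from the difference between the requested frame and the start frame. This is a Python sketch, not CVAT's JavaScript code; `shifted_box`, `dx_per_frame`, and the `(xtl, ytl, xbr, ybr)` tuple layout are hypothetical names chosen for the example.

```python
def shifted_box(box, start_frame, frame, dx_per_frame=10.0):
    """Move `box` right by `dx_per_frame` pixels per elapsed frame.

    `box` is (xtl, ytl, xbr, ybr). Hypothetical helper for illustration,
    not part of CVAT.
    """
    xtl, ytl, xbr, ybr = box
    # Number of frames since the shape was created, times the per-frame shift.
    offset = dx_per_frame * (frame - start_frame)
    return (xtl + offset, ytl, xbr + offset, ybr)


# A box drawn on frame 5 and viewed on frame 8 has moved by 3 frames' worth.
print(shifted_box((100.0, 50.0, 180.0, 120.0), start_frame=5, frame=8))
```

On the start frame itself the offset is zero, so the box is returned unchanged.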

nmanovic commented 5 years ago

Hi @savan77,

I talked with Donald and Rush. Welcome to our community!

> As of now, interpolation just puts bounding box in the remaining frames at the same position as it is in the first frame.

Actually, interpolation works between key frames. It does more than "puts bounding box in the remaining frames at the same position". If you draw a bounding box on the first frame, it is propagated as is to the following frames. If you then change the propagated bounding box on the 10th frame, all frames between the 1st and the 10th are updated (linearly interpolated). This works well for DSS annotation tasks with static cameras.
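The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CVAT's implementation: `interpolate_box` is a hypothetical name, boxes are `(xtl, ytl, xbr, ybr)` tuples, and returning `None` before the first key frame is a choice made for this example.

```python
from bisect import bisect_right


def interpolate_box(keyframes, frame):
    """Linearly interpolate a box between the surrounding key frames.

    `keyframes` maps frame number -> (xtl, ytl, xbr, ybr).
    Hypothetical helper for illustration, not part of CVAT.
    """
    frames = sorted(keyframes)
    if frame < frames[0]:
        return None  # assumption: the track does not exist yet
    if frame >= frames[-1]:
        return keyframes[frames[-1]]  # propagated as is after the last key frame
    # Find the key frames immediately left and right of `frame`.
    i = bisect_right(frames, frame)
    left, right = frames[i - 1], frames[i]
    t = (frame - left) / (right - left)
    return tuple(a + (b - a) * t
                 for a, b in zip(keyframes[left], keyframes[right]))


keyframes = {0: (0.0, 0.0, 10.0, 10.0), 10: (100.0, 20.0, 110.0, 30.0)}
print(interpolate_box(keyframes, 5))   # halfway between the two key frames
print(interpolate_box(keyframes, 12))  # after the last key frame
```

Because the interpolation is linear, interpolating the corner coordinates gives the same result as interpolating the centre point together with the width and height.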

> We are trying to change that and add tracking there.

Tracking is a good feature which will be useful in many scenarios. How are you going to add it? Will it run only on the client using OpenCV.js, or will it be a server-side solution using a DL model?

> Just need your help regarding the same, if possible.

Interpolation exists in two places: client (draw bounding boxes between key frames) and server (dump interpolated boxes).

savan77 commented 5 years ago

@nmanovic Thanks for the information. Currently, we are planning to use the trackers provided by OpenCV, so I guess that won't create any problems. They ship with opencv-contrib.

nmanovic commented 5 years ago

Hi @savan77, does it mean that you will support all trackers which are supported by OpenCV (including GOTURN)? As far as I understand, it will be a server-side implementation. How are you going to enable tracking for an object? I just want to understand what it will look like for a user. Will it be a global option in settings? For example, tracking: "interpolation" (by default), "GOTURN", "TLD", "MedianFlow", ... Or are you going to allow choosing the method per track/object?

rushtehrani commented 5 years ago

@nmanovic those are great questions. It will be a global setting for the first phase.

We are currently targeting support for GOTURN. @savan77 and I will coordinate to see what it takes to support any OpenCV-compatible tracker. Let us know if you have any ideas on a more generic implementation.

savan77 commented 5 years ago

@nmanovic Yes, as of now, we are planning to have interpolation work as a tracker. We haven't decided whether we will give an option to choose the tracking method. Ideally, we'll have the tracker with the best performance as the default. But you can use any OpenCV tracker you want, including GOTURN. GOTURN should give better results, but in our experiments its boxes explode. CSRT, however, gave good results.

nmanovic commented 5 years ago

Hi @savan77 ,

Do you still want to deliver the feature? If so, when?

bsekachev commented 5 years ago

There are some implementation ideas in the duplicated request #38

savan77 commented 5 years ago

Hi @nmanovic, @bsekachev ,

@rushtehrani and I are coordinating to implement this feature. So far, we have implemented the tracking part, but it is yet to be integrated with CVAT. I will discuss with @rushtehrani to come up with a rough timeline. Thanks

nmanovic commented 5 years ago

Hi @savan77 ,

Don't hesitate to discuss any preliminary ideas or prototypes with us. We can test the feature internally and provide early feedback. We implemented many annotation tools in the past, and some of them had a tracking feature. The main difficulty here is the client-server architecture of the app. We need to provide a smooth experience for users. Ideally, they should feel that tracking runs in real time on the client side, but that is the final goal.

I have in my mind the following implementation. What do you think?

On the server we should have 1 entry point:

On the client we should have the following behaviour:

savan77 commented 5 years ago

Hi @nmanovic,

Thanks for the input. That looks like a potential process flow for the tracking. We are actually in the middle of something else. We won't move forward without your consent on the process flow. So, I will get back to you in a few days and we can discuss the flow in more detail.

nmanovic commented 5 years ago

Onepanel.io is not going to implement the request for now. I will move the request to the next release. I hope we will have enough time to implement the feature. It should not be very complex.

mistermult commented 5 years ago

@nmanovic @savan77 @bsekachev I hacked a first version of the tracking feature together. It works as expected. Now I want to polish it up for a pull request. It has the following flow:

I found this very useful in the following situation: create a keyframe where the object enters the image and another where it leaves. Press tracking. In the middle the bounding box is often not accurate enough, so correct it (byMachine=false). Press tracking again. The bounding boxes from the middle user-defined keyframe to the last user-defined keyframe are now updated. To do this, we need to know which frames come from the user and which from the machine, hence the new attribute byMachine on each frame.
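The re-tracking behaviour described above might look roughly like the following Python sketch. It is not the code from this prototype: `track_from`, the dictionary layout, and the `step` callable (a stand-in for one update of a real tracker such as CSRT) are all assumptions made for illustration.

```python
def track_from(track, start_frame, step):
    """Re-run tracking from a user keyframe up to the next user keyframe.

    `track` maps frame -> {"box": (xtl, ytl, xbr, ybr), "byMachine": bool}.
    `step(box, frame)` is a hypothetical stand-in for a real tracker update.
    Frames in between are overwritten and marked byMachine=True.
    """
    if track[start_frame]["byMachine"]:
        raise ValueError("tracking must start from a user-defined keyframe")
    later_user_frames = [f for f, shape in track.items()
                         if f > start_frame and not shape["byMachine"]]
    if not later_user_frames:
        raise ValueError("no user-defined keyframe after the start frame")
    end_frame = min(later_user_frames)
    box = track[start_frame]["box"]
    for frame in range(start_frame + 1, end_frame):
        box = step(box, frame)
        track[frame] = {"box": box, "byMachine": True}
    return track


def toy_step(box, frame):
    """Toy tracker for the example: shift the box right by one pixel."""
    xtl, ytl, xbr, ybr = box
    return (xtl + 1, ytl, xbr + 1, ybr)


track = {0: {"box": (0, 0, 10, 10), "byMachine": False},
         4: {"box": (40, 0, 50, 10), "byMachine": False}}
track_from(track, 0, toy_step)                         # first "press tracking"
track[2] = {"box": (20, 0, 30, 10), "byMachine": False}  # user correction
track_from(track, 2, toy_step)                         # only frame 3 is redone
```

Keeping the flag on every frame is what lets the second pass leave user corrections untouched and overwrite only machine-generated boxes.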

So I need the following modifications in the existing code:

Code structure:

I am looking forward to your answers to all the questions.

nmanovic commented 5 years ago

@mistermult, sorry, I missed the comment. Just submit a PR and I will recommend something. First of all, we need to check it from the user experience point of view. When that is fine, I will recommend how to change the code to make it better.

nmanovic commented 5 years ago

No support from the community. We will implement the feature internally. Moved to the next release.

nmanovic commented 5 years ago

One more possible tracker: https://github.com/foolwood/SiamMask

zirus23 commented 5 years ago

Do we have any updates on this? Is there any code that I could use to add tracking (not interpolation) to CVAT like there is with other labelers (e.g. OpenLabeling)? I'm fine with using an optical flow based tracker or some other standard tracking method (e.g. CSRT), I don't think a Siamese network is necessary or optimal for my application.

nmanovic commented 5 years ago

@zirus23, the community promised several times to contribute the feature, but it is still not implemented. Our team plans to allocate resources and implement it in 1.0.0. I hope to have the feature by EOY. There are actually two major plans:

lNicolasl commented 4 years ago

Hi,

Just for information, "siammask_e" also seems to be a good video tracker candidate:

nmanovic commented 4 years ago

Atom tracker from CVPR2019: https://arxiv.org/abs/1811.07628

korabelnikov commented 4 years ago

@nmanovic Hi, can you share when you plan to release the tracking functionality?

nmanovic commented 4 years ago

@korabelnikov , for now it is planned for release 1.0 but I don't think we have enough resources to implement it in the upcoming release. If you can help and propose a PR we will be glad to review and merge.

korabelnikov commented 4 years ago

Thanks for info

ksenyakor commented 4 years ago

Could you please explain how to use the tracking tool in CVAT? There is a demo video, https://www.youtube.com/watch?v=Rjf9IRmk_o4&feature=youtu.be, but I don't have "Track" in the context menu.

nmanovic commented 4 years ago

@ksenyakor , the tracking feature isn't integrated yet. The PR is here: https://github.com/opencv/cvat/pull/1003

Probably it will be integrated only after our next release in February. If you like you can manually apply the patch on your local git copy.

ksenyakor commented 4 years ago

> @ksenyakor , the tracking feature isn't integrated yet. The PR is here: #1003
>
> Probably it will be integrated only after our next release in February. If you like you can manually apply the patch on your local git copy.

Thanks for your answer!

hasanahmedfaisal commented 4 years ago

Hi @nmanovic

> Indeed interpolation works between key frames. It is not only "puts bounding box in the remaining frames at the same position". If you draw a bounding box on first frame it will be propagated as is on next frames. If you change the propagated bounding box on 10th frame all frames between 1st and 10th will be changed (linearly interpolated).

Could you please share any details on how this interpolation is done, such as code or articles? Is it done using OpenCV? Are the bounding box edge points used, or the centre point, scale, and aspect ratio?

Please share; I would like to implement this myself with a few changes I have in mind. If anyone knows, please help.

Thank you

korabelnikov commented 4 years ago

Why not look at the source code?

hasanahmedfaisal commented 4 years ago

Hi @korabelnikov, do you mean the source code of CVAT? I tried to look into it. As far as I can tell, most of the code is written in JavaScript, and I do not know JavaScript, so I am having a hard time understanding it.

If you could point me to any sources in Python, or have any other advice, please share.

Thank you

kaustubhharapanahalli commented 4 years ago

@nmanovic Hi, Any progress on this?

mingweihe commented 4 years ago

@nmanovic Thank you for the hard work. I recently tried to deploy this function using the nuctl command "nuctl deploy --project-name cvat --path serverless/pytorch/foolwood/siammask/nuclio", but it does not seem to work. May I ask whether this function is currently being worked on? The error message is as follows:

Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use
#
#     $ conda activate siammask
#
# To deactivate an active environment, use
#
#     $ conda deactivate

==> WARNING: A newer version of conda exists. <==
  current version: 4.8.2
  latest version: 4.8.3

Please update conda by running

    $ conda update -n base -c defaults conda

Removing intermediate container 4952a8bb92d5
 ---> 18fcb57d7c72
Step 7/20 : RUN source activate siammask
 ---> Running in 4bb228f86d01
/bin/sh: 1: source: not found
Removing intermediate container 4bb228f86d01

stderr:
The command '/bin/sh -c source activate siammask' returned a non-zero code: 127

    /nuclio/pkg/cmdrunner/cmdrunner.go:124
Failed to build docker image
    .../pkg/containerimagebuilderpusher/docker.go:56
Failed to build processor image
    /nuclio/pkg/processor/build/builder.go:250
Failed to deploy function
    ...//nuclio/pkg/platform/abstract/platform.go:171

483415258 commented 3 years ago

@korabelnikov @rushtehrani @savan77 @ksenyakor @kaustubhharapanahalli Hello everyone, I have encountered the same problem as mingweihe above. How can I solve it?

Teagueporter commented 2 years ago

Hi, I'm pretty new to using CVAT as a tool, and right now I'm trying to create bounding boxes for fish in an ocean. The main issue is that the fish leave the screen or hide behind something for a few frames. When I deleted the bounding box after a fish left, it was deleted on every frame; when I hid the box instead, it was hidden on all frames too. I would rather not have a box floating around while the fish is not in the frame.

By the way, I'm using the OpenCV tracking tool.

bsekachev commented 2 years ago

> Hi I'm pretty new to using cvat as a tool, and right now I'm trying to create bounding boxes for fish in an ocean, the main thing is the fish will leave the screen or hid behind something for a few frames, but the problem that I ran in to was I delete the bounding box when they left and that just deleted it for every frame, the other was I hid the box but that too hides it for all frames. And I would rather not have a box just floating around while the fish is not in the frame.
>
> By the way I'm using the open cv tracking tool

Hi, @Teagueporter

The "Outside" feature is exactly what you need. You can hide a track starting from frame N and show it again starting from frame N+M. To do that, please use the setting "Show all interpolation tracks". Alternatively, create two independent tracks and merge them using the "Merge" feature.

Please refer to the documentation for details:
https://openvinotoolkit.github.io/cvat/docs/manual/basics/track-mode-basics/
https://openvinotoolkit.github.io/cvat/docs/manual/basics/settings/
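To make the hide-from-N, show-from-N+M idea concrete, here is a small Python sketch. It only mirrors the behaviour described in this reply and is not CVAT's code; `is_visible` and the mapping from key frame to an `outside` flag are hypothetical, and treating frames before the first key frame as hidden is an assumption.

```python
def is_visible(outside_by_keyframe, frame):
    """Return True if the track is shown on `frame`.

    `outside_by_keyframe` maps key frame number -> outside flag (bool).
    The most recent key frame at or before `frame` decides visibility.
    Hypothetical helper for illustration, not part of CVAT.
    """
    earlier = [f for f in outside_by_keyframe if f <= frame]
    if not earlier:
        return False  # assumption: the track has not started yet
    return not outside_by_keyframe[max(earlier)]


# The fish is visible from frame 0, leaves at frame 20 (N),
# and comes back at frame 35 (N + M).
fish = {0: False, 20: True, 35: False}
for frame in (10, 20, 34, 35):
    print(frame, is_visible(fish, frame))
```

With this representation a single track covers both appearances of the fish, so no box floats around while it is out of view.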