All object annotations in CVAT are apparently assigned a unique ID when saved to the server.
According to the linked conversation, these server IDs can change. The reasoning is that it is sometimes more efficient to delete an object and recreate it (thus assigning a new ID) than to update the existing object.
While this ID reassignment does not seem to raise any issues with operations within CVAT, it results in complications for outside workflows that want to connect with CVAT and maintain knowledge of the specific objects that were uploaded.
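To make the difference concrete, here is a minimal sketch of the two write strategies using Python's built-in `sqlite3` module. It is not CVAT code: the `shape` table and the helper functions are hypothetical, and the only assumption is that IDs are auto-incrementing primary keys that are never reused.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical `shape` table."""
    conn = sqlite3.connect(":memory:")
    # AUTOINCREMENT ensures the id of a deleted row is never handed out again.
    conn.execute(
        "CREATE TABLE shape ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT, points TEXT)"
    )
    return conn


def create_shape(conn, label, points):
    """Insert a new row and return the ID the database assigned to it."""
    cur = conn.execute(
        "INSERT INTO shape (label, points) VALUES (?, ?)", (label, points)
    )
    return cur.lastrowid


def update_in_place(conn, shape_id, points):
    """Modify the existing row; its primary key is untouched."""
    conn.execute("UPDATE shape SET points = ? WHERE id = ?", (points, shape_id))
    return shape_id


def delete_and_recreate(conn, shape_id, points):
    """Replace the row with a fresh one; the database assigns a new ID."""
    (label,) = conn.execute(
        "SELECT label FROM shape WHERE id = ?", (shape_id,)
    ).fetchone()
    conn.execute("DELETE FROM shape WHERE id = ?", (shape_id,))
    return create_shape(conn, label, points)


conn = make_db()
original_id = create_shape(conn, "car", "0,0,10,10")
same_id = update_in_place(conn, original_id, "1,1,11,11")
new_id = delete_and_recreate(conn, original_id, "2,2,12,12")
print(original_id, same_id, new_id)
```

In both cases the stored object ends up with the new points, but only the in-place update leaves an external tool able to find it under the ID it recorded earlier.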
Expected Behaviour
The expected behavior would be that once a label/object is uploaded to the CVAT server and is assigned a server ID, that server ID will never be changed unless the user manually deletes the label/object.
Current Behaviour
At the moment, the server ID of a label can change as a result of operations other than deletion.
Possible Solution
I am not familiar enough with the codebase to understand all of the operations that can result in the server ID of an existing label/object being modified. However, a solution to this problem should result in a guarantee that once a server ID is assigned, it will never be changed.
Steps to Reproduce (for bugs)
I have not been able to reliably reproduce this change in server ID myself, but the possibility of it occurring raises potential issues in the workflows described below.
Context
I am working on an open-source dataset curation and model analysis tool for computer vision called FiftyOne. This tool allows ML researchers and engineers to construct, visualize, explore, and most importantly, store all metadata related to an image or video dataset. Since CVAT is an awesome tool for annotation, I have been working on an integration between FiftyOne and CVAT which allows users to automatically upload data and labels from a FiftyOne dataset to CVAT for annotation/refinement, and then load the updated labels back into the FiftyOne dataset.
The issue of inconsistent server IDs comes into play when specific objects are being reannotated. Each object has a unique ID in FiftyOne that can be tied to the server ID it is assigned in CVAT. This allows us to track exactly which objects are added/modified/deleted and handle each of those three possibilities separately. The problem is that if the server ID changes even though the object was not deleted, then when loading labels back into FiftyOne, the object appears to have been deleted when it was only modified, resulting in undesired behavior.
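The tracking logic can be sketched roughly as follows. This is an illustration only, not the actual FiftyOne integration code: `classify_changes`, `id_map`, and the example IDs are all hypothetical names and values chosen for the sketch.

```python
def classify_changes(id_map, downloaded):
    """Compare the server IDs recorded at upload time with those downloaded later.

    id_map:     {cvat_server_id: local_label_id}, recorded when labels were uploaded
    downloaded: {cvat_server_id: annotation}, fetched after annotation is done
    """
    uploaded_ids = set(id_map)
    downloaded_ids = set(downloaded)
    return {
        # Server IDs we never uploaded: treated as newly created objects.
        "added": sorted(downloaded_ids - uploaded_ids),
        # Uploaded IDs that are gone: treated as deleted local labels.
        "deleted": sorted(id_map[i] for i in uploaded_ids - downloaded_ids),
        # IDs present on both sides: candidates for "modified".
        "retained": sorted(id_map[i] for i in uploaded_ids & downloaded_ids),
    }


# Hypothetical mapping recorded at upload time.
id_map = {101: "label_a", 102: "label_b"}

# Case 1: server IDs are stable and label_b was edited in place.
stable = classify_changes(id_map, {101: {"v": 1}, 102: {"v": 2}})

# Case 2: the same edit, but the server deleted and recreated the object,
# so it comes back under the new ID 104.
reassigned = classify_changes(id_map, {101: {"v": 1}, 104: {"v": 2}})

print(stable)
print(reassigned)
```

In the second case the edited object is reported as one deletion plus one unrelated addition, which is exactly the misclassification described above.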
This problem is not unique to my use case but would arise in any situation in which someone wants to upload existing objects to CVAT and know exactly which objects were added/modified/deleted.
Your Environment
Git hash commit (`git log -1`):
Docker version `docker version` (e.g. Docker 17.0.05): 20.10.7
Are you using Docker Swarm or Kubernetes?
Operating System and version (e.g. Linux, Windows, MacOS): Ubuntu 18.04
Code example or link to GitHub repo or gist to reproduce problem:
Other diagnostic information / logs:
Logs from `cvat` container
My actions before raising this issue
Following the discussion here: https://github.com/openvinotoolkit/cvat/issues/893#issuecomment-962644749
@nmanovic suggested that the `bulk_update` operation in Django could allow for efficient updating of objects without needing to delete them.