Open ajrup opened 6 months ago
Hi, you can specify a subset name for tasks in CVAT. CVAT will still merge annotations for tasks in the same subset, so it will be a problem if you're creating tasks from videos (the frames will have the same names). However, if you assign each task a separate subset name, annotations and images will be exported in different directories (after https://github.com/cvat-ai/cvat/pull/8171) in the project export. Then, if you want to shuffle images, you'll have information about the source video (which will be the subset name).
Actions before raising this issue
Is your feature request related to a problem? Please describe.
I am building a dataset from a number of different videos. The hierarchy is as such:
When exporting this dataset, I would like to know which video each frame came from (e.g. a.mp4, b.mp4, etc.). I need this information so that, when splitting into test/train/validation, I can keep an entire mp4 file in one split. Additionally, when analyzing the dataset (data mining of sorts), this information is extremely valuable.
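To make the splitting requirement concrete, here is a minimal sketch of a video-level split, assuming the source video of every frame is already known. The function name `split_by_video`, the `frame_to_video` mapping, and the default ratios are all hypothetical and not part of CVAT.

```python
import random
from collections import defaultdict


def split_by_video(frame_to_video, ratios=(0.7, 0.15, 0.15), seed=0):
    """Assign whole videos to train/val/test so no video spans two splits.

    frame_to_video: dict mapping a frame identifier to its source video name
    (hypothetical input; it has to be recovered from the export somehow).
    """
    # Shuffle the unique video names deterministically.
    videos = sorted(set(frame_to_video.values()))
    random.Random(seed).shuffle(videos)

    # Split at the video level, not the frame level.
    n_train = round(len(videos) * ratios[0])
    n_val = round(len(videos) * ratios[1])
    split_of_video = {}
    for i, video in enumerate(videos):
        if i < n_train:
            split_of_video[video] = "train"
        elif i < n_train + n_val:
            split_of_video[video] = "val"
        else:
            split_of_video[video] = "test"

    # Every frame follows its video into that video's split.
    splits = defaultdict(list)
    for frame, video in sorted(frame_to_video.items()):
        splits[split_of_video[video]].append(frame)
    return dict(splits)
```

Without the frame-to-video mapping this split cannot be done reliably, which is why the information needs to survive the export.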
Describe the solution you'd like
Here are some ideas:
"{job_name}_*"
Describe alternatives you've considered
I have considered exporting the datasets of all the tasks in a given project, using the Python SDK to automate renaming them after the task name, and then merging them all together. This is doable but requires a custom script instead of just being able to use CVAT.
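For reference, the local half of that workaround might look roughly like the sketch below. It assumes each task's export has already been downloaded and unpacked into its own directory named after the task (`<export_root>/<task_name>/...`); that layout and the helper name `merge_with_task_prefix` are hypothetical. The download step via the SDK is left out, and annotation files that reference image names would need the same renaming, which this sketch does not handle.

```python
import shutil
from pathlib import Path


def merge_with_task_prefix(export_root, merged_dir):
    """Copy files from <export_root>/<task_name>/... into merged_dir,
    renaming each one to "<task_name>_<original file name>".

    merged_dir is assumed to live outside export_root.
    """
    export_root = Path(export_root)
    merged_dir = Path(merged_dir)
    merged_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for task_dir in sorted(p for p in export_root.iterdir() if p.is_dir()):
        for src in sorted(p for p in task_dir.rglob("*") if p.is_file()):
            dst = merged_dir / f"{task_dir.name}_{src.name}"
            if dst.exists():
                # Refuse to overwrite silently if two files map to one name.
                raise FileExistsError(dst)
            shutil.copy2(src, dst)
            copied.append(dst.name)
    return copied
```

Prefixing with the task name keeps frames from different videos from colliding even when their original file names are identical.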
I've also considered a free-form text attribute on a tag, but this puts the onus on the annotator to duplicate the file name correctly, and it is not supported by all dataset formats.
Additional context
No response