Sparrowtech closed this issue 3 years ago
@Sparrowtech, we are going to introduce projects, where you can group all similar tasks and then export images and annotations for the whole project. Will that work for you?
That would be great! I look forward to the update. It would also be great to be able to export the images/annotations not only to one file but also as Train, Test, and Validation sets. Thanks!
@zhiltsov-max, I see that there has been some activity on this request and that a "Release" has been made. I'm not very familiar with how Releases are made available on GitHub, whether for production or for beta. A really simple question: is this something I can access today, or will it be included in another release down the road? Please advise if you don't mind. Thanks!
@nmanovic, please answer here.
@Sparrowtech, the feature will be available in Release 1.0.0 (~ end of February next year). Within a week or two, the first prototype will be merged into the develop branch. It would help if you could test the implementation and confirm that it is useful for you. We don't recommend using develop in production, but we use it internally for our own tasks.
Does that answer your question?
Yes, thank you. I will look for the feature in the develop branch over the next few weeks.
Let's keep the issue till it is resolved.
Currently, it is possible with Datumaro. Export each task in the Datumaro format, unpack the archives, then run:
datum project create -o proj
datum source add path -p proj -f datumaro_project <path_to_the_unpacked_archive1>
datum source add path -p proj -f datumaro_project <path_to_the_unpacked_archive2>
...
datum project transform -p proj -t random_split [-- -s subset1:ratio1 etc.]
datum project export -f tf_detection_api -p <path_to_transform_result> -- --save-images
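When many archives are involved, the command sequence above can be generated instead of typed by hand. The following is a minimal sketch that only builds the argument lists and does not run anything. The function name `build_datum_commands` and its parameters are hypothetical, the commands simply mirror the ones quoted in this comment (they have not been checked against newer Datumaro releases), and repeating `-s` once per subset is an assumption based on the "etc." in the transform line.

```python
from typing import Dict, List, Optional


def build_datum_commands(
    archives: List[str],
    export_project: str,
    project_dir: str = "proj",
    export_format: str = "tf_detection_api",
    split: Optional[Dict[str, float]] = None,
) -> List[List[str]]:
    """Build (but do not run) the datum commands quoted in this thread."""
    # 1. Create an empty project.
    commands = [["datum", "project", "create", "-o", project_dir]]

    # 2. Register every unpacked Datumaro-format archive as a source.
    for path in archives:
        commands.append(
            ["datum", "source", "add", "path",
             "-p", project_dir, "-f", "datumaro_project", path]
        )

    # 3. Random split; the "-s name:ratio" arguments are optional.
    transform = ["datum", "project", "transform",
                 "-p", project_dir, "-t", "random_split"]
    if split:
        transform.append("--")
        for name, ratio in split.items():
            # Assumption: one "-s" flag per subset.
            transform += ["-s", f"{name}:{ratio}"]
    commands.append(transform)

    # 4. Export the transformed project together with its images.
    commands.append(
        ["datum", "project", "export", "-f", export_format,
         "-p", export_project, "--", "--save-images"]
    )
    return commands
```

Each returned list can be printed for review or passed to `subprocess.run(cmd, check=True)` once the flags have been confirmed against the installed Datumaro version.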
Keeping open as a request for:
When can we expect to have the project export feature in the UI? I have seen similar requests for years. Thanks.
Done in #3365
WORKFLOW WORKAROUNDS: We've created individual "Jobs" to represent different classes of objects (e.g. car, truck, van, helicopter, airplane), largely because CVAT has difficulty loading very large datasets. Each CVAT Job holds ~2,500 images and is around 1 GB in total, images and annotations combined. Currently there are ~60 different Jobs (object classes), totaling ~60 GB and ~150,000 images.
Routinely we create specific datasets (10-20 object classes, or "Jobs"), which requires a lot of heavy lifting after export: merging tfrecords or xml files into one file or into batches, not to mention splitting into train/test/val sets. I know that there are a lot of tools out there to help with pre-processing, and we currently employ many.
It would be ideal to be able to choose "Car, Airplane, Helicopter, Bus, ... etc." from the dashboard and EXPORT THEM INTO ONE TASK, and to choose the ratio in which images are split into train/test/val sets, e.g. 70% train, 15% test, 15% val, resulting in .zip file(s) with images and annotations, or tfrecords. No extra processing for randomizing: just extract the split percentage from each Job and combine the results into e.g. "Train", ensuring well-balanced classes rather than relying on a later, unknown function that is just a random exercise.
Thanks!
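To make the requested behaviour concrete, here is a minimal sketch of a per-Job split in plain Python. It is not CVAT or Datumaro code. The function name `split_per_job`, the input layout (a dict mapping a Job name to its image file names), and the fixed seed are all assumptions made for illustration. Each Job is split by the same ratios first, and the pieces are combined afterwards, so every class keeps roughly the same share in each subset.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_per_job(
    jobs: Dict[str, Sequence[str]],
    ratios: Tuple[float, float, float] = (0.70, 0.15, 0.15),  # train, test, val
    seed: int = 0,
) -> Dict[str, List[Tuple[str, str]]]:
    """Split every Job by the same ratios, then combine the pieces per subset."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    subsets: Dict[str, List[Tuple[str, str]]] = {"train": [], "test": [], "val": []}

    for job_name in sorted(jobs):
        images = list(jobs[job_name])
        rng.shuffle(images)

        n_train = round(len(images) * ratios[0])
        n_test = round(len(images) * ratios[1])

        # Slicing guarantees no image is dropped or duplicated;
        # whatever remains after train and test goes to val.
        parts = {
            "train": images[:n_train],
            "test": images[n_train:n_train + n_test],
            "val": images[n_train + n_test:],
        }
        for subset, items in parts.items():
            subsets[subset].extend((job_name, image) for image in items)

    return subsets
```

For a Job of 2,500 images, the default ratios give 1,750 train, 375 test, and 375 val images from that Job, regardless of how large the other Jobs are.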