openvinotoolkit / datumaro

Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
https://openvinotoolkit.github.io/datumaro/
MIT License

Support arbitrary image extensions in formats #166

Closed zhiltsov-max closed 3 years ago

zhiltsov-max commented 3 years ago

Most formats do not specify the image format they use. Some do, but it is still very useful for the framework to support arbitrary image extensions, both to avoid needless re-encoding of images on export and to relax the requirements on imported datasets.

Formats:

mrd commented 3 years ago

I found that the PNG extension is still hard-coded in the CVAT converter. To fix this I used glob.glob. Before I submit a pull request, can I confirm that import glob is acceptable in the cvat plugin, or would you rather it be done another way?

changes: https://github.com/openvinotoolkit/datumaro/compare/develop...AdaptiveCity:arbitrary-image-format-cvat-converter
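
For readers who have not opened the branch, a glob-based lookup of this kind could look roughly like the sketch below. It is not the code from the linked changes: the function name find_frame_image and the IMAGE_EXTS set are hypothetical and chosen only for illustration.

```python
import glob
import os.path as osp

# Assumed set of accepted extensions; the real list may differ.
IMAGE_EXTS = {'.jpg', '.jpeg', '.png', '.bmp'}

def find_frame_image(images_dir, frame_name):
    """Return the path of the first file named <frame_name>.<image ext>, or None."""
    # Escape both parts so that characters like '[' in names are matched literally.
    pattern = osp.join(glob.escape(images_dir), glob.escape(frame_name) + '.*')
    for path in sorted(glob.glob(pattern)):
        if osp.splitext(path)[1].lower() in IMAGE_EXTS:
            return path
    return None
```

Each call lists the directory again, so looking up every frame of a long video this way repeats the same OS work many times.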

zhiltsov-max commented 3 years ago

@mrd, good observation. I'm not convinced it is reasonable to use lossy formats for video frames, but it definitely should be allowed. Globbing is allowed, but I'd suggest changing your implementation to use the approach from the other formats, to avoid repeated OS calls. You're welcome to open a PR!
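
The thread does not spell out what "the approach from other formats" is. One way to avoid repeated OS calls, assuming the idea is to scan the image directory once and look items up in memory, is sketched below. The function name index_images and the IMAGE_EXTS set are hypothetical and are not taken from the Datumaro code base.

```python
import os
import os.path as osp

# Assumed set of accepted extensions; the real list may differ.
IMAGE_EXTS = {'.jpg', '.jpeg', '.png', '.bmp'}

def index_images(images_dir):
    """Walk images_dir once and map 'relative/path/without/ext' -> file path."""
    index = {}
    for dirpath, _, filenames in os.walk(images_dir):
        for filename in sorted(filenames):
            stem, ext = osp.splitext(filename)
            if ext.lower() not in IMAGE_EXTS:
                continue  # skip annotations and other non-image files
            key = osp.relpath(osp.join(dirpath, stem), images_dir)
            # Use '/' in keys on every platform; on a name clash, keep the
            # first file in sorted order.
            index.setdefault(key.replace(os.sep, '/'), osp.join(dirpath, filename))
    return index
```

An importer could build this index once per dataset and then resolve each item with index.get(item_id), so the file system is touched a single time regardless of the number of frames.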