libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Development status and details on the BETON file format. #280

Open FirefoxMetzger opened 1 year ago

FirefoxMetzger commented 1 year ago

Hello ffcv 👋 ImageIO maintainer here.

I just came across the project and I must admit that I am a little jealous. I have always wanted to create an optimized container format for ML on image/video data, and it looks like you guys have already made good strides in that direction 💯. Looking through the project, I am curious about two things: (1) documentation, and (2) maintenance status.

Regarding documentation (1), I would like to know more about the BETON file format. Is there a spec I could read? Skimming the code, the file layout looks to be something like <header><fields_metadata><array of JPEG-encoded images>, but that is probably not the full story.
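To make that guess concrete, here is a toy sketch of the kind of layout I have in mind. It is not taken from the ffcv code base and is not the BETON format: the magic bytes, struct layouts, the per-sample offset table, and the names `write_container` and `read_sample` are all hypothetical, chosen only to illustrate a header followed by field metadata followed by a run of encoded payloads.

```python
import io
import struct

# Hypothetical toy layout, NOT the real BETON spec:
# <header><field metadata><offset table><payload 0><payload 1>...
MAGIC = b"TOY0"                   # made-up magic bytes
HEADER = struct.Struct("<4sHI")   # magic, number of fields, number of samples
FIELD = struct.Struct("<16s")     # field name, NUL-padded (longer names are truncated)
ENTRY = struct.Struct("<QQ")      # absolute payload offset, payload size in bytes


def write_container(fields, samples):
    """Serialize field names and already-encoded payloads (bytes) into one blob."""
    buf = io.BytesIO()
    buf.write(HEADER.pack(MAGIC, len(fields), len(samples)))
    for name in fields:
        buf.write(FIELD.pack(name.encode("ascii")))

    # Reserve space for the offset table, then fill it in once offsets are known.
    table_pos = buf.tell()
    buf.write(b"\x00" * (ENTRY.size * len(samples)))

    entries = []
    for payload in samples:
        entries.append((buf.tell(), len(payload)))
        buf.write(payload)

    buf.seek(table_pos)
    for offset, size in entries:
        buf.write(ENTRY.pack(offset, size))
    return buf.getvalue()


def read_sample(blob, index):
    """Return the payload at `index` without scanning the other payloads."""
    magic, n_fields, n_samples = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a toy container")
    if not 0 <= index < n_samples:
        raise IndexError(index)
    table_pos = HEADER.size + n_fields * FIELD.size
    offset, size = ENTRY.unpack_from(blob, table_pos + index * ENTRY.size)
    return blob[offset:offset + size]
```

The offset table is my own assumption: it is what would give random access to individual samples, which is the part I am most curious about in the real format.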

Regarding maintenance (2), I would like to know if there are plans to maintain this package (and the accompanying format) long-term. If so, I would be interested in supporting BETON in ImageIO. The main implication is that BETON would become more widely available within the Python ecosystem. That means easier conversion of image/video data to BETON on the one hand, and on the other the ability to use BETON outside of deep learning, e.g., as spill-over storage for queues in a CV pipeline, or in applications that use Dask or other distributed computing tools to optimize their workload.