NKI-AI / ahcore

Ahcore is the AI for Oncology core computational pathology toolkit
Apache License 2.0

Intermediate h5 files should be compressed #1

Closed jonasteuwen closed 5 months ago

jonasteuwen commented 11 months ago

Is your feature request related to a problem? Please describe.
Currently, to collect whole-slide-level predictions and metrics, the outputs are written to an intermediate h5 file. This file is not compressed.

Describe the solution you'd like
These intermediate files can currently be around 15 GB each. The current level of precision (float32) is likely not required: 8-bit values with image compression applied should be sufficient. We could consider using imagecodecs.
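To make the proposal concrete, here is a minimal sketch of the idea, not ahcore's actual implementation. It assumes the predictions are probabilities in [0, 1], quantizes them from float32 to 8 bits, and then applies lossless compression. `zlib` from the standard library stands in for an image codec here; h5py offers a comparable lossless filter through the `compression="gzip"` argument of `create_dataset`. The helper names and the example tile are hypothetical.

```python
import zlib

import numpy as np


def quantize_uint8(probs: np.ndarray) -> np.ndarray:
    """Map float probabilities in [0, 1] to uint8 values in [0, 255] (lossy)."""
    return np.round(np.clip(probs, 0.0, 1.0) * 255.0).astype(np.uint8)


def dequantize_uint8(quantized: np.ndarray) -> np.ndarray:
    """Map uint8 values back to float32 probabilities in [0, 1]."""
    return quantized.astype(np.float32) / 255.0


# Hypothetical prediction tile: mostly background with one confident region.
probs = np.zeros((512, 512), dtype=np.float32)
probs[100:200, 150:300] = 0.9

raw_bytes = probs.nbytes                     # uncompressed float32 size
quantized = quantize_uint8(probs)            # 4x smaller before compression
compressed = zlib.compress(quantized.tobytes(), 4)

# The round trip loses roughly half a quantization step (about 1/510) at most.
restored = dequantize_uint8(quantized)
max_error = float(np.abs(restored - probs).max())

print(f"float32: {raw_bytes} B, uint8: {quantized.nbytes} B, "
      f"compressed: {len(compressed)} B, max error: {max_error:.5f}")
```

Quantization alone gives a fixed 4x reduction. How much the compression step adds depends on the content: large uniform regions, as in this tile, compress very well, while noisy predictions compress far less.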

Describe alternatives you've considered
N/A

Additional context
See: https://github.com/NKI-AI/ahcore/blob/6342fe0d14d7444bc9a9e3856ea6339da58b51ea/ahcore/callbacks.py#L265

jonasteuwen commented 5 months ago

We can now set the precision, which should already reduce the file size quite a bit. Closing for now, until this becomes an issue again.