We're currently using the filesystem blob store to calculate the bao-tree hashes for each file. This is the only way to retrieve the blob hash right now, as implementing it directly in an S3 store is tricky.
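For illustration, here is a minimal sketch of that step, assuming the blob has already been staged on the local filesystem. Since the root of a bao tree is exactly the BLAKE3 hash of the content, streaming the file through `blake3::Hasher` is enough to derive the blob hash without holding the file in memory (the staging path and buffer size below are hypothetical, not rhio's actual layout):

```rust
use std::fs::File;
use std::io::{BufReader, Read, Result};

/// Stream a file from the local blob store through BLAKE3.
/// The bao-tree root over the content equals this hash, so this
/// yields the blob hash without loading the whole file into memory.
fn blob_hash(path: &str) -> Result<blake3::Hash> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut hasher = blake3::Hasher::new();
    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of file reached
        }
        hasher.update(&buf[..n]);
    }
    Ok(hasher.finalize())
}

fn main() -> Result<()> {
    // Hypothetical staging path for the example.
    let hash = blob_hash("/tmp/rhio-staging/blob.bin")?;
    println!("blob hash: {}", hash.to_hex());
    Ok(())
}
```

Note that memory stays bounded by the read buffer here; it is the temporary copy on disk that causes the storage cost described next.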
The downside of this approach is that we are limited by storage: if we're transferring 7 TB of data to a MinIO instance (which has plenty of storage), the transport and batching themselves are handled efficiently (not much memory is occupied), but this extra step means the rhio process itself temporarily needs 7 TB of disk space.
Since rhio is currently in an experimental phase, this shouldn't be a real-world issue right now.
In the future we want to do all of this inside MinIO / S3 and skip loading the file onto the filesystem first. This will allow us to keep the resource footprint of the rhio process low (not much storage required).
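A rough sketch of that future direction, under the assumption that the hash can be computed incrementally over a streamed S3/MinIO GET response: the function below works over any async reader, which in practice would be an adapter around the object body returned by the S3 client (the helper name and the stand-in reader are hypothetical):

```rust
use tokio::io::{AsyncRead, AsyncReadExt};

/// Derive the blob hash directly from a streamed object body,
/// so no filesystem staging (and no 7 TB of temporary disk) is needed.
async fn blob_hash_from_stream<R: AsyncRead + Unpin>(
    mut reader: R,
) -> std::io::Result<blake3::Hash> {
    let mut hasher = blake3::Hasher::new();
    let mut buf = vec![0u8; 64 * 1024];
    loop {
        let n = reader.read(&mut buf).await?;
        if n == 0 {
            break; // stream exhausted
        }
        hasher.update(&buf[..n]);
    }
    Ok(hasher.finalize())
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Stand-in reader for the example; in practice this would be the
    // object body streamed from MinIO / S3.
    let body = std::io::Cursor::new(b"example blob".to_vec());
    let hash = blob_hash_from_stream(body).await?;
    println!("blob hash: {}", hash.to_hex());
    Ok(())
}
```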