We could add a `synchronize=False` mode (referring to syncing data to disk) for incremental writes that defers writing headers until a write occurs with `synchronize=True` or the file is finalized. For cases like writing to a temporary file before uploading to S3, writing headers can always be deferred. This could also improve performance whenever constant synchronization isn't needed, since it guarantees a single write for the entire header section and (potentially) reduces how often encryption information in the headers (like MACs) has to be edited in place.
From @Eta0 in https://github.com/coreweave/tensorizer/pull/127#pullrequestreview-2133569874
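
To make the idea concrete, here is a rough sketch of what deferred header writes could look like. It is not tensorizer's actual API: `DeferredHeaderWriter`, `header_region_size`, `header_writes`, and the fixed-size header region at the start of the file are all assumptions made up for illustration, and the real serializer's layout and encryption handling are more involved.

```python
import io
import os


class DeferredHeaderWriter:
    """Hypothetical sketch of a synchronize=False mode (not tensorizer's API)."""

    def __init__(self, file, header_region_size):
        self._file = file
        self._header_region_size = header_region_size  # assumed fixed-size region
        self._headers = []             # all header bytes so far, kept in memory
        self._headers_len = 0
        self._dirty = False            # True if headers changed since last flush
        self._data_offset = header_region_size  # data starts after the headers
        self.header_writes = 0         # counts writes to the header region

    def write(self, header, data, synchronize=True):
        if self._headers_len + len(header) > self._header_region_size:
            raise ValueError("header region is full")
        self._headers.append(header)
        self._headers_len += len(header)
        self._dirty = True

        # Tensor data is always written immediately.
        self._file.seek(self._data_offset)
        self._file.write(data)
        self._data_offset += len(data)

        # Headers only reach the file when synchronization is requested.
        if synchronize:
            self._flush_headers()

    def finalize(self):
        # Finalizing always brings the headers up to date.
        self._flush_headers()

    def _flush_headers(self):
        if not self._dirty:
            return
        # One contiguous write covers the whole header section.
        self._file.seek(0)
        self._file.write(b"".join(self._headers))
        self.header_writes += 1
        self._dirty = False
        self._file.flush()
        try:
            os.fsync(self._file.fileno())
        except (AttributeError, io.UnsupportedOperation):
            pass  # in-memory buffers have no file descriptor to sync
```

With this shape, a caller staging a file for upload could pass `synchronize=False` on every write and call `finalize()` once at the end, so the header section is written exactly once. Any MACs stored in the headers would then only need to be computed against their final contents, rather than patched after each write.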