coreweave / tensorizer

Module, Model, and Tensor Serialization/Deserialization
MIT License

[3.0] Protect against chunk reordering and truncation in ChunkedEncryption authentication #158

Open bchess opened 3 months ago

bchess commented 3 months ago

Leaving this as a placeholder for @Eta0 to write an accurate description. Copying from https://github.com/coreweave/tensorizer/pull/127#pullrequestreview-2133569874:

Tweak ChunkedEncryption slightly in a way that is incompatible with older deserializers, so I'd like to add it in the new data version 5 alongside the changes in this PR

Eta0 commented 3 months ago

This is referring to resolving a limitation currently listed in docs/encryption.md:

This level of encryption [ . . . ] does not protect against reordering or truncation of chunks.

We can protect against reordering and truncation of chunks by creating a hash list complete with a top hash. We currently use a hash list that is missing a top hash, so individual chunks of a tensor are verifiable, but their count and positions are not.
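To illustrate the idea in general terms, here is a minimal sketch of a hash list with a top hash. It is not tensorizer's actual format or API: the function names (`chunk_digests`, `top_hash`, `verify`), the use of SHA-256 and HMAC, and the 8-byte chunk-count prefix are all assumptions made for illustration. The point is only that a keyed top hash over the ordered per-chunk digests binds both their count and their positions.

```python
import hashlib
import hmac


def chunk_digests(chunks):
    """Hash each chunk independently (the plain hash list)."""
    return [hashlib.sha256(chunk).digest() for chunk in chunks]


def top_hash(key, digests):
    """Keyed hash over the chunk count followed by every digest in order.

    Hypothetical construction: changing the number of digests or their
    order changes the result.
    """
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(len(digests).to_bytes(8, "little"))
    for digest in digests:
        mac.update(digest)
    return mac.digest()


def verify(key, chunks, expected_top):
    """Recompute the top hash from the received chunks and compare."""
    actual = top_hash(key, chunk_digests(chunks))
    return hmac.compare_digest(actual, expected_top)


key = b"k" * 32  # placeholder key for the example only
chunks = [b"chunk-0", b"chunk-1", b"chunk-2"]
expected = top_hash(key, chunk_digests(chunks))

ok = verify(key, chunks, expected)
reordered_ok = verify(key, [chunks[1], chunks[0], chunks[2]], expected)
truncated_ok = verify(key, chunks[:2], expected)
```

In this sketch, each chunk would still pass an individual per-chunk check after being swapped or dropped, but `reordered_ok` and `truncated_ok` are both false because the top hash no longer matches.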

A good title for this issue would be "Protect against chunk reordering and truncation in ChunkedEncryption authentication."