Closed cryptoquick closed 1 year ago
We could use a bitmask matrix to indicate which encoding steps a storage format applies, with one bit per step. This would allow skipping certain steps if desired. Then, formats can be referred to by their bitmask. Carbonado 0 means no compression, encryption, stream verification, or error correction. Carbonado 15 would be all of them. If we used a full byte in a magic number header, the upper four bits would also leave room for future formats. Using a varint instead would futureproof this even more.
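A minimal sketch of how such a bitmask could look in Rust. The bit assignment (encryption as the lowest bit, then compression, verifiability, error correction) is an assumption taken from the column order of the table in this issue, and the `Format` type and its methods are hypothetical names used only for illustration.

```rust
// Assumed bit assignment: one bit per encoding step, in the table's column order.
const ENCRYPTION: u8 = 1 << 0;
const COMPRESSION: u8 = 1 << 1;
const VERIFIABILITY: u8 = 1 << 2;
const ERROR_CORRECTION: u8 = 1 << 3;

/// Hypothetical wrapper around the format byte, e.g. `Format(15)` for c15.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Format(u8);

impl Format {
    /// Returns true if every bit of `step` is set in this format.
    fn has(self, step: u8) -> bool {
        self.0 & step == step
    }

    /// Short name in the style used in this issue, such as "c7".
    fn name(self) -> String {
        format!("c{}", self.0)
    }
}

fn main() {
    // Encryption plus compression gives 1 | 2 = 3.
    let private_archive = Format(ENCRYPTION | COMPRESSION);
    println!("{}", private_archive.name());
    println!("verifiable: {}", private_archive.has(VERIFIABILITY));
    println!("error corrected: {}", private_archive.has(ERROR_CORRECTION));
}
```

A decoder could read the format byte from the header and call `has` to decide which steps to skip.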
Format | Encryption | Compression | Verifiability | Error correction | Use-cases |
---|---|---|---|---|---|
c0 | | | | | Marks a file as scanned by Carbonado |
c1 | :white_check_mark: | | | | Encrypted incompressible throwaway append-only data streams such as CCTV footage |
c2 | | :white_check_mark: | | | Rotating public logs |
c3 | :white_check_mark: | :white_check_mark: | | | Private archives |
c4 | | | :white_check_mark: | | Unencrypted incompressible data such as NFT/UDA image assets |
c5 | :white_check_mark: | | :white_check_mark: | | Private media backups |
c6 | | :white_check_mark: | :white_check_mark: | | Compiled binaries |
c7 | :white_check_mark: | :white_check_mark: | :white_check_mark: | | Full drive backups |
c8 | | | | :white_check_mark: | ??? |
c9 | :white_check_mark: | | | :white_check_mark: | ??? |
c10 | | :white_check_mark: | | :white_check_mark: | ??? |
c11 | :white_check_mark: | :white_check_mark: | | :white_check_mark: | Encrypted device-local Catalogs |
c12 | | | :white_check_mark: | :white_check_mark: | Publicly-available archival media |
c13 | :white_check_mark: | | :white_check_mark: | :white_check_mark: | Georedundant private media backups |
c14 | | :white_check_mark: | :white_check_mark: | :white_check_mark: | Source code, token genesis |
c15 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | Contract data |
Verifiability is needed to pay others for storing or hosting your files, but it inhibits use-cases for mutable or append-only data other than snapshots, since the hash will change so frequently. Bao encoding does not have a large overhead, about 5% at most.
Any data that is verifiable but also unencrypted is instead signed by the local key. This is good for signed compiled binaries or hosted webpages.
Encoding | Cost | Details |
---|---|---|
Encryption | ~200 bytes | AES-GCM authenticated encryption |
Compression | Variable | -80% for contracts, -20% for code, +~100 bytes if incompressible |
Verifiability | ~5% | Bao encoding |
Error correction | 200% of input size (2x) | 4/8 ZFEC encoding, any 4 of 8 shares recover the data |
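As a rough worked example of the costs above, the sketch below estimates the encoded size for a given format number. It rests on several assumptions: the bit order matches the format table's columns, the steps run in the order compression, encryption, verifiability, error correction, and 4/8 ZFEC doubles the size of its input. The function name `estimate_size` is hypothetical, and the caller supplies the compressed length because the ratio depends on the data.

```rust
/// Rough encoded-size estimate using the approximate figures from the cost table.
/// `format` is the 4-bit mask; `compressed_len` is only used when compression is on.
fn estimate_size(format: u8, input_len: u64, compressed_len: u64) -> u64 {
    let mut len = input_len;
    if format & 0b0010 != 0 {
        // Compression: the ratio varies by input, so the caller provides the result.
        len = compressed_len;
    }
    if format & 0b0001 != 0 {
        // Encryption: roughly 200 bytes of fixed overhead.
        len += 200;
    }
    if format & 0b0100 != 0 {
        // Verifiability: roughly 5% overhead for Bao encoding.
        len += len / 20;
    }
    if format & 0b1000 != 0 {
        // Error correction: 4-of-8 shares, assumed to double the size.
        len *= 2;
    }
    len
}

fn main() {
    // c15 on a 1000-byte contract that compresses to 200 bytes (the -80% case).
    println!("{}", estimate_size(15, 1000, 200));
    // c1 on 1000 bytes of incompressible data.
    println!("{}", estimate_size(1, 1000, 1000));
}
```

These numbers are only ballpark figures; real overhead depends on the libraries and parameters actually used.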
Every format has a magic number and a Carbonado header that includes the information needed for that specific format.
Each of the four formatting steps should be individually configurable, so that it can be enabled or disabled. They should also be built in as conditionally compiled features.
These options should be tracked, perhaps in a 4-bit bitmask generated at compile time. This can then be added to the magic number, and also to the bech32m filename.
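One way the compile-time bitmask could be sketched is with `cfg!` and a `const fn`. The Cargo feature names below (`encryption`, `compression`, `verifiability`, `error-correction`) are hypothetical, as are the `mask` and `supports` helpers, and the bit order is assumed to follow the format table's columns.

```rust
/// Packs four on/off switches into a 4-bit mask (assumed bit order).
const fn mask(encryption: bool, compression: bool, verifiability: bool, error_correction: bool) -> u8 {
    (encryption as u8)
        | ((compression as u8) << 1)
        | ((verifiability as u8) << 2)
        | ((error_correction as u8) << 3)
}

/// Mask of the steps compiled into this build. Feature names are hypothetical.
const COMPILED_FORMAT: u8 = mask(
    cfg!(feature = "encryption"),
    cfg!(feature = "compression"),
    cfg!(feature = "verifiability"),
    cfg!(feature = "error-correction"),
);

/// True if this build includes every step that `format` requires.
const fn supports(format: u8) -> bool {
    (format & !COMPILED_FORMAT) == 0
}

fn main() {
    println!("compiled format: c{}", COMPILED_FORMAT);
    println!("can handle c15: {}", supports(15));
}
```

Because `cfg!` expands to a boolean literal, `COMPILED_FORMAT` is fixed at compile time, so a build missing a feature can reject files whose header asks for it.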
This should also make #5 easier to debug.