Version identifier for encrypted messages

Rationale

When the underlying encryption in shlib/shcrypto changes, we currently have no way to detect that during decryption, other than retrying with a different version. To facilitate future crypto updates, we should introduce a crypto version identifier (VI) that is serialized with the encrypted messages.

Requirements

1) An encrypted message with VI should be detectable as such.
2) An encrypted message with VI should be easy to distinguish from one without it.
3) The VI should add minimal size overhead.
4) The VI should add minimal parsing overhead.
5) Serialization/deserialization of the VI should add no extra external dependencies to the software.

Discussion

Using one byte for the VI allows for 256 version values, which should be future-proof and satisfies 3).

Serialization

Messages of unknown origin are mostly expected to arrive over the wire in serialized form, so deserialization deserves special consideration.

Using a version prefix seems to be the best general approach to achieve 1) and 2). Since EncryptedMessage is padded to a multiple of the block size, we can generally satisfy 2) by length, at least in the happy path (assume a prefix, parse it, and find a complete and valid message in the remainder). This does, however, add some overhead when a message that does not fit the new format has to be parsed a second time.
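
The length check described above could look roughly like the following sketch. It assumes a block size of 32 bytes and that an unprefixed serialized message is always a whole number of blocks; both are assumptions for illustration, and `hasVersionPrefix` is a hypothetical helper, not part of shlib/shcrypto.

```go
package main

import "fmt"

// blockSize is an assumed value for illustration only.
const blockSize = 32

// hasVersionPrefix reports whether the serialized message is exactly one byte
// longer than a whole number of blocks, which is what a one-byte,
// non-delimited prefix would produce under the assumptions above.
func hasVersionPrefix(d []byte) bool {
	return len(d)%blockSize == 1
}

func main() {
	legacy := make([]byte, 4*blockSize)     // unprefixed message
	prefixed := make([]byte, 4*blockSize+1) // one extra byte for the VI
	fmt.Println(hasVersionPrefix(legacy), hasVersionPrefix(prefixed))
}
```

A length match is only a hint: the remainder still has to parse as a complete and valid message.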

One option to signify the VI in serialized messages could be to use a different encoding for the version byte, e.g. hexrot, where every hex character is mapped to a rotated ASCII character outside the hex alphabet, e.g. 0=>k, 1=>l, 2=>m, 3=>n, 4=>o, 5=>p, 6=>q, 7=>r, 8=>s, 9=>t, a=>u, b=>v, c=>w, d=>x, e=>y, f=>z. This has the benefit of identifying the VI by looking at the first character of a serialized message (has_version_identifier = msg[0] in "klmnopqrstuvwxyz"). This would avoid the overhead of retrying deserialization when the length does not match. However, it contradicts requirements 4) and 5), since it adds overhead for rotating the first byte and requires implementing a non-standard encoding.
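
To make the hexrot idea concrete, here is a minimal sketch. It assumes a collision-free variant in which the nibble value n maps to the n-th letter starting at k (so 0..f become k..z); the function names are hypothetical and only meant for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// hexrotAlphabet maps nibble value n to hexrotAlphabet[n].
// None of these letters occur in regular hex output.
const hexrotAlphabet = "klmnopqrstuvwxyz"

// hexrotEncode encodes one version byte as two rotated characters.
func hexrotEncode(v byte) string {
	return string([]byte{hexrotAlphabet[v>>4], hexrotAlphabet[v&0x0f]})
}

// hasVersionIdentifier only inspects the first character of the message.
func hasVersionIdentifier(msg string) bool {
	return len(msg) > 0 && msg[0] >= 'k' && msg[0] <= 'z'
}

// hexrotDecode reverses hexrotEncode for the first two characters of s.
func hexrotDecode(s string) (byte, error) {
	if len(s) < 2 {
		return 0, errors.New("hexrot: need two characters")
	}
	var v byte
	for i := 0; i < 2; i++ {
		c := s[i]
		if c < 'k' || c > 'z' {
			return 0, fmt.Errorf("hexrot: invalid character %q", c)
		}
		v = v<<4 | (c - 'k')
	}
	return v, nil
}

func main() {
	prefix := hexrotEncode(0x01)
	fmt.Println(prefix) // prints "kl"
	fmt.Println(hasVersionIdentifier(prefix+"0a1b"), hasVersionIdentifier("0a1b"))
}
```

Even in this small form, the extra encode/decode step and the custom alphabet show why this option conflicts with 4) and 5).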

Another option is to add a delimiter, such as :, which costs some extra size (conflicting with 3)), but is otherwise a good solution for 1), 2), 4) and 5).
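
For comparison, a delimiter-based format could be parsed roughly as follows. The sketch assumes a hex-encoded wire format of the shape `<two hex chars>:<hex payload>`; that shape and the helper `splitVersioned` are assumptions for illustration, not something this proposal specifies.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// splitVersioned separates a hypothetical "<version>:<payload>" message.
// Messages without a delimiter are returned unchanged and flagged as unversioned.
func splitVersioned(msg string) (byte, string, bool, error) {
	prefix, rest, found := strings.Cut(msg, ":")
	if !found {
		return 0, msg, false, nil
	}
	raw, err := hex.DecodeString(prefix)
	if err != nil || len(raw) != 1 {
		return 0, "", false, fmt.Errorf("malformed version prefix %q", prefix)
	}
	return raw[0], rest, true, nil
}

func main() {
	fmt.Println(splitVersioned("01:abcd"))
	fmt.Println(splitVersioned("abcd"))
}
```

Detection needs only standard library calls, but every message carries the extra delimiter byte.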

However, distinguishing unprefixed messages is only a transitional concern, so we can give more weight to 3), 4) and 5) and rely on padding/length for 1) and 2).

Implementation

The version identifier needs to be available for serialization and deserialization, so coming versions of shlib/shcrypto must specify it.

The version identifier needs to be used during deserialization to choose the correct version of shlib/shcrypto. For applications that support multiple crypto versions, we should assume that multiple dependencies are bundled and that the application chooses the correct decryption function after extracting the VI. ~~Therefore it seems best handle the VI deserialization outside the crypto library.~~ shlib/shcrypto should allow parsing the VI in a future-proof way, so that version detection can be done by the "primary bundled version". Decryption should fail early if the prefix does not match the specified version.
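
An application that bundles several crypto versions might dispatch on the VI roughly like this. The registry, the `decryptFunc` signature and the stub functions are hypothetical; real code would call the decryption entry points of the bundled shlib/shcrypto versions instead.

```go
package main

import (
	"errors"
	"fmt"
)

// decryptFunc is a hypothetical stand-in for the decryption entry point
// of one bundled crypto version. It receives the full serialized message.
type decryptFunc func(d []byte) ([]byte, error)

// identifyVersion mirrors the proposed IdentifyVersion: it reads the first byte.
func identifyVersion(d []byte) (byte, error) {
	if len(d) == 0 {
		return 0, errors.New("empty message")
	}
	return d[0], nil
}

// decryptWith picks the decryption function registered for the message's VI.
func decryptWith(registry map[byte]decryptFunc, d []byte) ([]byte, error) {
	v, err := identifyVersion(d)
	if err != nil {
		return nil, err
	}
	decrypt, ok := registry[v]
	if !ok {
		return nil, fmt.Errorf("unsupported version identifier 0x%02x", v)
	}
	return decrypt(d)
}

func main() {
	// Stub decryptors that only strip the prefix, for demonstration.
	registry := map[byte]decryptFunc{
		0x01: func(d []byte) ([]byte, error) { return d[1:], nil },
		0x02: func(d []byte) ([]byte, error) { return d[1:], nil },
	}
	out, err := decryptWith(registry, []byte{0x02, 'h', 'i'})
	fmt.Println(string(out), err)
	_, err = decryptWith(registry, []byte{0x07, 'h', 'i'})
	fmt.Println(err)
}
```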

shlib/shcrypto should prepend the VI to the serialized message during (or after) marshalling. ~~Alternatively, the application could add the version identifier in the enclosing scope. However, this would lead to more fragmentation of the implementations and is therefore the inferior solution~~. The internal representation of EncryptedMessage does not need to add the VI.

Specification

shlib/shcrypto adds a one-byte VERSION_IDENTIFIER, which will start at 0x01 for a re-release of the current version 0.1.16 as shlib/shcrypto@v0.1.17.

Further releases that change the cryptographic implementation affecting EncryptedMessage or its marshalling will increase the VERSION_IDENTIFIER by 1.

Previous "legacy" versions can be considered to have an implicit VERSION_IDENTIFIER of 0x00.

The version identifier is a non-delimited prefix to the serialized message format.

shlib/shcrypto needs to add the following new functionality:

//// shlib/shcrypto/version.go
// VersionIdentifier is used to prefix encrypted messages when sent over the wire.
const VersionIdentifier byte = ...

//// shlib/shcrypto/encoding.go

// IdentifyVersion reads the version identifier byte from the given (marshalled) EncryptedMessage.
func IdentifyVersion(d []byte) byte {...}

// Marshal serializes the EncryptedMessage object. It panics if C1 is nil. The first byte is `VersionIdentifier`.
func (m *EncryptedMessage) Marshal() []byte {
...
    buff := bytes.Buffer{}
    buff.WriteByte(VersionIdentifier)
    buff.Write(m.C1.Marshal())
...
}

// Unmarshal deserializes an EncryptedMessage from the given byte slice. If the slice is empty
// or the first byte does not match `VersionIdentifier`, it returns an error.
func (m *EncryptedMessage) Unmarshal(d []byte) error {
    // Guard against indexing into an empty slice.
    if len(d) == 0 {
        return fmt.Errorf("empty message: missing version identifier")
    }
    if d[0] != VersionIdentifier {
        return fmt.Errorf("version mismatch: got %d need %d", d[0], VersionIdentifier)
    }
...
}
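
The intended round trip can be tried out with a self-contained sketch. `toyMessage` is a hypothetical stand-in for EncryptedMessage with an opaque body, and the constant uses 0x01, the value this proposal names for the first versioned release; the real type and its marshalling will differ.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// versionIdentifier uses the value the proposal gives for the first versioned release.
const versionIdentifier byte = 0x01

// toyMessage is a hypothetical stand-in for EncryptedMessage.
type toyMessage struct{ body []byte }

// marshal writes the version identifier first, then the body.
func (m *toyMessage) marshal() []byte {
	buff := bytes.Buffer{}
	buff.WriteByte(versionIdentifier)
	buff.Write(m.body)
	return buff.Bytes()
}

// unmarshal fails early on an empty input or a mismatching prefix.
func (m *toyMessage) unmarshal(d []byte) error {
	if len(d) == 0 {
		return errors.New("empty message: missing version identifier")
	}
	if d[0] != versionIdentifier {
		return fmt.Errorf("version mismatch: got %d need %d", d[0], versionIdentifier)
	}
	m.body = append([]byte(nil), d[1:]...) // copy so the caller's slice is not aliased
	return nil
}

func main() {
	wire := (&toyMessage{body: []byte("ciphertext")}).marshal()
	var decoded toyMessage
	fmt.Println(decoded.unmarshal(wire), string(decoded.body))

	wire[0] = 0x02 // simulate a message from a newer crypto version
	fmt.Println(decoded.unmarshal(wire))
}
```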

Follow-up issues to create

[ ] use VersionIdentifier in json tests
[ ] use VersionIdentifier in shutter-network/rolling-shutter
