ChunkDataPack Malleability

https://github.com/onflow/flow-go/blob/22daf5cdccbe3dc4af7e239495d54ed4c9d304f0/model/flow/chunk.go#L94-L104

The current ChunkDataPack implementation uses the ID() method to return the ChunkID field as the unique identifier: https://github.com/onflow/flow-go/blob/22daf5cdccbe3dc4af7e239495d54ed4c9d304f0/model/flow/chunk.go#L123-L127

This approach assumes that ChunkID is sufficient to uniquely identify a ChunkDataPack. While this method is straightforward, it introduces potential malleability concerns because it does not consider the integrity of the other fields within ChunkDataPack. Specifically, fields like:

StartState,
Proof,
Collection,
ExecutionDataRoot,

are excluded from the identifier computation. This omission means that two ChunkDataPack instances with the same ChunkID but differing values for these fields would produce the same ID, so the ID does not commit to the full content of the ChunkDataPack.
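
The problem can be reproduced with a minimal sketch. The types below are simplified, hypothetical stand-ins for the real flow-go types (only a subset of fields is kept), and sameID is a helper introduced purely for illustration:

```go
package main

import "fmt"

// Identifier is a simplified stand-in for flow.Identifier (assumed 32 bytes).
type Identifier [32]byte

// ChunkDataPack is a reduced, hypothetical version of the real struct.
type ChunkDataPack struct {
	ChunkID    Identifier
	StartState [32]byte
	Proof      []byte
}

// ID mirrors the behavior described above: only ChunkID is returned.
func (c *ChunkDataPack) ID() Identifier {
	return c.ChunkID
}

// sameID reports whether two packs are indistinguishable by their ID.
func sameID(a, b *ChunkDataPack) bool {
	return a.ID() == b.ID()
}

func main() {
	honest := &ChunkDataPack{ChunkID: Identifier{1}, Proof: []byte("valid proof")}
	tampered := &ChunkDataPack{ChunkID: Identifier{1}, Proof: []byte("forged proof")}

	// The differing Proof is invisible to ID().
	fmt.Println(sameID(honest, tampered)) // true
}
```

Any code that deduplicates or looks up packs by ID() would treat the two values above as the same entity.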

Proposed Solution

Key Changes

Update ID(): To address these concerns, the ID() method will be updated to compute the identifier from all fields of the ChunkDataPack struct using the MakeID() function:

func (c *ChunkDataPack) ID() Identifier {
    body := struct {
        ChunkID           Identifier
        StartState        StateCommitment
        Proof             StorageProof
        Collection        Identifier
        ExecutionDataRoot BlockExecutionDataRoot
    }{
        ChunkID:           c.ChunkID,
        StartState:        c.StartState,
        Proof:             c.Proof,
        Collection:        c.Collection.ID(),
        ExecutionDataRoot: c.ExecutionDataRoot,
    }
    return MakeID(body)
}

By encoding all fields, this approach guarantees that the resulting ID is malleability-resistant. Changes to any field of ChunkDataPack will produce a different Identifier.
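
As a rough, self-contained illustration of this property, the sketch below hashes every field of a simplified pack. It does not use the real MakeID: SHA-256 over a length-prefixed concatenation stands in for the RLP encoding and hash that flow-go uses, and the field types, CollectionID, and appendField are assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Identifier is a simplified stand-in for flow.Identifier.
type Identifier [32]byte

// ChunkDataPack is a hypothetical, flattened version of the real struct.
type ChunkDataPack struct {
	ChunkID           Identifier
	StartState        [32]byte
	Proof             []byte
	CollectionID      Identifier // stands in for c.Collection.ID()
	ExecutionDataRoot []byte     // stands in for the nested struct
}

// appendField appends a 4-byte big-endian length followed by the field bytes,
// so that field boundaries are unambiguous in the hashed input.
func appendField(buf, field []byte) []byte {
	var n [4]byte
	binary.BigEndian.PutUint32(n[:], uint32(len(field)))
	buf = append(buf, n[:]...)
	return append(buf, field...)
}

// ID hashes all fields (stand-in for MakeID over the full body struct).
func (c *ChunkDataPack) ID() Identifier {
	var buf []byte
	buf = appendField(buf, c.ChunkID[:])
	buf = appendField(buf, c.StartState[:])
	buf = appendField(buf, c.Proof)
	buf = appendField(buf, c.CollectionID[:])
	buf = appendField(buf, c.ExecutionDataRoot)
	return Identifier(sha256.Sum256(buf))
}

func main() {
	a := ChunkDataPack{ChunkID: Identifier{1}, Proof: []byte("valid proof")}
	b := a
	b.Proof = []byte("forged proof")

	// Same ChunkID, different Proof: the IDs now differ.
	fmt.Println(a.ID() == b.ID()) // false
}
```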

All fields will be encoded in a way that guarantees consistency:

Primitive and byte array fields: ChunkID, StartState, and Proof are encoded directly as RLP byte strings.
Custom structs: Collection is represented by its ID(), which first needs to be updated for the Collection struct in #6721. ExecutionDataRoot is a nested struct and will be encoded recursively as an RLP list of its public fields, capturing the entire structure consistently (see the RLP encoding rules).

Remove unused functions: During the revision process, the function FromChunkID was identified as unused and redundant, and it will be removed. The Checksum() function will also be removed.

Definition of Done

The ChunkDataPack.ID() method has been updated using MakeID. The identifier computation has been verified to use RLP encoding or Fingerprint for all fields.
The unused FromChunkID and Checksum() functions have been removed.
Unit tests have been updated to validate the new behavior, ensuring identifiers change as expected when data is modified.
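
One possible shape for such a check is a table of per-field mutations, each of which must change the identifier. The sketch below is an assumption about how this could be structured, not the project's actual test code: unprotected, mutation, pack, oldID, and newID are all hypothetical names, and pack is a deliberately tiny stand-in for ChunkDataPack.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Identifier is a simplified stand-in for flow.Identifier.
type Identifier [32]byte

// mutation is a named change applied to a copy of the base value.
type mutation[T any] struct {
	name  string
	apply func(T) T
}

// unprotected returns the names of mutations that do NOT change the ID,
// i.e. the fields that could be altered without affecting the identifier.
func unprotected[T any](base T, id func(T) Identifier, muts []mutation[T]) []string {
	var names []string
	want := id(base)
	for _, m := range muts {
		if id(m.apply(base)) == want {
			names = append(names, m.name)
		}
	}
	return names
}

// pack is a tiny, hypothetical stand-in for ChunkDataPack.
type pack struct {
	ChunkID Identifier
	Proof   string
}

// oldID mirrors the ChunkID-only behavior.
func oldID(p pack) Identifier { return p.ChunkID }

// newID hashes both fields (SHA-256 stands in for MakeID).
func newID(p pack) Identifier {
	return Identifier(sha256.Sum256(append(p.ChunkID[:], p.Proof...)))
}

func main() {
	base := pack{ChunkID: Identifier{1}, Proof: "p"}
	muts := []mutation[pack]{
		{"ChunkID", func(p pack) pack { p.ChunkID[0]++; return p }},
		{"Proof", func(p pack) pack { p.Proof += "x"; return p }},
	}
	fmt.Println(unprotected(base, oldID, muts)) // [Proof]
	fmt.Println(unprotected(base, newID, muts)) // []
}
```

In a real test, one entry per ChunkDataPack field would be listed, and the test would fail if unprotected returned any name.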
Documentation and comments have been updated to reflect the changes and clarify the purpose of the ID() method.

onflow / flow-go

[Malleability] ChunkDataPack #6720
