This approach assumes that ChunkID is sufficient to uniquely identify a ChunkDataPack. While this method is straightforward, it introduces potential malleability concerns because it does not consider the integrity of the other fields within ChunkDataPack.
Specifically, fields such as the following are excluded from the identifier computation:
StartState
Proof
Collection
ExecutionDataRoot
As a result, two ChunkDataPack instances with the same ChunkID but different values for these fields produce the same ID, even though they are not the same data pack.
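The collision can be illustrated with a minimal sketch. The types below are simplified stand-ins and not the actual flow-go definitions: Identifier is assumed to be a 32-byte array, and only two of the excluded fields are modeled.

```go
package main

import "fmt"

// Identifier is a simplified stand-in for flow.Identifier (assumed to be 32 bytes).
type Identifier [32]byte

// ChunkDataPack is a reduced, hypothetical model with only three fields.
type ChunkDataPack struct {
	ChunkID    Identifier
	StartState []byte
	Proof      []byte
}

// ID mirrors the behavior described above: it returns ChunkID
// and ignores every other field.
func (c *ChunkDataPack) ID() Identifier {
	return c.ChunkID
}

func main() {
	a := ChunkDataPack{ChunkID: Identifier{1}, StartState: []byte("state-a"), Proof: []byte("proof-a")}
	b := ChunkDataPack{ChunkID: Identifier{1}, StartState: []byte("state-b"), Proof: []byte("proof-b")}

	// The two packs carry different data but share one identifier.
	fmt.Println(a.ID() == b.ID()) // true
}
```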
Proposed Solution
Key Changes
Update ID():
To address these concerns, the ID() method will be updated to compute the identifier from the entire ChunkDataPack struct using the MakeID() function.
By encoding all fields, this approach guarantees that the resulting ID is malleability-resistant. Changes to any field of ChunkDataPack will produce a different Identifier.
All fields will be encoded in a way that guarantees consistency:
Primitive and byte array fields: fields like ChunkID, StartState, and Proof are encoded directly as raw bytes using RLP.
Custom structs: for Collection, the ID() method of the Collection struct needs to be updated first in #6721. The ExecutionDataRoot struct will be encoded recursively as an RLP list of its public fields, capturing the entire structure consistently (following the RLP encoding rules).
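The encoding rules above can be sketched with a hand-written RLP encoder for byte strings and lists. This is an illustration under several assumptions, not the flow-go implementation: the struct layouts are hypothetical, Collection is left out because its encoding depends on #6721, and SHA-256 from the standard library stands in for whatever hash MakeID() uses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Identifier is a simplified stand-in for flow.Identifier (assumed to be 32 bytes).
type Identifier [32]byte

// ExecutionDataRoot is a hypothetical nested struct with two public fields.
type ExecutionDataRoot struct {
	BlockID               Identifier
	ChunkExecutionDataIDs [][]byte
}

// ChunkDataPack is a reduced, hypothetical model; Collection is omitted.
type ChunkDataPack struct {
	ChunkID           Identifier
	StartState        []byte
	Proof             []byte
	ExecutionDataRoot ExecutionDataRoot
}

// rlpHeader builds the RLP prefix for a payload of n bytes.
// offset is 0x80 for byte strings and 0xc0 for lists.
func rlpHeader(n int, offset byte) []byte {
	if n <= 55 {
		return []byte{offset + byte(n)}
	}
	// Long form: the big-endian length follows a prefix of offset+55+len(length).
	var lenBytes []byte
	for v := n; v > 0; v >>= 8 {
		lenBytes = append([]byte{byte(v)}, lenBytes...)
	}
	return append([]byte{offset + 55 + byte(len(lenBytes))}, lenBytes...)
}

// rlpBytes encodes a byte string; a single byte below 0x80 encodes as itself.
func rlpBytes(b []byte) []byte {
	if len(b) == 1 && b[0] < 0x80 {
		return []byte{b[0]}
	}
	return append(rlpHeader(len(b), 0x80), b...)
}

// rlpList wraps already-encoded items in an RLP list.
func rlpList(items ...[]byte) []byte {
	var payload []byte
	for _, it := range items {
		payload = append(payload, it...)
	}
	return append(rlpHeader(len(payload), 0xc0), payload...)
}

// ID hashes the RLP encoding of every modeled field, so a change
// to any of them changes the identifier.
func (c *ChunkDataPack) ID() Identifier {
	var chunkIDs [][]byte
	for _, id := range c.ExecutionDataRoot.ChunkExecutionDataIDs {
		chunkIDs = append(chunkIDs, rlpBytes(id))
	}
	// The nested struct becomes a list of its public fields.
	root := rlpList(rlpBytes(c.ExecutionDataRoot.BlockID[:]), rlpList(chunkIDs...))
	enc := rlpList(rlpBytes(c.ChunkID[:]), rlpBytes(c.StartState), rlpBytes(c.Proof), root)
	return Identifier(sha256.Sum256(enc))
}

func main() {
	a := ChunkDataPack{ChunkID: Identifier{1}, StartState: []byte("state"), Proof: []byte("proof-a")}
	b := a
	b.Proof = []byte("proof-b")
	fmt.Println(a.ID() == b.ID()) // false
}
```

Because every field is length-prefixed and the nested struct is wrapped in its own list, moving bytes from one field to a neighboring field also changes the encoding, not only changing a field's contents.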
Remove unused functions:
During the revision, the function FromChunkID was found to be unused and redundant. It will be removed.
The Checksum() function will also be removed.
Definition of Done
The ChunkDataPack.ID() method has been updated using MakeID.
The identifier computation has been verified to use RLP encoding or Fingerprint for all fields.
The unused FromChunkID and Checksum() functions have been removed.
Unit tests have been updated to validate the new behavior, ensuring identifiers change as expected when data is modified.
Documentation and comments have been updated to reflect the changes and clarify the purpose of the ID() method.
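A check of the kind described in the unit-test item could look like the following sketch. It is written as a plain program and not against the flow-go test suite; the struct is a reduced, hypothetical model, and the ID() method is a stand-in that hashes length-prefixed fields with SHA-256 where the real method would call MakeID().

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Identifier is a simplified stand-in for flow.Identifier (assumed to be 32 bytes).
type Identifier [32]byte

// ChunkDataPack is a reduced, hypothetical model with only three fields.
type ChunkDataPack struct {
	ChunkID    Identifier
	StartState []byte
	Proof      []byte
}

// ID is a stand-in for the proposed method: it hashes every field,
// each preceded by its length so that field boundaries are unambiguous.
func (c ChunkDataPack) ID() Identifier {
	h := sha256.New()
	for _, field := range [][]byte{c.ChunkID[:], c.StartState, c.Proof} {
		var n [8]byte
		binary.BigEndian.PutUint64(n[:], uint64(len(field)))
		h.Write(n[:])
		h.Write(field)
	}
	var id Identifier
	copy(id[:], h.Sum(nil))
	return id
}

// unchangedFields mutates one field at a time and returns the names of
// fields whose mutation did NOT change the ID. An empty result is a pass.
func unchangedFields() []string {
	base := ChunkDataPack{ChunkID: Identifier{1}, StartState: []byte{1}, Proof: []byte{1}}
	mutations := []struct {
		name   string
		mutate func(*ChunkDataPack)
	}{
		{"ChunkID", func(c *ChunkDataPack) { c.ChunkID[0] ^= 0xff }},
		{"StartState", func(c *ChunkDataPack) { c.StartState = []byte{2} }},
		{"Proof", func(c *ChunkDataPack) { c.Proof = []byte{2} }},
	}
	var failed []string
	for _, m := range mutations {
		modified := base // struct copy; each mutation replaces a slice, so base is untouched
		m.mutate(&modified)
		if modified.ID() == base.ID() {
			failed = append(failed, m.name)
		}
	}
	return failed
}

func main() {
	fmt.Println(len(unchangedFields())) // 0
}
```

Driving the check from a table of mutations means that a field added to the struct later needs only one more table entry to be covered.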
ChunkDataPack Malleability
https://github.com/onflow/flow-go/blob/22daf5cdccbe3dc4af7e239495d54ed4c9d304f0/model/flow/chunk.go#L94-L104
The current ChunkDataPack implementation uses the ID() method to return the ChunkID field as the unique identifier: https://github.com/onflow/flow-go/blob/22daf5cdccbe3dc4af7e239495d54ed4c9d304f0/model/flow/chunk.go#L123-L127