onflow / atree

Atree provides scalable arrays and scalable ordered maps.
https://onflow.org
Apache License 2.0

Deduplicate data when encoding nested Cadence composites to reduce memory and storage #347

Closed: fxamacker closed this issue 9 months ago

fxamacker commented 9 months ago

Updates #292 https://github.com/onflow/flow-go/issues/1744 https://github.com/onflow/cadence/issues/1854

When encoding nested Cadence composites, redundant data can be encoded. For example, the same Cadence composite type and its field names are encoded more than once when that composite appears more than once within the same slab (aka payload, register, segment).

Suggested Solution

Deduplicate data (such as field names and types) within the same segment when encoding nested composites.
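To illustrate the idea, here is a minimal sketch in Go of per-slab deduplication. It is not Atree's actual encoding format: `compositeType`, `slabTypeTable`, and `encodeSlab` are hypothetical names made up for this example. The sketch assumes each slab keeps one table of distinct composite types (type ID plus field names) and that each element stores only an index into that table.

```go
package main

import (
	"fmt"
	"strings"
)

// compositeType is a hypothetical stand-in for the type information
// (type ID and field names) of a Cadence composite.
type compositeType struct {
	TypeID     string
	FieldNames []string
}

// key builds a lookup key from the type ID and field names.
// Assumption: names never contain the NUL separator used here.
func (t compositeType) key() string {
	return t.TypeID + "\x00" + strings.Join(t.FieldNames, "\x00")
}

// slabTypeTable holds each distinct composite type once per slab.
type slabTypeTable struct {
	index map[string]int
	types []compositeType
}

func newSlabTypeTable() *slabTypeTable {
	return &slabTypeTable{index: make(map[string]int)}
}

// ref returns the table index for t, adding t the first time it is seen.
func (tb *slabTypeTable) ref(t compositeType) int {
	k := t.key()
	if i, ok := tb.index[k]; ok {
		return i
	}
	i := len(tb.types)
	tb.index[k] = i
	tb.types = append(tb.types, t)
	return i
}

// encodeSlab returns the deduplicated type table for one slab and,
// for each element, the index of its type in that table.
// A new table is created per call, so deduplication is scoped to the slab.
func encodeSlab(elems []compositeType) ([]compositeType, []int) {
	tb := newSlabTypeTable()
	refs := make([]int, len(elems))
	for i, e := range elems {
		refs[i] = tb.ref(e)
	}
	return tb.types, refs
}

func main() {
	vault := compositeType{TypeID: "A.01.Vault", FieldNames: []string{"balance", "uuid"}}
	nft := compositeType{TypeID: "A.01.NFT", FieldNames: []string{"id"}}

	types, refs := encodeSlab([]compositeType{vault, nft, vault, vault})
	fmt.Println(len(types), refs) // prints "2 [0 1 0 0]"
}
```

In this sketch the `Vault` type and its field names are stored once even though three elements use them; the other two occurrences cost only a small integer reference.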

Include data deduplication in PR #342, since the redesign of Atree to support inlining segments in PRs #342 and #345 already requires changing the encoding format.

Implementing this work within PR #342, alongside Atree inlining, avoids redundant work and saves time.

What is Outside Scope

This data deduplication is limited to the scope of a slab. It is intentionally not global deduplication. See https://github.com/onflow/cadence/issues/1854 for other aspects.

fxamacker commented 9 months ago

Closed by #342 #345