onflow / atree

Atree provides scalable arrays and scalable ordered maps.
https://onflow.org
Apache License 2.0

Reduce RAM and persistent storage by deduplicating inlined dict type info #369

Closed fxamacker closed 6 months ago

fxamacker commented 7 months ago

Closes #358. Updates epic #292.

This PR deduplicates Cadence dictionary type info, which reduces memory use and persistent storage.

More specifically, this encodes the extra data section of inlined atree slabs as a two-element array.
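The general idea of deduplicating type info can be sketched as follows. This is only an illustration under stated assumptions, not the encoding implemented in this PR: the contents of the two-element array are not spelled out here, and `typeInfoTable`, `dedupTypeInfo`, and the string type IDs are hypothetical names that are not part of atree's API. The sketch stores each distinct type ID once and replaces every occurrence with a small index into that table.

```go
package main

import "fmt"

// typeInfoTable is a hypothetical helper (not atree's API) that stores each
// distinct type ID once and hands out its index.
type typeInfoTable struct {
	index  map[string]int // type ID -> position in unique
	unique []string       // distinct type IDs in first-seen order
}

func newTypeInfoTable() *typeInfoTable {
	return &typeInfoTable{index: make(map[string]int)}
}

// add returns the index of id, appending id only if it has not been seen.
func (t *typeInfoTable) add(id string) int {
	if i, ok := t.index[id]; ok {
		return i
	}
	i := len(t.unique)
	t.index[id] = i
	t.unique = append(t.unique, id)
	return i
}

// dedupTypeInfo replaces each type ID with an index into a table of unique
// IDs, so a repeated type is stored once instead of once per inlined value.
func dedupTypeInfo(ids []string) (unique []string, refs []int) {
	t := newTypeInfoTable()
	refs = make([]int, 0, len(ids))
	for _, id := range ids {
		refs = append(refs, t.add(id))
	}
	return t.unique, refs
}

func main() {
	// Assumed example input: three inlined dictionaries share one type.
	ids := []string{"{String: Int}", "{String: Int}", "{Address: UFix64}", "{String: Int}"}
	unique, refs := dedupTypeInfo(ids)
	fmt.Println(len(unique), refs)
}
```

In this sketch the saving grows with the number of inlined values that share a type, since each repeat costs one small integer rather than a full copy of the type info.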

TODO (est. 1.5 - 2 days effort total):

- [x] initial deduplication of dict type info
- [x] build, run, and pass migration with validation enabled for all accounts and all payloads from mainnet snapshot
- [x] improve deduplication (1/2 day effort to wrap up + rerun test using mainnet snapshot)

UPDATE (Feb 29, 2024): :stop_sign: Work on this PR is paused to work on an urgent request to create tooling for the Cadence 1.0 migration.

NOTE: This PR is expected to produce small improvements to atree inlining & deduplication results :bar_chart:.


codecov-commenter commented 7 months ago

Codecov Report

Attention: Patch coverage is 62.92135%, with 66 lines in your changes missing coverage. Please review.

Project coverage is 70.38%. Comparing base (2fbf860) to head (35fdb7e). Report is 8 commits behind head on feature/array-map-inlining.

| Files | Patch % | Lines |
|---|---|---|
| typeinfo.go | 61.48% | 43 Missing and 14 partials :warning: |
| array_debug.go | 30.76% | 6 Missing and 3 partials :warning: |
Additional details and impacted files

```diff
@@                      Coverage Diff                       @@
##           feature/array-map-inlining     #369      +/-   ##
==============================================================
+ Coverage                       62.45%   70.38%   +7.92%
==============================================================
  Files                              15       15
  Lines                           10919    12653    +1734
==============================================================
+ Hits                             6820     8906    +2086
+ Misses                           3119     2749     -370
- Partials                          980      998      +18
```


fxamacker commented 7 months ago

I found some improvements to deduplication while migration + validation was running last night.

fxamacker commented 7 months ago

Work on this PR is paused as of this morning so I can meet about, and begin work on, an urgent request to help with the Cadence 1.0 migration.

This PR has about 1/2 day of effort left to improve its deduplication for further reduction of memory and persistent storage.

fxamacker commented 6 months ago

Yesterday, this passed the atree inlining & deduplication migration with validation enabled (for all payloads for all accounts), using the mainnet checkpoint from Jan 19, 2024 as input.