endojs / endo

Endo is a distributed secure JavaScript sandbox, based on SES
Apache License 2.0

Need schema-like compression to avoid storing and transmitting redundant data. #2112

Open erights opened 6 months ago

erights commented 6 months ago

Describe the bug

We are spending way too much storage space on boilerplate data that is redundant with, and implied by, the patterns that the data must match. This bug records the need for compression/decompression support, as would be closed by https://github.com/endojs/endo/pull/1584 . Separate bugs like https://github.com/Agoric/agoric-sdk/issues/3167 record the need to actually use such compression to reduce storage or transmission, as would be aided by https://github.com/Agoric/agoric-sdk/pull/6432 .

(From https://github.com/Agoric/agoric-sdk/pull/6432#issuecomment-1524472247 ):

For example, without compression, the Zoe proposal

    {
      want: {
        Winnings: {
          brand: moolaBrand,
          value: makeCopyBagFromElements([
            { foo: 'a' },
            { foo: 'b' },
            { foo: 'c' },
          ]),
        },
      },
      give: { Bid: { brand, value: 37n } },
      exit: { afterDeadline: { deadline: 11n, timer } },
    },

is stored with a smallcaps body of

    '#{"exit":{"afterDeadline":{"deadline":"+11","timer":"$0.Alleged: timer"}},"give":{"Bid":{"brand":"$1.Alleged: simoleans","value":"+37"}},"want":{"Winnings":{"brand":"$2.Alleged: moola","value":{"#tag":"copyBag","payload":[[{"foo":"c"},"+1"],[{"foo":"b"},"+1"],[{"foo":"a"},"+1"]]}}}}'

But it compresses with the proposalShape

    harden({
      want: {
        Winnings: {
          brand: moolaBrand,
          value: M.bagOf(harden({ foo: M.string() }), 1n),
        },
      },
      give: { Bid: { brand, value: M.nat() } },
      exit: { afterDeadline: { deadline: M.gte(10n), timer } },
    })

to

    [[['c'], ['b'], ['a']], 37n, 11n]

whose smallcaps body is

    '#[[["c"],["b"],["a"]],"+37","+11"]'

which is 12% as long as the uncompressed body.
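To make the idea concrete, here is a minimal toy sketch of pattern-driven compression. It is not the implementation in https://github.com/endojs/endo/pull/1584 and does not use the real `M` matchers: `T`, `hole`, `keysOf`, `compress`, and `decompress` are hypothetical names invented for illustration, plain strings stand in for the brand and timer remotables, and bags are left out. The key order (reverse-sorted) is an assumption chosen only so the output resembles the example above. The last lines just measure the two smallcaps bodies quoted in this issue.

```javascript
// Toy matchers: hypothetical stand-ins for M.nat(), M.gte(), and friends.
// A "hole" marks a position whose value the pattern does not imply.
const hole = check => ({ isHole: true, check });
const T = {
  nat: () => hole(x => typeof x === 'bigint' && x >= 0n),
  gte: bound => hole(x => typeof x === 'bigint' && x >= bound),
};

const isRecord = x =>
  typeof x === 'object' && x !== null && !Array.isArray(x);

// Reverse-sorted keys: an arbitrary but deterministic traversal order
// (an assumption of this sketch, not a statement about the real library).
const keysOf = record => Object.keys(record).sort().reverse();

// Walk pattern and specimen in lockstep, keeping only the hole values.
const compress = (pattern, specimen, out = []) => {
  if (isRecord(pattern) && pattern.isHole) {
    if (!pattern.check(specimen)) throw TypeError('specimen does not match');
    out.push(specimen);
  } else if (isRecord(pattern)) {
    if (!isRecord(specimen)) throw TypeError('expected a record');
    const keys = keysOf(pattern);
    if (keys.join() !== keysOf(specimen).join()) {
      throw TypeError('record keys differ from pattern');
    }
    for (const key of keys) compress(pattern[key], specimen[key], out);
  } else if (pattern !== specimen) {
    throw TypeError('literal differs from pattern');
  }
  return out;
};

// Rebuild the full value by refilling holes in the same traversal order.
const decompress = (pattern, compressed) => {
  let next = 0;
  const fill = p => {
    if (isRecord(p) && p.isHole) {
      const value = compressed[next];
      next += 1;
      if (!p.check(value)) throw TypeError('compressed value does not match');
      return value;
    }
    if (isRecord(p)) {
      return Object.fromEntries(keysOf(p).map(key => [key, fill(p[key])]));
    }
    return p; // literal: fully implied by the pattern
  };
  return fill(pattern);
};

// Simplified shape and proposal; strings stand in for remotables.
const shape = {
  give: { Bid: { brand: 'simoleans', value: T.nat() } },
  exit: { afterDeadline: { deadline: T.gte(10n), timer: 'timer' } },
};
const proposal = {
  give: { Bid: { brand: 'simoleans', value: 37n } },
  exit: { afterDeadline: { deadline: 11n, timer: 'timer' } },
};

const compressed = compress(shape, proposal);
const restored = decompress(shape, compressed);
console.log(compressed); // only the two hole values: 37n and 11n

// Size comparison of the two smallcaps bodies quoted in this issue.
const fullBody =
  '#{"exit":{"afterDeadline":{"deadline":"+11","timer":"$0.Alleged: timer"}},"give":{"Bid":{"brand":"$1.Alleged: simoleans","value":"+37"}},"want":{"Winnings":{"brand":"$2.Alleged: moola","value":{"#tag":"copyBag","payload":[[{"foo":"c"},"+1"],[{"foo":"b"},"+1"],[{"foo":"a"},"+1"]]}}}}';
const compressedBody = '#[[["c"],["b"],["a"]],"+37","+11"]';
const percent = Math.round((compressedBody.length / fullBody.length) * 100);
console.log(`${percent}% as long`);
```

Both sides need the same pattern for this to round-trip, which is why the savings apply to data that is already required to match a known shape.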