Open szg251 opened 1 year ago
Plutus scripts include data decoders for the data types they use, adding to script sizes. Generally, we can reduce sizes by changing all product type representations:
data Foo = Foo
  { x :: Integer
  , y :: BuiltinByteString
  }
The default serialisation (by PlutusTx) is Constr 0 [x, y], while the better way is to just serialise it as [x, y].
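To make the difference concrete, here is a minimal sketch of the two encodings. It assumes a simplified stand-in for Plutus's Data type (built from base and bytestring only, so it runs without PlutusTx); the names fooToConstr, fooToList, fooFromConstr, fooFromList, and exampleFoo are hypothetical and not part of any library.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS

-- Simplified stand-in for Plutus's Data type (an assumption for this
-- sketch; real scripts would use BuiltinData from PlutusTx).
data Data
  = Constr Integer [Data]
  | List [Data]
  | I Integer
  | B ByteString
  deriving (Eq, Show)

data Foo = Foo
  { x :: Integer
  , y :: ByteString
  }
  deriving (Eq, Show)

-- Default-style encoding: a constructor tag plus the fields.
fooToConstr :: Foo -> Data
fooToConstr (Foo a b) = Constr 0 [I a, B b]

-- Tagless encoding: just the fields, in order.
fooToList :: Foo -> Data
fooToList (Foo a b) = List [I a, B b]

-- Decoder for the default-style encoding: has to match on the tag too.
fooFromConstr :: Data -> Maybe Foo
fooFromConstr (Constr 0 [I a, B b]) = Just (Foo a b)
fooFromConstr _ = Nothing

-- Decoder for the tagless encoding: no tag to store or match on.
fooFromList :: Data -> Maybe Foo
fooFromList (List [I a, B b]) = Just (Foo a b)
fooFromList _ = Nothing

-- Hypothetical sample value for trying the functions out.
exampleFoo :: Foo
exampleFoo = Foo 42 (BS.pack [0xde, 0xad])
```

Both decoders accept only their own encoding, so switching representation changes what is valid on the wire and has to be applied consistently to every producer and consumer of the data.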
With this generally applicable optimisation only, we can reduce the sizes by a considerable margin:
fromBuiltinData: OK
Target: generated; size 828
Measured: handwritten; size 797
Remaining headroom: 31
Script size changes (optimised internal data types only):
Size
Core
mkMintingPolicy (FUEL): OK
Size: 1039
mkMintingPolicy (FUEL) serialized: OK
Remaining headroom: 30
mkMintingPolicy (MerkleRoot): OK
Remaining headroom: 42
mkMintingPolicy (MerkleRoot) serialized: OK
Remaining headroom: 84
mkCommitteeCandidateValidator: OK
Size: 201
mkCommitteeCandidateValidator (serialized): OK
Remaining headroom: 21
mkCandidatePermissionMintingPolicy: OK
Size: 147
mkCandidatePermissionMintingPolicy (serialized): OK
Remaining headroom: 49
mkCommitteeHashPolicy: OK
Size: 400
mkCommitteeHashPolicy (serialized): OK
Size: 2853
mkUpdateCommitteeHashValidator: OK
Remaining headroom: 31
mkUpdateCommitteeHashValidator (serialized): OK (0.01s)
Remaining headroom: 100
mkCheckpointValidator: OK
Remaining headroom: 62
mkCheckpointValidator (serialized): OK
Remaining headroom: 128
mkCheckpointPolicy: OK
Size: 400
mkCheckpointPolicy (serialized): OK
Size: 2853
Distributed set
mkInsertValidator: OK
Remaining headroom: 29
mkInsertValidator (serialized): OK
Remaining headroom: 40
mkDsConfPolicy: OK
Size: 457
mkDsConfPolicy (serialized): OK
Size: 2884
mkDsKeyPolicy: OK
Size: 1228
mkDsKeyPolicy (serialized): OK
Remaining headroom: 40
Original comment from: @kozross
However, keeping in mind both current and future needs (readability, maintenance, stability), there are a few ways we can roll out these improvements. I'll list them below, along with my thoughts.
Option 1: This is essentially what is currently on my branch. This involves some pretty repetitive, low-level and frankly unidiomatic (even by Plutus standards) code. While I can certainly explain how to do this kind of work (and it's pretty mechanical), it's definitely not fun, or readable.
Pros of this approach: it's about as explicit as it gets (everything's right there).
Cons of this approach: it's not great for readability (we'd need a writeup explaining this and the decisions around it), it's a pain to maintain (same reason) and if we ever decide it needs changing or there are more improvements to be had, we have to fix every single instance.
I don't recommend this approach.
Option 2: Essentially, this involves writing makeIsDataProduct or something similar, which effectively generates the same code we'd get with Option 1. We'd have control over this derivation, and while writing it is a pain, it's a pain we have to experience once. Furthermore, unless you deeply care about this, it's not something you have to understand if you just want Data instances. Lastly, because we're in control, a Plutus update can't pull the rug out from under our feet.
Pros of this approach: no worse than what we do currently, we control it for stability, optimization can be done in one place instead of every instance.
Cons of this approach: TH is a royal pain to write and maintain.
I'm a cautious fan of this approach.
Option 3: Essentially, this would involve writing functions like this for all product arities we have (up to 6 at the moment):
{-# INLINE productToData2 #-}
productToData2 :: forall (a :: Type) (b :: Type) . (ToData a, ToData b) => a -> b -> BuiltinData
{-# INLINE productFromData2 #-}
productFromData2 :: forall (a :: Type) (b :: Type) (c :: Type) . (FromData a, FromData b) => BuiltinData -> (a -> b -> Maybe c) -> Maybe c
{-# INLINE productUnsafeFromData2 #-}
productUnsafeFromData2 :: forall (a :: Type) (b :: Type) (c :: Type) . (UnsafeFromData a, UnsafeFromData b) => BuiltinData -> (a -> b -> c) -> c
Then, we would define instances for our types like so:
data Foo (a :: Type) = Foo Integer a

instance (ToData a) => ToData (Foo a) where
  {-# INLINEABLE toBuiltinData #-}
  toBuiltinData (Foo x y) = productToData2 x y

instance (FromData a) => FromData (Foo a) where
  {-# INLINEABLE fromBuiltinData #-}
  fromBuiltinData dat = productFromData2 dat (\x y -> Just (Foo x y))

instance (UnsafeFromData a) => UnsafeFromData (Foo a) where
  {-# INLINEABLE unsafeFromBuiltinData #-}
  unsafeFromBuiltinData dat = productUnsafeFromData2 dat Foo
Pros of this approach: no TH as in Option 2, no awful soup as in Option 1, fairly explicit, not too much maintenance (only change a fixed number of functions, not every instance), the easiest to implement out of the three.
Cons of this approach: in theory, this should all inline away, but in practice, we can't be sure until we try; still fairly repetitive.
I'd be OK with this.
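The signatures above come without bodies, so here is one way the arity-2 helpers could be filled in. This is a sketch under stated assumptions, not the code on the branch: BuiltinData is replaced by a simplified plain Data ADT, and the ToData, FromData, and UnsafeFromData classes are re-declared locally so the example runs with base alone. Real on-chain code would work against BuiltinData through PlutusTx builtins rather than pattern matching on an ADT.

```haskell
-- Simplified stand-ins for Plutus's Data type and the PlutusTx classes
-- (assumptions for this sketch, not the real library definitions).
data Data
  = Constr Integer [Data]
  | List [Data]
  | I Integer
  deriving (Eq, Show)

class ToData a where
  toBuiltinData :: a -> Data

class FromData a where
  fromBuiltinData :: Data -> Maybe a

class UnsafeFromData a where
  unsafeFromBuiltinData :: Data -> a

instance ToData Integer where
  toBuiltinData = I

instance FromData Integer where
  fromBuiltinData (I i) = Just i
  fromBuiltinData _ = Nothing

instance UnsafeFromData Integer where
  unsafeFromBuiltinData (I i) = i
  unsafeFromBuiltinData _ = error "expected an integer"

-- Encode two fields as a bare two-element list, with no constructor tag.
productToData2 :: (ToData a, ToData b) => a -> b -> Data
productToData2 a b = List [toBuiltinData a, toBuiltinData b]

-- Decode a two-element list and hand both fields to a continuation.
productFromData2 ::
  (FromData a, FromData b) => Data -> (a -> b -> Maybe c) -> Maybe c
productFromData2 (List [da, db]) k = do
  a <- fromBuiltinData da
  b <- fromBuiltinData db
  k a b
productFromData2 _ _ = Nothing

-- Same shape, but errors on malformed input instead of returning Nothing.
productUnsafeFromData2 ::
  (UnsafeFromData a, UnsafeFromData b) => Data -> (a -> b -> c) -> c
productUnsafeFromData2 (List [da, db]) k =
  k (unsafeFromBuiltinData da) (unsafeFromBuiltinData db)
productUnsafeFromData2 _ _ = error "expected a two-element list"

data Foo a = Foo Integer a
  deriving (Eq, Show)

instance (ToData a) => ToData (Foo a) where
  toBuiltinData (Foo n v) = productToData2 n v

instance (FromData a) => FromData (Foo a) where
  fromBuiltinData dat = productFromData2 dat (\n v -> Just (Foo n v))

instance (UnsafeFromData a) => UnsafeFromData (Foo a) where
  unsafeFromBuiltinData dat = productUnsafeFromData2 dat Foo
```

The continuation argument lets a single helper serve every two-field type, and it leaves room for a constructor that validates its fields and returns Nothing. Whether the helpers really inline away on-chain still has to be measured, as noted above.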
Issue by: kozross
Original date: 2023-06-14 21:15:43 UTC
Originally opened as: mlabs-haskell/trustless-sidechain/issues/484
Original assignees: kozross
Status on 2023-06-20: open
Description
Follow-on from input-output-hk/trustless-sidechain#426. Currently, we're seeing significant size blowouts when comparing scripts measured as CompiledCode versus their serialized forms. This could be due to the 'bundling' of Data deserialization code: we frequently use autogenerated instances, which are suboptimal in some cases, many of which we encounter. For example, product types are always encoded as Constr, even though we end up carrying around a tag which we never need, but still have to store and match on. Furthermore, instead of re-using fromBuiltinData in UnsafeFromData, the generated code duplicates this functionality, causing much more duplication than necessary.
This somewhat supersedes input-output-hk/trustless-sidechain#477 and encompasses some parts of input-output-hk/trustless-sidechain#480.
Goals
Data-related instances improves serialized script size
Tests
As these instances are now manually written, some additional tests should be written. These should verify the following all hold:
fromBuiltinData . toBuiltinData = Just
unsafeFromBuiltinData . toBuiltinData = id
It would also be good to include some tests that 'bad' encodings fail to deserialize, but these are type-specific and may not always be practical. QuickCheck is appropriate for such tests.
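The two laws and the 'bad encoding' checks can be written as plain Bool-valued properties. The sketch below assumes a simplified stand-in for Plutus's Data type and a hypothetical two-field type Pair with a hand-written tagless encoding; none of these names come from the codebase. The properties have shapes QuickCheck accepts, so given an Arbitrary instance for Pair they could be passed to quickCheck directly.

```haskell
-- Simplified stand-in for Plutus's Data type (an assumption for this sketch).
data Data
  = Constr Integer [Data]
  | List [Data]
  | I Integer
  deriving (Eq, Show)

-- Hypothetical two-field product type with a hand-written tagless encoding.
data Pair = Pair Integer Integer
  deriving (Eq, Show)

toBuiltinData :: Pair -> Data
toBuiltinData (Pair a b) = List [I a, I b]

fromBuiltinData :: Data -> Maybe Pair
fromBuiltinData (List [I a, I b]) = Just (Pair a b)
fromBuiltinData _ = Nothing

unsafeFromBuiltinData :: Data -> Pair
unsafeFromBuiltinData (List [I a, I b]) = Pair a b
unsafeFromBuiltinData _ = error "unsafeFromBuiltinData: malformed Pair"

-- fromBuiltinData . toBuiltinData = Just
propFromTo :: Pair -> Bool
propFromTo p = fromBuiltinData (toBuiltinData p) == Just p

-- unsafeFromBuiltinData . toBuiltinData = id
propUnsafeFromTo :: Pair -> Bool
propUnsafeFromTo p = unsafeFromBuiltinData (toBuiltinData p) == p

-- A 'bad' encoding: the old Constr-tagged form must be rejected.
propRejectsConstr :: Integer -> Integer -> Bool
propRejectsConstr a b = fromBuiltinData (Constr 0 [I a, I b]) == Nothing

-- Another 'bad' encoding: lists of the wrong length must be rejected.
propRejectsWrongArity :: [Integer] -> Bool
propRejectsWrongArity ns =
  length ns == 2 || fromBuiltinData (List (map I ns)) == Nothing
```

The rejection properties are the type-specific part: each type needs its own notion of a malformed encoding, which is why they may not always be practical to write.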
Measurements of just the Data-related methods would probably be good to have also.
Related issues/PRs