A lot of the above has to be changed to become more efficient/expressive/etc. We'll need both major refactoring and small tweaking. This comment is about major refactorings.
- `TypeScheme` serves both for type checking and evaluation, which means passing around evaluation-irrelevant stuff at runtime, which is silly. So we need to split `TypeScheme` into two parts. This is the highest priority task, because it affects performance and blocks lots of other things, some of which also affect performance. PRs: #4379, #4516
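  A minimal sketch of what such a split could look like (all names and shapes here are illustrative simplifications, not the actual Plutus types): the type-level information stays in a `TypeScheme`-like GADT used only by the type checker, while evaluation only needs a bare arity-like structure with no types attached.

  ```haskell
  {-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

  module TypeSchemeSplitSketch where

  import Data.Kind (Type)

  -- Type-checking-relevant part: tracks the Haskell types of arguments and result.
  data TypeScheme (args :: [Type]) (res :: Type) where
    TypeSchemeResult :: TypeScheme '[] res
    TypeSchemeArrow  :: TypeScheme args res -> TypeScheme (arg ': args) res

  -- Evaluation-relevant part: the machine only needs to know whether to expect
  -- another argument or to produce the result.
  data RuntimeScheme
    = RuntimeSchemeResult
    | RuntimeSchemeArrow RuntimeScheme

  -- Erase the type-level information once, before evaluation starts.
  eraseToRuntime :: TypeScheme args res -> RuntimeScheme
  eraseToRuntime TypeSchemeResult      = RuntimeSchemeResult
  eraseToRuntime (TypeSchemeArrow sch) = RuntimeSchemeArrow (eraseToRuntime sch)
  ```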
- `TypeScheme` is generally inefficient with all those `Proxy`s and dictionaries inside; that needs to be fixed somehow, but a direct attempt at inlining led to nowhere. Once `TypeScheme` is split into two parts in a direct way, we'll need to think how to monomorphize functions stored in dictionaries. High priority, because it affects performance. PRs: #4317, #4397, #4398, #4421
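  For illustration only (generic Haskell, not the builtins code): a class-polymorphic function hidden behind a dictionary is opaque to the optimiser, while a monomorphic or specialised copy lets GHC inline and unbox.

  ```haskell
  module MonomorphiseSketch where

  -- Polymorphic: callers pass a Num dictionary at runtime, so '+' is an
  -- unknown call and nothing gets unboxed.
  sumGeneric :: Num a => [a] -> a
  sumGeneric = foldr (+) 0
  {-# SPECIALISE sumGeneric :: [Integer] -> Integer #-}

  -- Monomorphic: no dictionary, GHC can inline '+' for Integer directly.
  sumInteger :: [Integer] -> Integer
  sumInteger = foldr (+) 0
  ```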
- Making sure that `plutus-core/plutus-core/src/PlutusCore/Default/Builtins.hs` is compiled efficiently is high priority (because so far not much effort went into this direction, but that does not require any major refactoring, so that will be discussed in the next comment). UPD: there doesn't seem to be any way around doing caching, it's just way too superior to trying to inline everything, especially given that caching benefits costing (see #4505) and inlining everything builtins-related would destroy that. UPD 2: in the end #4914 fully achieved what we needed.
- `term`, because that should allow us to implement deep unlifting of values of recursive built-in types (like `Data`) while keeping recursion on the Haskell side rather than the Plutus side. I'd expect that to be a lot faster. Low priority: it affects performance, but potentially requires non-trivial research. PR: #6530
- `Arbitrary` `BuiltinMeaning`s

Minor things, all fairly low priority:
- `ToBinds` or some other type family is worth exploring. Also an erroring instance for ``DefaultUni `Contains` TyVarRep`` would be very handy. PRs: #4345, #4403, #4557, #4648, #4649
- `TypeScheme`s are shared (which is how we got that 30% slowdown when we stopped storing meanings of builtins in an array: `toBuiltinMeaning` creates a ridiculous amount of irrelevant thunks just to use a few of them and throw the others away). Is it fine, can we do better? We'll need to figure that out. UPD: we're doing caching and the thunks are created once per set of builtins due to inlining (clearly visible in the generated Core), so it's all fine.
- `makeKnown` and `readKnown` work in a constrained `m`. Monomorphizing those may help performance-wise. PRs: #4307, #4308, #4536
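  A sketch of the difference (the types below are simplified stand-ins, not the real `makeKnown`/`readKnown` signatures): a function running in any `MonadError m` drags a dictionary through every call, whereas a fixed failure type does not.

  ```haskell
  {-# LANGUAGE FlexibleContexts #-}

  module ConstrainedMSketch where

  import Control.Monad.Except (MonadError, throwError)

  newtype UnliftingError = UnliftingError String

  -- Constrained-polymorphic: each call receives a MonadError dictionary and
  -- GHC cannot optimise through the abstract 'm' without specialisation.
  readBoolPoly :: MonadError UnliftingError m => Integer -> m Bool
  readBoolPoly 0 = pure False
  readBoolPoly 1 = pure True
  readBoolPoly _ = throwError (UnliftingError "not a Bool")

  -- Monomorphized: the failure monad is fixed, so there is no dictionary
  -- passing and the function can be inlined and worker/wrappered freely.
  readBoolMono :: Integer -> Either UnliftingError Bool
  readBoolMono 0 = Right False
  readBoolMono 1 = Right True
  readBoolMono _ = Left (UnliftingError "not a Bool")
  ```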
- `readKnown` unlifts a value of a built-in type and for that it checks that the expected type tag is equal to the actual one. However, instead of taking the expected tag from the global scope, it takes it from the constraint that the default implementation has, and so an obvious optimization does not happen. PR: #4380
- `toConstant`/`fromConstant` calls for a particular `term` for better performance, or do we want to move `term` out of the class head of `KnownTypeIn` like was done (and then reverted, because that caused a slowdown) in #4172? Is it possible to mix these two things? UPD: we gave up on the latter. PRs: #4419, #4481, #4499, #4533
- `Refl` to obtain a proof that the value is of the expected type. Would using `unsafeCoerce` give us any performance boost? Apparently not. PR: #4400
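  Roughly what the two approaches look like, on toy type tags (illustrative only; the real code matches on universe tags): the safe version produces a `Refl` proof, while the experiment replaced it with a boolean check plus `unsafeCoerce`.

  ```haskell
  {-# LANGUAGE GADTs #-}

  module ReflVsUnsafeCoerceSketch where

  import Data.Type.Equality ((:~:) (Refl))
  import Unsafe.Coerce (unsafeCoerce)

  -- Toy type tags standing in for universe tags.
  data Tag a where
    TagInteger :: Tag Integer
    TagBool    :: Tag Bool

  -- Safe route: matching on the tags yields a 'Refl' proof that rewrites types.
  checkTag :: Tag a -> Tag b -> Maybe (a :~: b)
  checkTag TagInteger TagInteger = Just Refl
  checkTag TagBool    TagBool    = Just Refl
  checkTag _          _          = Nothing

  unliftInteger :: Tag a -> a -> Maybe Integer
  unliftInteger tag x = case checkTag tag TagInteger of
    Just Refl -> Just x
    Nothing   -> Nothing

  -- Experimental route: a Bool-returning comparison plus 'unsafeCoerce';
  -- per the PR above this gave no measurable speedup over the 'Refl' version.
  sameAsInteger :: Tag a -> Bool
  sameAsInteger TagInteger = True
  sameAsInteger _          = False

  unliftIntegerUnsafe :: Tag a -> a -> Maybe Integer
  unliftIntegerUnsafe tag x
    | sameAsInteger tag = Just (unsafeCoerce x)
    | otherwise         = Nothing
  ```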
- `$w$cfromInteger` calls. What are those and can we get them inlined? PR: #5062
- `makeKnown` for `Integer`:

  ```
  case ww4_s1bxe `cast` <Co:5> of dt_XiSB
    { DefaultUniInteger ipv_s1apx -> (ValueOf dt_XiSB vx_ikbc) `cast` <Co:14>
    })
  ```

  What is this matching on `DefaultUniInteger` for? UPD: no longer see it. It was probably used in that last `cast`.
- Make `geq` reducible statically in one way or another. PRs: #4462, #4463, #5061
- Split `KnownTypeIn` into two type classes: one for lifting and one for unlifting. This is due to the fact that we have some types values of which can be lifted but not unlifted: `EvaluationResult` and `Emitter`. `SomeConstantOf` was an example of the opposite, but it's gone now. PR: #4420
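  A rough shape of the split (the signatures are simplified stand-ins; the real classes are parameterised differently): lifting gets its own class, and only types that can also be unlifted get the second one.

  ```haskell
  {-# LANGUAGE MultiParamTypeClasses #-}

  module LiftingSplitSketch where

  newtype BuiltinError = BuiltinError String

  -- Lifting only: turning a Haskell value into a machine value. Types like
  -- 'EvaluationResult a' can be lifted (a builtin may return them), but
  -- there is no way to read them back out of a value.
  class MakeKnownIn val a where
    makeKnown :: a -> Either BuiltinError val

  -- Unlifting only: reading a Haskell value back out of a machine value.
  -- Not every liftable type supports this, hence the separate class.
  class ReadKnownIn val a where
    readKnown :: val -> Either BuiltinError a
  ```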
- A test that, for each `fun` (properly constrained of course), checks that `Arbitrary` arguments don't trigger an exception. PRs: #4555, #4576
- `(# #)` and only be lazy there. PRs: #4607, #4778
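  For context (a generic Haskell idiom, not the actual content of those PRs): a computation can be delayed by hiding it behind a `(# #)` argument, so everything else can stay strict and laziness shows up only where the `(# #)` is eventually supplied.

  ```haskell
  {-# LANGUAGE UnboxedTuples #-}

  module DelaySketch where

  -- An explicit thunk: nothing happens until the unboxed unit is supplied.
  type Delayed a = (# #) -> a

  delay :: a -> Delayed a
  delay x = \_ -> x

  force :: Delayed a -> a
  force d = d (# #)
  ```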
- `let !cost :: (# #) -> ...` in `CostingFun/Core.hs` so that the result of the function can be unboxed. UPD: I've tried it and nothing worked, it wouldn't unbox and it would inline instead.
- `EvaluationSuccess` should be made strict. PR: #4512
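  Making the constructor strict is just a bang on the field (a sketch, assuming the usual two-constructor shape of `EvaluationResult`):

  ```haskell
  -- A strict 'EvaluationSuccess': the result is forced when the constructor
  -- is built instead of storing a thunk inside it.
  data EvaluationResult a
    = EvaluationSuccess !a
    | EvaluationFailure
  ```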
- `transformers` are way too lazy for our use case; we need something stricter. PR: #4587
- `lists` and `nofib` do work quite OK for that
- `matchList`
- Do we want to do costing of `equalsData` as we process the two sides ensuring they're equal? In that case whenever we get things that aren't equal, we can return `False` immediately without continuing to emit unnecessary costs. This would slow down the happy case though (twice as much work?), so resolution: won't do.
- Make `headSpine` a class method so that we can ensure well-typedness of the builtin.
- `_BuiltinFailure` is TH-derived and is not inlined. We should probably inline it just for tidier Core (it probably doesn't really affect evaluation time).
- `equalsData`, so that if there's any mismatch we don't keep emitting costs pointlessly.
- Rename `DefaultUni` and `DefaultFun` to `CardanoUni` and `CardanoFun` and pull them out into their own sublibrary with all the specific code that they depend upon, to make it easier to review and maintain Cardano-specific builtins-related code.
- `CostingFun/Core.hs`
- `Maybe` or `MonadPlus` in `Universe.Core`. Also probably worth monomorphizing `tryUniApply` and what it depends upon.
- `minCostStream`: I didn't pay much attention to its performance. There may exist ways of optimizing it, for example should we make `unconsCost` return a `(# CostingInteger, (# (# #) | CostStream #) #)` or use `UnliftedDatatypes` or something? Same about `HeadSpine`.
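  What the unboxed-sum variant of `unconsCost` would look like, on simplified stand-ins for `CostStream`/`CostingInteger` (illustrative only): the boxed version allocates a pair and a `Maybe` per step, while the unboxed one returns everything without intermediate heap allocation.

  ```haskell
  {-# LANGUAGE MagicHash, UnboxedSums, UnboxedTuples #-}

  module CostStreamSketch where

  -- Simplified stand-ins for the real types.
  type CostingInteger = Int

  data CostStream
    = CostLast !CostingInteger
    | CostCons !CostingInteger CostStream

  -- Boxed uncons: allocates a pair and a 'Maybe' on every step.
  unconsCost :: CostStream -> (CostingInteger, Maybe CostStream)
  unconsCost (CostLast c)      = (c, Nothing)
  unconsCost (CostCons c rest) = (c, Just rest)

  -- Unboxed variant: the head cost and the optional tail come back without
  -- allocation, with '(# #)' marking the "no more costs" alternative.
  unconsCost# :: CostStream -> (# CostingInteger, (# (# #) | CostStream #) #)
  unconsCost# (CostLast c)      = (# c, (# (# #) | #) #)
  unconsCost# (CostCons c rest) = (# c, (# | rest #) #)
  ```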
- `flattenCostRose`: do we want this function to emit the next cost in O(1) worst case? Is it best to define it by concatenating forests in an accumulator or via CPS (all the code is at the link) performance-wise? We probably need some kind of benchmark to answer these questions.
- `BuiltinRuntime` lazily explicitly, to make the `NoThunks` accurate (currently it's not). PR: #5806
- Add `Proxy` to the universe, so that it's possible to provide builtins like `nil` (currently impossible) or `pair` (currently subverts the type checker). PR: #4337
- `Forall`s in the type signatures of builtins, so that we don't spend a lot of time elaborating them and also to make things clearer.
- Don't allow nullary builtins, so that we don't have to check whether we've reached `RuntimeSchemeResult` yet every time a builtin is forced/applied. Plus it's good to reflect restrictions in the types anyway, otherwise somebody might add a nullary builtin which the CEK machine (and god knows what else) doesn't support. PR: #4616 (not beneficial for performance, but we should still do it).
- `forall a. Integer -> Array a -> a` looks perfectly definable to me even with `a` not being a built-in type (assuming arrays are in the AST and not built-in).

The comment about major refactorings does not include two big things:
It's probably too early to think about the latter right now, given how much the whole builtins machinery is going to change (if everything works out), so I'll focus on the former.
The inline docs in the source code are fairly good; they could use some polishing, but overall they seem pretty comprehensive. What is missing, however, is a high-level overview of the whole builtins machinery that one can read before diving into the specifics of each of its parts. We do have one such file, `Builtins.md`, but it was written long ago and it doesn't talk about polymorphic built-in types, builtin meaning inference etc.
Another thing we need is a doc (and presentation) on how to add a new built-in type or function. After all the recent refactorings it's completely trivial to add a new built-in type (regardless of whether it's monomorphic or polymorphic): it's basically just warnings/errors-driven copy-paste. Adding a new built-in function is trickier, however, because the case of a polymorphic function over a polymorphic built-in type is so different from the other cases (improving upon the status quo is one of the major tasks).
PRs:
Performance improvements in builtins-related PRs (each number is the mean performance change for `validation` benchmarks):
This issue is for dumping all the plans regarding builtins in a largely unstructured way.