Closed phadej closed 1 year ago
These are somewhat artificial. Type abstractions in typed PLC block evaluation, so we erase abstractions to delay
and instantiations to force
. Builtins can have type arguments, so the corresponding type instantiations get turned into force
s. We thought the simplest and most consistent approach was just to maintain these, although strictly speaking they're redundant.
Checking some more, E.g. sndPair
has always two force
s (for both of pair types), chooseList
has also two, If each force
is a byte, there's plenty of bytes we could save.
The primitives don't appear that much in the grand scheme of things. If we were concerned about this then the number of force
/delay
s introduced from erasing type abstractions/instantiations throughout the whole program is going to dwarf that. But I don't know of a good way to get rid of them en masse (we do the obvious force/delay
cancellation already).
(Also term tags are actually only 4 bits, not a whole byte.)
I run a quick test (rewrite Force b@Builtin{} -> b
) on use-cases Uniswap contract:
expected: (11462,11001)
but got: (11011,10777)
The counts are (node size, serialized size of term using DeBruijn names), that's 4% reduction in node count and 2% serialized size. I think that is significant in the grand scheme of things.
Thanks, that's a useful datapoint!
Let me elaborate a bit more on why this is tricky.
Consider the following program (in pseudo-PIR):
let usesTrace :: (forall a. BuiltinString -> a -> a) -> Integer -> Integer
usesTrace tracer i = tracer "I saw it" i
evilTrace :: forall a . BuiltinString -> a -> a
evilTrace s v = error "I'm evil"
i1 = usesTrace trace 0
i2 = usesTrace evilTrace 1
in i1 + i2
The usesTrace
function receives as its argument a polymorphic function which it instantiates. Builtin functions are just values, so we can pass a polymorphic builtin function as its argument. Therefore whatever usesTrace
does has to work for both "normal" functions and builtins. The first thing that it will do is to instantiate the function, which will correspond to a force
when erased.
So at the very least it must be tolerable to force builtin functions. At the moment we require you to force exactly where you would otherwise pass a type. Perhaps we could not require you to do any forcing. The only problematic case would be a nullary builtin which had only type arguments and immediately failed (e.g. error :: forall a . a
as a builtin) - it's not clear what that would mean. We could just ban them.
I have a feeling we've been through this discussion before, @kwxm do you remember anything else?
I don't agree, it is not tricky:
You can do rewrite in two steps:
Let me write (builtin b)
for polymorphic builtin, and (builtin! b)
for one which is absorbed force arguments (i.e. cannot be forced).
Your example would be rewritten as:
usesTrace (builtin trace) 0 -> -- replace with delayed versions
usesTrace (delay (builtin! trace)) 0 -- note: builtin!
And indeed, in the compiled PLC there is
[ [ usesTrace (builtin trace) ] (con integer 0) ]
That grows in size, but that doesn't actually happen. GHC rarely passes polymorphic arguments (there should be an explicit rank-N signature, no eta-expansion).
If we rewrite the example as (which is closer to what inferred types would be):
errorExample2 :: Integer
errorExample2 =
let usesTrace :: (BuiltinString -> Integer -> Integer) -> Integer -> Integer
usesTrace tracer i = tracer "I saw it" i
evilTrace :: forall a . BuiltinString -> a -> a
evilTrace s v = traceError "I'm evil"
i1 = usesTrace trace 0
i2 = usesTrace evilTrace 1
in i1 + i2
The line above becomes
[ [ usesTrace (force (builtin trace)) ] (con integer 0) ]
which would simplify:
usesTrace (force (builtin trace)) 0 -> -- replace with delayed versions
usesTrace (force (delay (builtin! trace))) 0 -> -- forceDelay
usesTrace (builtin! trace) 0 -- profit!
The polymorphic program has size 54, simple one 53, and latter could become smaller.
EDIT: What does (force (builtin trace))
evaluate to? Or is it stuck until there is enough arguments to execute the builtin?
That grows in size, but that doesn't actually happen. GHC rarely passes polymorphic arguments (have to an explicit rank-N signature).
Yes, you're certainly right about that, but even if it's a corner case we thought it was important for builtins to behave exactly the same as polymorphic functions. That way you don't have to worry about the corner cases.
You can do rewrite in two steps...
Nice idea. This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.
If we did that and banned nullary builtins, I think this could work.
What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?
It's stuck.
This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.
I think that removing the head type-arguments is already an improvement. If it's possible to shuffle the type arguments to the front, that would be even better, but do you have such builtins? (The specification doesn't list their types, nor haddocks, list their types, that would be helpful).
It looks like that types of builtins are generated by toBuiltinMeaning
?
I think GHC pushes all foralls to the front in inferred types. That trick is cool but feels dangerous, I'd prefer having types of builtins checked (against explicitly written ones) rather than inferred from their Haskell-meaning.
EDIT: if (U)PLC is a compilation target the types or "call convention" (in case of UPLC) is an important information to have in the spec.
What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?
It's stuck.
This is unfortunate. I liked the "builtins have to fully saturated" design more (when there isn't such problem). It's easier to show that semantics are preserved then. (My proposal would been completely trivial then).
We had a long argument about saturated vs unsaturated builtins. In the end we opted for unsaturated - perhaps a mistake, but here we are.
You can do rewrite in two steps...
Nice idea. This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.
If we did that and banned nullary builtins, I think this could work.
We've been able to gather some interest on this specific topic and Oleg's proposal. @michaelpj @phadej Should we add it to #4174 and attempt it in the upcoming weeks?
This is an intrusive proposal as it changes the UPLC. Everything listed in #4174 doesn't seem to require changes, even in typed PLC, as they can happen on PIR-level, which makes them a lot more realistic to get done :)
I opened this issue "early" because I expect it take long to fix, if it will ever.
I know, but it is well-scoped and elegant. We also suspect the improvements would be non-trivial, and our transpilers would take fewer force
s every day of the week!
Yes, this would be a change to the semantics of the language, and might well require a new language version and changes to the spec (which we're in the process of redoing already). It's a much more complicated change, and I don't think it will be that significant in the grand scheme of things. It's something we can hopefully roll in easily enough when we do do another version.
Okay, I'll just add this to #4174 for tracking for now.
@phadej
If it's possible to shuffle the type arguments to the front, that would be even better, but do you have such builtins?
We don't.
(The specification doesn't list their types, nor haddocks, list their types, that would be helpful).
See the golden files. Generating a haddock comment using doctest
or whatever would be quite helpful, I agree.
It looks like that types of builtins are generated by
toBuiltinMeaning
?
Within toBuiltinMeaning
, yes.
I think GHC pushes all foralls to the front in inferred types. That trick is cool but feels dangerous, I'd prefer having types of builtins checked (against explicitly written ones) rather than inferred from their Haskell-meaning.
We don't rely on GHC pushing the forall
s to the front (nor would it be possible at all, I think). Using an inference machinery and not checking what it infers would be very dangerous, I agree, but we have golden tests for that purpose. A golden file is created automatically whenever a new builtin is added, so we do control what PLC types get inferred for builtins.
EDIT: if (U)PLC is a compilation target the types or "call convention" (in case of UPLC) is an important information to have in the spec.
(The specification doesn't list their types, nor haddocks, list their types, that would be helpful).
Sorry, I just noticed this after it was quoted in a previous comment.
The plc
and uplc
commands now have a print-builtin-signatures
option that will print out the types of all the builtins it knows about in a reasonably readable form:
$ cabal exec uplc print-builtin-signatures
addInteger : [ integer, integer ] -> integer
subtractInteger : [ integer, integer ] -> integer
...
encodeUtf8 : [ string ] -> bytestring
decodeUtf8 : [ bytestring ] -> string
ifThenElse : [ forall a, bool, a, a ] -> a
chooseUnit : [ forall a, unit, a ] -> a
trace : [ forall a, string, a ] -> a
fstPair : [ forall a, forall b, pair(a, b) ] -> a
sndPair : [ forall a, forall b, pair(a, b) ] -> b
chooseList : [ forall a, forall b, list(a), b, b ] -> b
mkCons : [ forall a, a, list(a) ] -> list(a)
...
That still doesn't quite tell the full story because sometimes forall
means "for all built-in types" (as in mkCons
) and sometimes it means "for any term at all" (as in ifThenElse
). We're working on a new version of the specification which will explain all this in detail: that should be available in a couple of weeks.
@kwxm that's great.
There's now a CIP for this discussion, hence I'm closing the ticket. Feel free to reopen if you disagree.
by looking at some contracts' plutus core, there are some of
(force mkCons)
,(force trace)
,(force ifThenElse)
for example. I guess in these forcings are cheap to store and execute, but they aren't entirely free.EDIT: I think I haven't seen these combinators used unforced.