Closed effectfully closed 3 years ago
It seems we need the following builtin:
convert : all a b. a -> maybe b
See this for details.
But we do not allow parametric polymorphism in dynamic builtins right now and adding it is not that straightforward. Here are the problems:
Constant application must not throw away types, which it does currently (because we thought we'd never need type-dependent computations). It's not a big deal to keep types; however, I've been willing to try a completely different constant application machinery since PlutusFest, where I realized that there probably exists a way to get the benefits of saturated builtins without their issues (which I've never described, because I didn't want to bring this huge and not super important topic up). So I'd rather completely rewrite the constant application machinery to what I think might be a better version rather than tweak the current version, which I hope is going to vanish.
Let's say we have this dynamic builtin name: monoconst : all a. a -> a -> a
which takes two arguments and returns the first one. How are we going to evaluate monoconst {list bool} nil nil
? Of course, in order to evaluate nil
we need to use the dynamic builtin types machinery, namely the readDynamicBuiltin
function. That's not a problem, we already do that. But how do we know which type we're expecting to get? Looking at the example we see that it's supposed to be the denotation of list bool
, i.e. List Bool
, but how do we program that? We need to go from a PLC type to a Haskell type. Right now we do the opposite via the dynamic builtin types machinery, i.e. go from a Haskell type to a PLC type. But the other direction looks complicated. Not because it requires us going from a weakly typed thing to a strongly typed thing (we do that all the time, see Language.PlutusCore.Constant.Name.withTypedBuiltinName
for example), but because how would you actually realize that list bool
is a list of booleans? The list bool
type in PLC is a rather big AST, do we need to parse ASTs in order to figure out what the hell they actually mean?
But there is another thing we might do: since we have parametricity in the metalanguage, we can just not use the dynamic builtins machinery for parametric things. We can pass the AST directly whenever it's the case that we can't do anything with the value it represents due to parametricity! This means that instead of evaluating the expression from above like this (pseudocode):
Plc.monoconst {list bool} nil nil
~ make (Haskell.monoconst @(List Bool) (read nil) (read nil))
~ make (Haskell.monoconst @(List Bool) [] [])
~ make []
~ nil
we can evaluate it like this:
Plc.monoconst {list bool} nil nil
~ Haskell.monoconst @Term nil nil
~ nil
The trade-off is that previously we always checked at runtime that types don't go wrong (some extra paranoia is always a good thing), while in the above we aren't checking that the type of nil
is list bool
. But maybe we can fix this as well by calling the actual type checker at runtime. Though, I guess we can live without type checking things at runtime for now.
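The pass-the-AST-through idea can be sketched in a few lines of Haskell (the `Term` type here is a toy stand-in for the real Plutus Core AST, and all names are illustrative): a fully parametric builtin like `monoconst : all a. a -> a -> a` cannot inspect its arguments, so its Haskell denotation can be instantiated at `Term` and the ASTs passed through unchanged, with no unlifting at all.

```haskell
-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Term Term
  deriving (Show, Eq)

-- Denotation of @monoconst : all a. a -> a -> a@: by parametricity it
-- cannot do anything with its arguments except return one of them, so
-- passing raw ASTs is safe.
monoconst :: a -> a -> a
monoconst x _ = x

-- Plc.monoconst {list bool} nil nil ~ Haskell.monoconst @Term nil nil ~ nil
main :: IO ()
main = print (monoconst Nil (Cons Nil Nil))
```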
Also we need to handle dynamic builtin names which are not completely parametric, like this one for example:
reverse : all a. list a -> list a
It's supposed to compute like this:
PLC.reverse {bool} (cons true (cons false nil))
~ make (Haskell.reverse @Term (read @(List Term) (cons true (cons false nil))))
~ make (Haskell.reverse @Term [true, false])
~ make [false, true]
~ cons false (cons true nil)
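This partially parametric case can be sketched in Haskell (toy `Term` type; `readSpine` and `makeSpine` are hypothetical helpers): only the list spine is unlifted, the elements stay as opaque `Term`s, and Haskell's `reverse` is instantiated at `Term`.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Term Term | B Bool
  deriving (Show, Eq)

-- Unlift only the spine of a list; elements remain opaque 'Term's.
readSpine :: Term -> Maybe [Term]
readSpine = \case
  Nil       -> Just []
  Cons x xs -> (x :) <$> readSpine xs
  _         -> Nothing

-- Lift a spine back into the AST.
makeSpine :: [Term] -> Term
makeSpine = foldr Cons Nil

-- PLC.reverse {bool} (cons true (cons false nil))
main :: IO ()
main = print (fmap (makeSpine . reverse)
                   (readSpine (Cons (B True) (Cons (B False) Nil))))
```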
I've recently (in #642) removed TypedBuiltinBool as a special case and realized that we can remove all special cases, even integers and bytestrings, like that. This is needed, because right now we can throw an error from a builtin name that normally returns a Bool, but not from one that returns an Integer, which is a silly limitation. The challenge however is in handling sizes, which I don't know how to approach at the moment.
So in the end I feel like it's easier and more sensible to just rewrite the whole dynamic builtin types and constant application machineries. This requires time, but certainly less time than handling all the described issues separately, because they affect big chunks of code and interfere.
Can you easily describe the changes you want to make, or would it be easier to just write it?
It is a bit awkward that the current mechanism for doing evaluation in Haskell doesn't work very well once we start wanting to pull types across.
Maybe this is a horrible idea, but we could have some builtins that don't use the typed builtin mechanism, but rather operate directly on the PLC terms. I think this makes particular sense for convert
, which is after all really a meta-language primitive, in that it's going to invoke the typechecker, which operates on terms anyway.
Can you easily describe the changes you want to make, or would it be easier to just write it?
I'm not yet sure what the changes should be.
It is a bit awkward that the current mechanism for doing evaluation in Haskell doesn't work very well once we start wanting to pull types across.
I really need to try implementing stuff before I can be sure it's awkward. What I described above makes sense to me and I'd be satisfied by a solution that handles those difficulties and doesn't add any weirdness.
Maybe this is a horrible idea, but we could have some builtins that don't use the typed builtin mechanism, but rather operate directly on the PLC terms. I think this makes particular sense for
convert
, which is after all really a meta-language primitive, in that it's going to invoke the typechecker, which operates on terms anyway.
I do not think it's a horrible idea at all and I agree that it makes perfect sense for convert
. However there can be other builtins that require parametric polymorphism, like the reverse
one mentioned above. So if we can make the machinery work once and for all builtins, that would be great. If we can't or it's too heavy/dirty, I'll consider other options, of course.
I'm considering right now changing the whole dynamic builtins story to something simpler.
This is a sub-issue of #487 that is specifically about dynamic builtin types and their future implementation, which is meant to allow us to embed arbitrary Haskell values into the AST of Plutus Core in a well-controlled manner.
This seems obsolete, because in #724 and #733 we managed to implement decoding of values without this heavy feature.
I've been willing to try a completely different constant application machinery since PlutusFest where I realized that there probably exists a way to get the benefits of saturated builtins without their issues
I tried that and it didn't work.
But there is another thing we might do: since we have parametricity in the metalanguage, we can just not use the dynamic builtins machinery for parametric things. We can pass the AST directly whenever it's the case that we can't do anything with the value it represents due to parametricity!
This is my next focus.
I realized that the current way I handle sizes in dynamic builtins and constant applications is outdated. I believe we can do much better than that.
Still believe that, but there are problems (that we discussed by email) and this development stream is suspended, because the cast
thing (previously called convert
) is more important.
This seems obsolete, because in #724 and #733 we managed to implement decoding of values without this heavy feature.
I think this is right. I think we probably still want dynamic builtin types so we can have e.g. efficient strings when we want them. And I still think it would be good to be able to have no static builtins at all.
I think we probably still want dynamic builtin types so we can have e.g. efficient strings when we want them. And I still think it would be good to be able to have no static builtins at all.
Good point, I agree.
Implemented parametric polymorphism in #751 the way it's described above.
An interesting thought has occurred to me. Right now we use some weirdness in order to unlift products and especially sums. And unlifting of lists is not super nice either. What if we add a separate evaluator (and maybe unify it with the CEK machine at some point) that looks roughly like this (pseudocode):
unliftEvaluate :: Term tyname name res () -> EvaluationResult res
(where res
specifies the expected resulting type, hence we need to extend the AST probably as well)? The idea here is that when we unlift a data type, we always know a value of exactly what type we want to get in the end. But there is also one very important thing that we know too: if we apply an Unwrap
ped Plutus Core data type to a bunch of Haskell constructors (wrapped somehow into the Plutus Core AST), then we know that any call to the constructor is a final call and we won't need to compute anything else (it's the same logic as with single pattern matches that allow to unlift recursive data structures, because recursion is implicit in type class resolution). This means that we do not need to lift the result of such an application back to Plutus Core like we do normally, instead we can just return it from the abstract machine. And this does not require any Typeable
nonsense, because when we unlift a data type, all supplied constructors are required to construct the same data type, so we exactly know the resulting type.
This technique would allow us to uniformly and very easily handle unlifting of arbitrary data structures. Need to unlift a Scott-encoded data type? Just apply it to its Haskell constructors and evaluate. The technique is a little bit heavy, but may be worth it.
On the other hand it might be redundant if we find an acceptable way to embed the entire Haskell into Plutus Core, because we then can just add a constructor ExpectedResult
and match on it in order to ensure that the result of a computation has the expected type.
I'm still not sure I understand this suggestion. I think you're saying something like: for a value that is of a datatype, we can not just pattern match on it, we can fold it. And a fold can be type-changing - we can have a different output type, so we can fold into a Haskell value without having to go back into PLC.
But I think this still sticks on the problem that brought us to embedding Haskell into PLC: how do we apply our PLC fold/pattern match to a Haskell constructor? So far we've discussed:
I'm not sure whether you're gesturing at another way of solving this or what?
I think you're saying something like: for a value that is of a datatype, we can not just pattern match on it, we can fold it
No. It's one of the key components: we literally need only to perform a single pattern match, we do not need any folds. You've seen this note, right? We can use the same trick for direct unlifting of data types that doesn't go through a generic representation.
Spoiler: I'm not sure that what I wrote above would work. But let me show how it could work, because in the end I'll implement something similar.
Imagine a PLC list in normal form (I will ignore type instantiations):
cons 1 (cons 2 (cons 3 nil))
in order to convert it to the corresponding Haskell list we need neither folds, nor to embed Haskell into PLC. Here is what we do: we unwrap and apply this term to two dynamic names: $nil
(standing for Haskell's []
) and $cons
(standing for Haskell's (:)
). We do not need to embed the two Haskell constructors into PLC as we can just use dynamic names the way we do it in other places. The embedding is only required if we want to construct a Haskell value inside PLC and manipulate it somehow inside PLC.
Hence
how do we apply our PLC fold/pattern match to a Haskell constructor?
that's not a problem at all, we can use dynamic names. The problem is how to deal with the result that the constructor returns once a constant application is computed. But we don't have this problem.
Okay, so we apply the PLC list to dynamic names standing for Haskell constructors and get something like this:
unwrap (cons 1 (cons 2 (cons 3 nil))) $nil $cons
In a few steps it reduces to
$cons 1 (cons 2 (cons 3 nil))
Note that the first $cons
stands for a meta-level (:)
now.
Then our evaluation engine unlifts 1
and cons 2 (cons 3 nil)
, transforms $cons
into its denotation and applies the denotation to the arguments. I.e. the expression reduces to
1 : [2, 3]
Note that the tail of the list gets unlifted as well even though we performed only a single pattern match! That's due to the nature of the constant application machinery that always unlifts all arguments passed to a builtin (modulo the recently implemented parametric polymorphism).
What I've described so far is how things already work. However instead of just returning [1, 2, 3]
we currently transform that list back to PLC. Again, due to the nature of the constant application machinery that always converts the result of a constant application back to PLC. But we don't want to do that! We just want to return the result. And we know what type it's supposed to have, so we don't need any Typeable
, unsafeCoerce
or anything of this kind.
Hence what I proposed above is to be able to communicate to the evaluator for what builtins the evaluator should not convert the result of a constant application back to PLC and instead should return it as is.
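As an illustration only (toy `Term` type, with the would-be denotations of `$nil` and `$cons` inlined): unlifting a normal-form list where each constructor application produces a Haskell value that is returned as is, never converted back to PLC.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Term Term | I Integer
  deriving (Show, Eq)

-- The crucial bit is the result type: a Haskell list, returned directly.
unliftIntList :: Term -> Maybe [Integer]
unliftIntList = \case
  Nil           -> Just []                      -- denotation of $nil
  Cons (I n) xs -> (n :) <$> unliftIntList xs   -- denotation of $cons
  _             -> Nothing

main :: IO ()
main = print (unliftIntList (Cons (I 1) (Cons (I 2) (Cons (I 3) Nil))))
```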
... which would work for normal-form terms, but with non-normal terms we may actually want to do something with the resulting Haskell list on the PLC side. For example we may have
(\x y -> y) (cons 1 nil) (cons 2 nil)
And in that case we do not want to blindly return [1]
as the firstly computed list, instead we want to actually evaluate the whole thing and get [2]
.
However, the terms that we get there are in fact in WHNF. And it looks like WHNF might be enough to make things work, but I'm not really sure. Need more thinking and I'll just try the idea out at some point, I guess.
Makes sense?
You've seen this note, right?
Yes, but there we're using typeclass dispatch on the Haskell side to decide what function to call on the results. Is the idea to do the same here? If so I guess I'm confused as to how it's an "evaluator" rather than just the unlifting we have already?
... which would work for normal-form terms, but with non-normal terms we may actually want to do something with the resulting Haskell list on the PLC side
Yes, this seems to be an issue. Aren't you proposing something like normalize-and-then-unlift? Isn't that what we already do?
Yes, but there we're using typeclass dispatch on the Haskell side to decide what function to call on the results. Is the idea to do the same here? If so I guess I'm confused as to how it's an "evaluator" rather than just the unlifting we have already?
What we have currently evaluates like this:
readDynamicBuiltinCek (cons 1 (cons 2 nil))
case Right (1, readDynamicBuiltinCek (cons 2 nil)) of
Left () -> []
Right (x, xs) -> x : xs
1 : readDynamicBuiltinCek (cons 2 nil)
1 : case Right (2, readDynamicBuiltinCek nil) of
Left () -> []
Right (x, xs) -> x : xs
1 : 2 : case Left () of
Left () -> []
Right (x, xs) -> x : xs
1 : 2 : []
[1, 2]
What we want is:
readDynamicBuiltinCek (cons 1 (cons 2 nil))
1 : readDynamicBuiltinCek (cons 2 nil)
1 : 2 : readDynamicBuiltinCek nil
1 : 2 : []
[1, 2]
I.e. basically the same, but without the Either
and (,)
indirections.
We can achieve that by simply interpreting each PLC's cons
and nil
as Haskell's (:)
and []
respectively. The only thing that we need is a way to convince an evaluator not to wrap constant application results back into PLC, but rather return Haskell values directly.
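The contrast can be sketched like this (toy `Term` type; both functions are illustrative stand-ins, not the actual readDynamicBuiltinCek):

```haskell
{-# LANGUAGE LambdaCase #-}

-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Integer Term
  deriving (Show, Eq)

-- Current style: a generic sum-of-products step introduces Either and
-- pair indirections at every node.
readIndirect :: Term -> [Integer]
readIndirect t = case step t of
    Left ()       -> []
    Right (x, xs) -> x : readIndirect xs
  where
    step = \case
      Nil       -> Left ()
      Cons x xs -> Right (x, xs)

-- Desired style: interpret 'cons' and 'nil' directly as '(:)' and '[]'.
readDirect :: Term -> [Integer]
readDirect = \case
  Nil       -> []
  Cons x xs -> x : readDirect xs

main :: IO ()
main = print (readIndirect ex, readDirect ex)
  where ex = Cons 1 (Cons 2 Nil)
```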
Aren't you proposing something like normalize-and-then-unlift? Isn't that what we already do?
Right now we only reduce up to WHNF inside the abstract machines. If we attempt to perform some reductions separately and then unlift a PLC value, bad things will happen. WHNF seems enough, but maybe it's not.
Thinking about it again, it's probably the case that embedding Haskell into PLC is worth a try. But regardless of what we do, I believe we don't need folds; it's an obsolete idea and we can have something much simpler.
Also, at some point I tried to define a tail-recursive function to map a tree:
data Tree a
= Leaf a
| Fork a (Tree a) (Tree a)
instance Functor Tree where
fmap f = go [] where
go stack (Leaf x) = roll stack (Leaf (f x))
go stack (Fork x l r) = go (Right (x, r) : stack) l
roll [] t = t
roll (Right (x, r) : stack) t = go (Left (t, x) : stack) r
roll (Left (l, x) : stack) t = roll stack (Fork (f x) l t)
This simulates construction of a tree using the left fold. It took me an entire evening (strangely, I was unable to find a tail-recursive map over trees in either the OCaml or the ML world) and it would be hard to generalize this to other interestingly shaped data types. Unless I missed something, we're lucky we can avoid constructing data types in accumulators.
(Update: tried googling again and found a nice approach to tail-recursive maps over trees in this Scala answer).
Here are some thoughts that I have for now.
First of all, we do not want to change extensible builtin names yet. Let them be Text
for now; we can change this later, as it does not allow us to do anything new and important, but requires amending a lot of stuff throughout the entire codebase.
So what we want to do currently is to extend Plutus Core with arbitrary Haskell types in a well-controlled manner. I envision this as follows: extend the Term
data type as
data Term tyname name ext ann
= ...
| Pure ext
type ExtTerm tyname name uni ann = Term tyname name (exists a. (uni a, a)) ann
Where uni
is a universe of additional types and exists
is pseudocode for existential quantification. This way we can put arbitrary a
into the AST of PLC provided a
is included in the universe of additional types. We can't just define ExtTerm
directly and avoid extending Term
with the generic Pure
altogether, because existentials often break deriving and it's better to preserve the ability to derive various instances, provided we do not pay much for this (I think we don't).
evaluateCk :: GEq uni => ExtTerm tyname name uni ann -> ExtTerm tyname name uni ann
where GEq
is this. We need this constraint in order to check whether a builtin name can be applied to an argument whose type lies in uni
(we also need to decide how to handle parametric polymorphism).
-- We probably want to use that together with `fastsum`.
-- But also allow @Either@ and use type families for computing the index of a type,
-- because we want to extend @uni@ in order to unlift values.
class uni `Includes` a where
uniVal :: uni a
data TypeScheme uni a r where
TypeSchemeResult :: uni `Includes` a => Proxy a -> TypeScheme uni a a
TypeSchemeArrow :: uni `Includes` a => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
...
For example,
typedTakeByteString
:: (uni `Includes` Integer, uni `Includes` ByteString)
=> TypedBuiltinName uni (Integer -> ByteString -> ByteString)
typedTakeByteString =
TypedBuiltinName TakeByteString $
Proxy `TypeSchemeArrow` Proxy `TypeSchemeArrow` TypeSchemeResult Proxy
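Here is a self-contained toy version of this TypeScheme idea (with the `uni` index spelled out, and String standing in for ByteString to avoid dependencies); for instance, the arity of a builtin falls out of its scheme:

```haskell
{-# LANGUAGE GADTs, MultiParamTypeClasses, FlexibleInstances #-}

import Data.Proxy

-- A toy universe of built-in types.
data ExampleUni a where
  UniInt    :: ExampleUni Integer
  UniString :: ExampleUni String

class uni `Includes` a where
  uniVal :: uni a

instance ExampleUni `Includes` Integer where uniVal = UniInt
instance ExampleUni `Includes` String  where uniVal = UniString

-- A type scheme for a builtin whose arguments all lie in the universe.
data TypeScheme uni a r where
  TypeSchemeResult :: uni `Includes` a => Proxy a -> TypeScheme uni a a
  TypeSchemeArrow  :: uni `Includes` a
                   => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r

-- The arity of a builtin can be read off its scheme.
arity :: TypeScheme uni a r -> Int
arity (TypeSchemeResult _)  = 0
arity (TypeSchemeArrow _ s) = 1 + arity s

-- A toy analogue of 'typedTakeByteString'.
typedTake :: TypeScheme ExampleUni (Integer -> String -> String) String
typedTake = Proxy `TypeSchemeArrow` (Proxy `TypeSchemeArrow` TypeSchemeResult Proxy)

main :: IO ()
main = print (arity typedTake)
```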
Another problem we have is that with this machinery in place we can embed values in a shallow way (just wrap a Haskell value in the Pure
constructor) and in a deep way (convert a Haskell list to a PLC list, for example -- we already do that), and so we need a way to distinguish between the two ways to embed values.
Not sure whether this is going to work, but at least we have something to start with.
Started implementing this stuff. I decided not to go with the approach described here:
A funny thought: we do not really need to change the AST in order to be able to embed arbitrary Haskell into Plutus Core. We can just carry an environment of Haskell values together with a Plutus Core term which references values in the environment by their indices (say, in the Var constructor, but it's better to have a separate MetaVar constructor for this). At the very least this should allow us to play with the ideas without needing to fix 10000 errors that adding additional parameters to Term would cause.
because it's an additional indirection and most of the 10000 errors can be fixed by find-and-replace.
Now that I'm looking at it closely, I feel like we'll have to rethink quite a lot of stuff. E.g. previously we needed to pass evaluators around for unlifting of values (and we currently normally do unlifting of values during regular evaluation of PLC expressions deep inside the constant application machinery). But with proper extensible builtins, we no longer need that! Extensible builtins will allow us to unlift a deeply embedded value (e.g. a PLC list) to a shallowly embedded value (a Haskell list wrapped as a PLC value) and then we can extract ("unembed") the result by means of simple pattern matching.
Here is how that would look for lists:
$nil : all a. metalist a
$cons : all a. a -> metalist a -> metalist a
unliftList : all a. list a -> metalist a
unliftList {a} xs = unwrap xs {metalist a} $nil (\x xs -> $cons x (unliftList xs))
The problem here however is that unliftList
unlifts only the spine of a list, but not the actual elements. In order to unlift the elements, we'll also need to map a function that unlifts a single element over the PLC list beforehand (or map such a function over the Haskell list afterwards). Or just directly write
unliftList : all a b. (a -> b) -> list a -> metalist b
unliftList {a} {b} unliftA xs = unwrap xs {metalist b} $nil (\x xs -> $cons (unliftA x) (unliftList unliftA xs))
Typeclasses are perfectly suitable for this and we use them currently, but if we decide to simplify things and not to pass evaluators around (which also allows to make the constant application machinery pure again) and keep most of the unlifting logic on the Plutus side, then we don't have type classes there and need to deal with that.
It is not clear to me at the moment whether we can get the benefits of the current approach, but do not inherit its complexity. But some unlifting of values seems to be doable without the complexity that we currently have.
I'll investigate further and probably I'll just implement things in two stages:
We can't just define
ExtTerm
directly and avoid extendingTerm
with the generic Pure altogether, because existentials often break deriving and it's better to preserve the ability to derive various instances, provided we do not pay much for this (I think we don't).
We actually do. The thing is, we want to parameterize both Type
and Term
by the same uni
like this:
data Type tyname uni ann
= forall a. TyMeta ann (uni a)
| ...
data Term tyname name uni ann
= forall a. Meta ann (uni a) a
| ...
This way we have representations for embedded types and embedded values in PLC.
For this, we can parameterize both Type
and Term
by distinct type variables (tydyn
and dyn
respectively) and instantiate them in different ways later to get extensible builtins (I elaborate on this in the first message in this thread), but we already have plenty of type variables. There is a sensible alternative:
data Some f = forall a. Some (f a)
data SomeOf f = forall a. SomeOf (f a) a
data Type tyname uni ann
= TyMeta ann (Some uni)
| ...
data Term tyname name uni ann
= Meta ann (SomeOf uni)
| ...
This way we hide all the existentiality into two dedicated data types (Some
and SomeOf
), thus deriving for Type
and Term
works perfectly well, provided you first define the relevant instances for Some
and SomeOf
(with the two additional type variables approach we can define instances for Some
and SomeOf
after deriving them for Type
and Term
, but this doesn't seem to be a huge benefit).
Hiding existentiality in bespoke data types is what I use in the other project and it works well, so I'm going to try that out with Plutus.
Unsurprisingly, we run into difficulties while trying to define various instances in this existential setting. For example, in order to derive NFData
for Term
, we need to define NFData
for SomeOf uni
, which means that we need to rnf
the a
from SomeOf uni
, but that a
is existentially quantified and so we can't just require it to have an NFData
instance. However, we can require each type from uni
to have that instance and then any particular a
will have it as well. Looks like this:
class KnownUni uni where
type Everywhere uni (constr :: * -> Constraint) :: Constraint
bring :: uni `Everywhere` constr => proxy constr -> uni a -> (constr a => r) -> r
bringApply
:: (KnownUni uni, uni `Everywhere` constr)
=> Proxy constr -> (forall a. constr a => a -> r) -> SomeOf uni -> r
bringApply proxy f (SomeOf uni x) = bring proxy uni $ f x
instance (KnownUni uni, uni `Everywhere` NFData) => NFData (SomeOf uni) where
rnf = bringApply (Proxy @NFData) rnf
And an example:
data ExampleUni a where
ExampleUniBool :: ExampleUni Bool
ExampleUniInt :: ExampleUni Int
instance KnownUni ExampleUni where
type Everywhere ExampleUni constr = (constr Bool, constr Int)
bring _ ExampleUniBool = id
bring _ ExampleUniInt = id
example = rnf $ SomeOf ExampleUniInt 42
In short:
- Everywhere applies a constraint to each of the types that a universe has and collects the constraints into a single final constraint
- bring ensures that each type from the universe has an appropriate instance, provided the entire universe satisfies the final constraint
- bringApply brings a particular instance into scope (relying on bring) and then applies a function that requires this instance to the value stored in SomeOf
Note that the KnownUni
machinery is a rather generic one and so it's suitable for other type classes, not just NFData
. This seems to be the most sensible approach regardless of whether we choose to use the parameterize-both-Type
-and-Term
-by-uni
approach or the parameterize-Type
-by-tycon
-and-Term
-by-con
one.
E.g. an Eq
instance for SomeOf uni
can be defined like this:
instance (GEq uni, KnownUni uni, uni `Everywhere` Eq) => Eq (SomeOf uni) where
SomeOf uni1 x1 == SomeOf uni2 x2 =
case uni1 `geq` uni2 of
Nothing -> False
Just Refl -> bring (Proxy @Eq) uni1 (x1 == x2)
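The same trick extends to other classes. Below is a self-contained sketch (toy universe) adding a Show instance for SomeOf, mirroring the NFData one above:

```haskell
{-# LANGUAGE ConstraintKinds, ExistentialQuantification, FlexibleContexts,
             GADTs, RankNTypes, TypeApplications, TypeFamilies,
             UndecidableInstances #-}

import Data.Kind (Constraint, Type)
import Data.Proxy

data SomeOf uni = forall a. SomeOf (uni a) a

class KnownUni uni where
  type Everywhere uni (constr :: Type -> Constraint) :: Constraint
  bring :: uni `Everywhere` constr => proxy constr -> uni a -> (constr a => r) -> r

bringApply
  :: (KnownUni uni, uni `Everywhere` constr)
  => Proxy constr -> (forall a. constr a => a -> r) -> SomeOf uni -> r
bringApply proxy f (SomeOf uni x) = bring proxy uni (f x)

-- Show works exactly like the NFData instance: bring the per-type
-- instance into scope, then apply 'show' to the wrapped value.
instance (KnownUni uni, uni `Everywhere` Show) => Show (SomeOf uni) where
  show = bringApply (Proxy @Show) show

data ExampleUni a where
  ExampleUniBool :: ExampleUni Bool
  ExampleUniInt  :: ExampleUni Int

instance KnownUni ExampleUni where
  type Everywhere ExampleUni constr = (constr Bool, constr Int)
  bring _ ExampleUniBool = id
  bring _ ExampleUniInt  = id

main :: IO ()
main = print (SomeOf ExampleUniInt 42)
```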
That sounds sensible, as usual I'm impressed you managed to come up with this! It's good that this seems generic, since we also need to handle things like Serialise
and friends.
Seems to work. Deriving for Type
& Term
works as well.
While looking at the sealed
builtin, I noticed that we need fancier constants too. A value of type sealed a
has to be some constant. So the sealed
constants need to be able to have integers, bytestrings, and arbitrary PLC types in them.
Roman tells me this should work with the approach he's developing, but just writing this down.
One of the things we want to do is extend the universe a term is defined in. Because that is how we're going to handle unlifting -- by adding the type that we're unlifting into to the current universe, computing on the PLC side and then extracting the Haskell result wrapped as a PLC value. However there is one annoyance: all the constraints that held in the original universe do not hold in the extended universe without additional trickery. Consider
-- | Convert a @nat@ to an @integer@.
--
-- > foldNat {integer} (addInteger 1) 1
natToInteger :: (TermLike term TyName Name uni, uni `Includes` Integer) => term ()
natToInteger = ...
Here we define a term that can include integers, which is specified by
uni `Includes` Integer
Then having
shiftConstants :: Term tyname name uni ann -> Term tyname name (uni :+: ext) ann
which extends the universe with ext
(:+:
comes from GHC.Generics
), we can embed natToInteger :: Term TyName Name uni ()
into Term TyName Name (uni :+: ext) ()
, but we won't be able to, say, apply this embedded term to some other term that contains integers, because
(uni :+: ext) `Includes` Integer
simply doesn't hold. It is of course morally true that the extended universe contains all the types from the original universe, but convincing GHC is not an easy thing here, because the extended universe contains not only those types, but also the types from the extension and this means we need an instance like
instance (uni `Includes` a) OR (ext `Includes` a) => (uni :+: ext) `Includes` a
and there is no OR
for constraints. It is probably possible to perform instance search manually by using type families or some other weirdness. Or something like Data.Reflection
might work, if we're willing just to generate instances on the fly:
withExtendedUniL :: uni `Includes` a => ((uni :+: ext) `Includes` a => c) -> c
withExtendedUniR :: ext `Includes` a => ((uni :+: ext) `Includes` a => c) -> c
But this is way too heavy and hacky.
One thing that we can try for now is to always extend universes with a single new type and don't let this type participate in instance search and instead always search in the original universe. Something like this:
data Extend b uni a where
Extension :: Extend b uni b
Original :: uni a -> Extend b uni a
instance uni `Includes` a => Extend b uni `Includes` a where
uniVal = Original uniVal
That might be enough for our particular use case, because we only want to dynamically extend universes for unlifting purposes, but that happens under the hood and is not visible to the user, who always deals with a single blockchain-specific universe of types.
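A self-contained sketch of this single-type extension (toy universe and Includes class; the real ones live elsewhere in the codebase). The key point is that Extension deliberately has no Includes instance of its own, so instance search only ever looks in the original universe:

```haskell
{-# LANGUAGE FlexibleInstances, GADTs, MultiParamTypeClasses #-}

-- A toy original universe.
data ExampleUni a where
  UniInt :: ExampleUni Integer

class uni `Includes` a where
  uniVal :: uni a

instance ExampleUni `Includes` Integer where
  uniVal = UniInt

-- Extend a universe with exactly one new type 'b'.
data Extend b uni a where
  Extension :: Extend b uni b
  Original  :: uni a -> Extend b uni a

-- Instance search always defers to the original universe.
instance uni `Includes` a => Extend b uni `Includes` a where
  uniVal = Original uniVal

-- The original universe's types remain available in the extended one.
extendedInt :: Extend Bool ExampleUni Integer
extendedInt = uniVal

main :: IO ()
main = case extendedInt of
  Original UniInt -> putStrLn "Integer found via the original universe"
```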
One thing that we can try for now is to always extend universes with a single new type and don't let this type participate in instance search and instead always search in the original universe.
Did that and it seems to work well.
While diving into the details of unlifting, I realized there is a small problem. Consider unlifting of tuples. We want to apply a PLC term representing a tuple to a dynamic builtin name representing Haskell's (,)
constructor. Note that since we know all the types, (,)
doesn't have to be polymorphic, it's just (,) :: A -> B -> (A, B)
for some particular A
and B
(in our generic case they're just free variables). Now since for stage 1 I'm implementing the old machinery in terms of the new one (to avoid breaking everything and because I'm not sure we'd no longer need the old machinery), the values of types A
and B
have to be converted from PLC to Haskell via the KnownType
machinery. And since we're constructing an actual Haskell tuple and then wrapping it as a PLC value, we have to add the type of tuples to our current universe of types. But this means that we convert from PLC to Haskell in one universe and then embed Haskell in PLC in another universe (the original universe extended with tuples). Which is a pretty annoying technical problem of how to unlift values in one universe and then wrap them in another one.
For now I'm just going to pretend we only have to deal with the extended universe and throw an error whenever something goes wrong (we already throw a lot of errors (purely) in that code, so it's not a big deal) and then we'll see whether this can be improved in stage 2.
Another stupid obstacle: currently KnownType.readKnown
has the following type signature:
readKnown :: Monad m => Evaluator Term m -> Term TyName Name () -> ReflectT m a
Once we've added the uni
type parameter there are two options:
readKnown :: Monad m => Evaluator Term m -> Term TyName Name uni () -> ReflectT m a
and
readKnown :: Monad m => Evaluator Term uni m -> Term TyName Name uni () -> ReflectT m a
I.e. the choice is between Evaluator
being able to handle any universe or the specific universe that the term we're unlifting is defined in. However the latter choice does not quite work, because we often want to extend the universe with an additional type (e.g. in order to unlift the Scott-encoded PLC's unit, we apply that term to Haskell's ()
and then match on the result in order to ensure we got ()
back).
So we're left with Evaluator
being universe-polymorphic (i.e. Evaluator Term m
). However we still have static builtins and that means we have to hardcode uni
to contain all the types that are currently static (but this time via constraints rather than on the AST level). This is fine as a first approximation, but we run into a more annoying problem; consider one definition from plutus-core-interpreter
:
evaluateInCekM :: EvaluateConstApp (Either CekMachineException) a -> CekM (ConstAppResult a)
evaluateInCekM a =
    ReaderT $ \cekEnv ->
        let eval means' = evaluateCekCatchIn $ cekEnv & cekEnvMeans %~ mappend means'
        in runEvaluateConstApp eval a
Here we wrap the CEK machine as an evaluator and use that for unlifting values (which we do during the execution of the CEK machine). But we also extend the environment used for unlifting (means'
) with the environment of dynamic builtin names that are currently in scope (cekEnv
). And we can't typify that when Evaluator
is polymorphic over uni
(i.e. there is forall uni. ...
inside Evaluator
), because it has to have the same uni
that we're currently in, otherwise it's impossible to merge the two environments, since they are defined in distinct universes.
This asks for something like Kripke semantics for evaluators. Instead of being parametric over environments or hardcoding environments we can say that any environment is fine as long as it's an extension of the original environment. But then it's not immediately clear whether it's possible to define the abstract machines in such an environment-extending way.
Not a showstopper, but requires more thinking.
However the latter choice does not quite work, because we often want to extend the universe with an additional type (e.g. in order to unlift the Scott-encoded PLC's unit, we apply that term to Haskell's () and then match on the result in order to ensure we got () back
Don't we always need just ()
? Or perhaps we could get away with integer
, if we wanted to use integers to e.g. tell us which branch we've got.
So could it be this?
readKnown :: Monad m => Evaluator Term (uni `with` integer) m -> Term TyName Name uni () -> ReflectT m a
Also, I'm a little confused since it sounds like you just discuss the case where we parameterize Evaluator
by universes. I'm not sure what goes wrong in the case where we have just Evaluator Term m
.
Don't we always need just ()? Or perhaps we could get away with integer, if we wanted to use integers to e.g. tell us which branch we've got.
So could it be this?
Yes, but that's what we were doing previously and it doesn't work for tuples for example, when you actually want to add tuples to the universe and add (,)
as a dynamic builtin name, so that you can construct a Haskell tuple on the PLC side and extract it later.
Also, I'm a little confused since it sounds like you just discuss the case where we parameterize
Evaluator
by universes. I'm not sure what goes wrong in the case where we have just Evaluator Term m
.
That's because I stupidly used "parametric over universe" instead of "universe-polymorphic". That evaluateInCekM
problem is about Evaluator Term m
. I'll edit the post.
Yes, but that's what we were doing previously and it doesn't work for tuples for example, when you actually want to add tuples to the universe and add (,) as a dynamic builtin name, so that you can construct a Haskell tuple on the PLC side and extract it later.
Okay, that makes sense - I would expect that we always want the type that we're unlifting to as well. Can we get away with just both of those?
My guess is not, because if we call makeKnown
recursively for subcomponents of the type, then we'll be wanting an evaluator with those types in its universe. So we need to be able to extend evaluators to handle a larger universe regardless.
That evaluateInCekM problem is about Evaluator Term m.
What confused me there was this sentence: "And we can't typify that when Evaluator is parametric over uni, because it has to have the same uni that we're currently in", which made me think that the problem was that we had different uni
parameters that we can't unify. So I don't see what the problem is if we don't have the uni
parameter.
I would expect that we always want the type that we're unlifting to as well.
It's rather more complicated than that. For example
instance KnownType a uni => KnownType (EvaluationResult a) uni where
    toTypeAst _ = toTypeAst @a Proxy
    makeKnown EvaluationFailure = Error () $ toTypeAst @a Proxy
    makeKnown (EvaluationSuccess x) = makeKnown x
    readKnown eval = mapDeepReflectT (fmap $ EvaluationSuccess . sequence) . readKnown eval
Here we do not add EvaluationResult a
to the universe.
Same for
instance (KnownSymbol text, KnownNat uniq, uni1 ~ uni2, Evaluable uni1) =>
        KnownType (OpaqueTerm uni2 text uniq) uni1 where
    toTypeAst _ =
        TyVar () . TyName $
            Name ()
                (Text.pack $ symbolVal @text Proxy)
                (Unique . fromIntegral $ natVal @uniq Proxy)
    makeKnown = unOpaqueTerm
    readKnown (Evaluator eval) = fmap OpaqueTerm . makeRightReflectT . eval mempty
And there are instances that do not extend the current universe, but instead require it to contain something. For example
data Meta a = Meta
    { unMeta :: a
    } deriving (Show, Generic, Typeable)
instance (Evaluable uni, uni `Includes` a) => KnownType (Meta a) uni where
    toTypeAst _ = constantType @a Proxy ()
    makeKnown (Meta x) = constantTerm () x
    readKnown (Evaluator eval) term = do
        res <- makeRightReflectT $ eval mempty term
        case extractValue res of
            Just x -> pure $ Meta x
            _      -> throwError "Can't read"
However if we get rid of the KnownType
machinery (which is what I'm planning to do at stage 2), we may have better luck with things being more uniform for the purposes of unlifting. Or not, I still want EvaluationResult
the way it works currently (and I'm also not sure how to get rid of this particular KnownType
instance, but I haven't thought about this yet).
What confused me there was this sentence: "And we can't typify that when Evaluator is parametric over uni, because it has to have the same uni that we're currently in", which made me think that the problem was that we had different uni parameters that we can't unify.
Another misuse of the parametric
term, sorry. Here is how it looks, when we have Evaluator Term m
:
newtype Evaluator f m = Evaluator
    { unEvaluator
        :: forall uni. Evaluable uni
        => DynamicBuiltinNameMeanings uni
        -> f TyName Name uni ()
        -> m (EvaluationResultDef uni)
    }
So we still have uni
, it's just bound inside Evaluator
. And when you're constructing a local Evaluator
to use it in constant application machinery for unlifting of constants, that inner universally quantified uni
, that you're expected to handle, can't unify with the particular global uni
.
This asks for something like Kripke semantics for evaluators. Instead of being parametric over environments or hardcoding environments we can say that any environment is fine as long as it's an extension of the original environment. But then it's not immediately clear whether it's possible to define the abstract machines in such an environment-extending way.
In fact, that seems to work. So we have
type uni <: uni' = forall a. uni a -> uni' a
newtype Evaluator f uni m = Evaluator
    { unEvaluator
        :: forall uni'. Evaluable uni'
        => uni <: uni'
        -> DynamicBuiltinNameMeanings uni'
        -> f TyName Name uni' ()
        -> m (EvaluationResultDef uni')
    }
class PrettyKnown a => KnownType a uni where
toTypeAst :: proxy a -> Type TyName uni ()
makeKnown :: a -> Term TyName Name uni ()
readKnown :: Monad m => Evaluator Term uni m -> Term TyName Name uni () -> ReflectT m a
and
evaluateInCkM :: forall uni a. EvaluateConstApp uni (CkM uni) a -> CkM uni (ConstAppResult uni a)
evaluateInCkM =
    runEvaluateConstApp $ Evaluator $ \emb meansExt term -> do
        let extend means = embedDynamicBuiltinNameMeanings emb means <> meansExt
        withReader extend $ [] |> term
So evaluators receive an additional argument that allows them to embed dynamic builtin names into a bigger universe than the one they are defined in. Never thought I would find some use for Kripke semantics in the real world! Works like a charm...
... except it doesn't. The type signature of embedDynamicBuiltinNameMeanings
is
embedDynamicBuiltinNameMeanings
    :: uni <: uni' -> DynamicBuiltinNameMeanings uni -> DynamicBuiltinNameMeanings uni'
Should be easy to embed dynamic builtin names in a bigger universe, right? No. It amounts to embedding TypeScheme
s:
embedTypeScheme :: uni <: uni' -> TypeScheme uni a r -> TypeScheme uni' a r
But TypeScheme
is defined as
data TypeScheme uni a r where
    TypeSchemeResult :: KnownType a uni => Proxy a -> TypeScheme uni a a
    TypeSchemeArrow :: KnownType a uni => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
    ...
and if we attempt to embed TypeScheme uni a r
into TypeScheme uni' a r
, we quickly realize that KnownType a uni'
doesn't follow from KnownType a uni
even though we know that uni'
contains all the same types as uni
(and possibly some additional ones). It surely doesn't hold definitionally, but even with some trickery we get
instance (Evaluable uni, KnownType a uni, euni ~ Extend b uni, Typeable b) =>
        KnownType (InExtended b uni a) euni where
    toTypeAst _ = shiftConstantsType $ toTypeAst @a @uni Proxy
    makeKnown (InExtended x) = shiftConstantsTerm $ makeKnown @a x
    readKnown (Evaluator eval) term = _
which allows us to apply makeKnown
to a term and inject the result into a bigger universe, but readKnown
is problematic, because we need to "uninject" from that bigger universe. Which most likely should work as per
For now I'm just going to pretend we only have to deal with the extended universe and throw an error whenever something goes wrong (we already throw a lot of errors (purely) in that code, so it's not a big deal) and then we'll see whether this can be improved in stage 2.
but this entire thing is getting too convoluted.
There are two things we can do here:
1. ...
2. get rid of the KnownType machinery, so that we can just directly inject the entire application into a bigger universe

I'll play with the current code a little bit more (because it's nice to get something working, even if it's hacky) and then move to 1 and if it doesn't solve everything, then I'll consider implementing 2 as well.
What happens if we split KnownType
into an encoding class and a decoding class? They should have opposite "variances" wrt the universe, and for TypeScheme
s we only need to be able to decode the arguments and encode the result, so that might make things easier.
Hm no, that doesn't help because we'd still be "growing" the universe in both cases.
I think your suggestion for what to do for readKnown
makes sense: you have a function read : a -> Maybe b
, and you want to turn it into a function read' : a' -> Maybe b
where a < a'
. It makes sense that you just say "in the set a' \ a it's Nothing", i.e. "throw an error".
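That "answer Nothing outside the original domain" idea can be sketched in miniature, with Either standing in for the extended type a' and hypothetical names throughout (none of this is the real codebase):

```haskell
module Main where

-- Hypothetical partial reader for a small type.
readInt :: String -> Maybe Int
readInt "zero" = Just 0
readInt "one"  = Just 1
readInt _      = Nothing

-- Extend a partial reader to a superset a' = Either a extra:
-- on the new part (a' \ a) the answer is always Nothing.
extendRead :: (a -> Maybe b) -> Either a extra -> Maybe b
extendRead f (Left x)  = f x
extendRead _ (Right _) = Nothing

main :: IO ()
main = do
  print (extendRead readInt (Left "one" :: Either String Bool)) -- Just 1
  print (extendRead readInt (Right True :: Either String Bool)) -- Nothing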
Hm no, that doesn't help because we'd still be "growing" the universe in both cases.
Exactly.
I think your suggestion for what to do for
readKnown
makes sense: you have a functionread : a -> Maybe b
, and you want to turn it into a functionread' : a' -> Maybe b
wherea < a'
.
Sure, but there are technical difficulties. With
type uni <: uni' = forall a. uni a -> uni' a
newtype Evaluator f uni m = Evaluator
    { unEvaluator
        :: forall uni'. Evaluable uni'
        => uni <: uni'
        -> DynamicBuiltinNameMeanings uni'
        -> f TyName Name uni' ()
        -> m (EvaluationResultDef uni')
    }
we have no chance of figuring out how to map TypeScheme uni a r
to TypeScheme uni' a r
, because both uni
and uni'
are just variables and we can't get KnownType a uni'
from KnownType a uni
. Therefore our best bet here is to make evaluators receive
(forall a r. TypeScheme uni a r -> TypeScheme uni' a r)
rather than uni <: uni'
. Then every time we extend the current universe, we also make sure that KnownType
holds for each of the arguments and the result in the extended universe. Hence we define
shiftTypeScheme
    :: forall b uni a r. (Evaluable uni, Typeable b)
    => TypeScheme uni a r -> TypeScheme (Extend b uni) a r
However we can't get KnownType a (Extend b uni)
from KnownType a uni
, because normally a
drives instance resolution for KnownType
and thus we'd get a terrible amount of overlapping instances. So we use the instance from my previous message:
instance (Evaluable uni, KnownType a uni, euni ~ Extend b uni, Typeable b) =>
        KnownType (InExtended b uni a) euni where
where InExtended
is a wrapper around a
. This solves the instance search problem, but now we get KnownType (InExtended b uni a) (Extend b uni)
from KnownType a uni
, i.e. we have also changed a
, so we can't directly define shiftTypeScheme
, because all types from the TypeScheme
are now wrapped in InExtended
. At this point I'm too tired to think of this nonsense, so here is the definition of shiftTypeScheme
:
shiftTypeScheme (TypeSchemeResult res) =
    unsafeCoerce $ TypeSchemeResult $ mapProxy @(InExtended b uni) res
shiftTypeScheme (TypeSchemeArrow arg schB) =
    unsafeCoerce $ TypeSchemeArrow (mapProxy @(InExtended b uni) arg) $ shiftTypeScheme schB
shiftTypeScheme (TypeSchemeAllType var schK) =
    TypeSchemeAllType var $ \_ -> shiftTypeScheme $ schK Proxy
i.e. we simply use unsafeCoerce
to ignore the newtype
wrapper problem.
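An aside on the newtype-wrapper part specifically: for plain wrapper erasure, Data.Coerce.coerce gives the same zero-cost cast as unsafeCoerce, but GHC checks representational equality. Whether it can replace unsafeCoerce in shiftTypeScheme depends on the roles of TypeScheme's parameters (GADT indices are nominal), so this is only a sketch with a stand-in wrapper, not the real InExtended:

```haskell
module Main where

import Data.Coerce (coerce)

-- Stand-in for a wrapper like InExtended; representationally equal to 'a'.
newtype Wrapped a = Wrapped { unWrapped :: a }

-- 'coerce' removes (or adds) the wrapper at zero cost, and unlike
-- unsafeCoerce GHC verifies that the representations really do match.
unwrapAll :: [Wrapped a] -> [a]
unwrapAll = coerce

-- It also works under function arrows and other representational contexts.
wrapFn :: (a -> b) -> Wrapped a -> Wrapped b
wrapFn = coerce

main :: IO ()
main = do
  print (unwrapAll [Wrapped (1 :: Int), Wrapped 2])      -- [1,2]
  print (unWrapped (wrapFn (+ (1 :: Int)) (Wrapped 41))) -- 42
```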
We also need the unshiftEvaluator
function in order to define readKnown
over InExtended b uni a
, which in turn requires unshiftTypeScheme
, which analogously to shiftTypeScheme
requires another newtype
wrapper and a KnownType
instance for it (dual to the one of InExtended
), which in turn requires the shiftEvaluator
function.
After everything is defined, we run the test
test :: EvaluationResult (Either Text ((Integer, ()), (Char, Bool)))
test = readKnownCk @_ @DefaultUni $ makeKnown ((5 :: Integer, ()), ('a', True))
and... it works! The result:
EvaluationSuccess (Right ((5,()),('a',True)))
This is awful.
Is the problem perhaps that our universe has too little structure? If we considered it more like a variable context in the way that e.g. James does in the metatheory, then we just want to do something like weakening, which should be more manageable because we can sensibly talk about "removing" things from our context and so on. This is very vague, though.
I'm still not sure I completely see the problem. Can't we write something like this:
restrict :: uni <: uni' -> Term TyName Name uni' a -> Maybe (Term TyName Name uni a)
which then we can then post-compose with readKnown
?
Is the problem perhaps that our universe has too little structure? If we considered it more like a variable context in the way that e.g. James does in the metatheory, then we just want to do something like weakening, which should be more manageable because we can sensibly talk about "removing" things from our context and so on. This is very vague, though.
It's a good thought, but I'm not sure what we can solve by this.
Can't we write something like this:
restrict :: uni <: uni' -> Term TyName Name uni' a -> Maybe (Term TyName Name uni a)
We do have a function of type
unshiftConstantsTerm :: Term tyname name (Extend b uni) ann -> Term tyname name uni ann
(with Nothing
being error
), however the problem is not with terms -- it's with dynamic builtin names. In order to unlift a term we sometimes need to add new types to the current universe and new dynamic builtin names (see an example in this issue). And these new dynamic builtin names are allowed to reference new types. We then pass those new builtins to the evaluator, which is expected to add them to the set of builtins that were already in scope. But they were defined in a different universe! So we need to inject old dynamic builtin names into the extended universe in order to merge them with the new builtins defined in the extended universe. And since a part of the meaning of each dynamic builtin name is a TypeScheme
parameterized by a universe, this injecting amounts to embedding a TypeScheme
in a bigger universe. Which is possible, but not trivial (see the previous post).
Okay, reading this more closely, a couple more thoughts:
Re the instance resolution issues, I wonder whether we'd be better off having KnownType
be a datatype rather than a class. We're already reifying the dictionary into our data constructors a lot! Then there isn't an overlapping instances issue - we just have to explicitly construct the dictionaries we want, rather than relying on instance resolution to do the right sequence of things. I think it might be more boilerplate but fewer hacks.
we have no chance of figuring out how to map TypeScheme uni a r to TypeScheme uni' a r, because both uni and uni' are just variables and we can't get KnownType a uni' from KnownType a uni.
But we have a uni <: uni'
. I think we should be able to define
embedTerm :: uni <: uni' -> Term TyName Name uni a -> Term TyName Name uni' a
restrictTerm :: uni <: uni' -> Term TyName Name uni' a -> Maybe (Term TyName Name uni a)
The second one is problematic, as you said. But I think we should be able to define <:
as a "natural retract" rather than just a natural transformation, something like type f <: g = (forall a . f a -> g a, forall a . g a -> Maybe (f a))
.
Doesn't this let us get from KnownType a uni
to KnownType a uni'
? Imagining we had explicit dictionaries (in quasi-Haskell):
extendKnownType :: uni <: uni' -> KnownType uni a -> KnownType uni' a
extendKnownType ev KnownType{makeKnown1, readKnown1} = KnownType{
    makeKnown a = embedTerm ev $ makeKnown1 a,
    readKnown eval t = case restrictTerm ev t of
        Nothing -> fail
        Just t  -> readKnown1 (restrictEval ev eval) t
}
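The "natural retract" version of <: can be packaged as a record (avoiding an impredicative tuple of foralls) and demonstrated on a toy Extend-style universe; all names below are illustrative, not the real codebase:

```haskell
{-# LANGUAGE GADTs, RankNTypes #-}
module Main where

-- A retraction between universes: an embedding plus a partial inverse.
data Retract uni uni' = Retract
  { embedU    :: forall a. uni a -> uni' a
  , restrictU :: forall a. uni' a -> Maybe (uni a)
  }

-- A toy universe with one type, and its extension with one extra type.
data U a where
  UInt :: U Int

data Extend b uni a where
  Original  :: uni a -> Extend b uni a
  Extension :: Extend b uni b

-- Extending a universe always yields a retract: the new tag has no preimage.
extendRetract :: Retract uni (Extend b uni)
extendRetract = Retract
  { embedU    = Original
  , restrictU = \u -> case u of
      Original x -> Just x
      Extension  -> Nothing
  }

main :: IO ()
main = do
  -- embedding then restricting an old tag succeeds
  print (case restrictU extendRetract (embedU extendRetract UInt) of
           Just UInt -> True
           Nothing   -> False)
  -- restricting the new tag fails, as expected
  print (case restrictU extendRetract (Extension :: Extend Bool U Bool) of
           Nothing -> True
           Just _  -> False)
```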
Re the instance resolution issues, I wonder whether we'd be better off having KnownType be a datatype rather than a class. We're already reifying the dictionary into our data constructors a lot! Then there isn't an overlapping instances issue - we just have to explicitly construct the dictionaries we want, rather than relying on instance resolution to do the right sequence of things. I think it might be more boilerplate but fewer hacks.
It's not a hack to use a newtype
to drive instance resolution. unsafeCoerce
is a terrible hack, but it might be fixable, I just wanted to get something done and didn't attempt to come up with a more sensible approach (and I have an idea). But giving up on having the type class has a lot of consequences: right now we use the class only to assign meanings to dynamic builtin names and internally in evaluation engines as a way to unlift values, but once unlifting is available to the user, the user surely won't like writing explicit dictionaries instead of just specifying the type of an expression. We can define a user-facing type class, but then we have two slightly distinct type classes that do nearly the same thing, which is weird.
The second one is problematic, as you said. But I think we should be able to define <: as a "natural retract" rather than just a natural transformation, something like type
f <: g = (forall a . f a -> g a, forall a . g a -> Maybe (f a))
.
That would work for terms, I agree.
Doesn't this let us get from
KnownType a uni
to KnownType a uni'
? Imagining we had explicit dictionaries (in quasi-Haskell):
extendKnownType :: uni <: uni' -> KnownType uni a -> KnownType uni' a
extendKnownType ev KnownType{makeKnown1, readKnown1} = KnownType{
    makeKnown a = embedTerm ev $ makeKnown1 a,
    readKnown eval t = case restrictTerm ev t of
        Nothing -> fail
        Just t  -> readKnown1 (restrictEval ev eval) t
}
restrictEval
amounts to restrictTypeScheme
, which amounts to restrictKnownType
, which amounts to embedEval
, which amounts to embedTypeScheme
, which amounts to embedKnownType
, which amounts to restrictEval
. I'm not sure this is a true loop, but it seems to be a loop to me. I break out of this loop by specifying how to embed
/restrict
eval
/TypeScheme
/KnownType
each time the universe gets extended, i.e. I use a non-looping bottom-up approach.
If that was the only problem, I would just go ahead and try to fix the unsafeCoerce
issue. Other things I've described above are basically fine. But we have a bigger task of figuring out whether we want to implicitly unlift values during evaluation or not. Right now we do that and we had to do that previously, but with extensible universes we have an alternative. Instead of defining TypeScheme
as
data TypeScheme uni a r where
    TypeSchemeResult :: KnownType a uni => Proxy a -> TypeScheme uni a a
    TypeSchemeArrow :: KnownType a uni => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
we can define it as
data TypeScheme uni a r where
    TypeSchemeResult :: uni `Includes` a => Proxy a -> TypeScheme uni a a
    TypeSchemeArrow :: uni `Includes` a => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
I.e. instead of requiring each type to be liftable and unliftable, we can require each type to be in the current universe. Then the burden of turning a term into an unliftable one is on the user, but the user can just use functions or a type class that we'll provide (i.e. we probably can make exactly the same API as with internal unlifting). This allows us to disentangle evaluation from unlifting, which greatly simplifies the former (e.g. we no longer need to get KnownType a uni'
from KnownType
). But what about polymorphic builtin types like Sealed
, how do we put them into universes? First instantiate them, so that the type becomes monomorphic? But is the situation with internal unlifting any better in this regard? We also have some quite specific KnownType
instances like the ones for EvaluationResult
and OpaqueTerm
and it's not immediately clear how to handle them with the external unlifting approach. Do we maybe need implicit conversions between Haskell and PLC types (which is what we have currently)? I don't think so, but it's another thing to consider.
So it's just the case that the design space is huge and I'm exploring the options.
It's not a hack to use a newtype to drive instance resolution.
I don't know, I always find it a bit obscure. The nice thing about functions that take dictionaries is that it's back in the realm of Just Functions and Data. I don't look forward to trying to remember what the exciting IsExtended
instance does.
We can define a user-facing type class, but then we have two slightly distinct type classes that do nearly the same thing, which is weird.
I actually think this could be fine.
data KnownTypeDict uni = ...
class KnownType uni a where
    dict :: KnownTypeDict uni a
    -- other methods defined in terms of dict if we want
A bit of boilerplate yes, but if it makes the internals easier to understand I'll take it.
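A runnable miniature of that dictionary-plus-thin-class pattern; the method names are made up for the demo, not the real KnownType API:

```haskell
module Main where

-- The dictionary as plain data: Just Functions and Data.
data KnownDict a = KnownDict
  { makeK :: a -> String        -- stand-in for makeKnown
  , readK :: String -> Maybe a  -- stand-in for readKnown
  }

-- A thin user-facing class whose only method returns the dictionary.
class Known a where
  dict :: KnownDict a

instance Known Int where
  dict = KnownDict show parse
    where
      parse s = case reads s of
        [(n, "")] -> Just n
        _         -> Nothing

-- Internals take dictionaries explicitly; users go through the class.
roundtrip :: KnownDict a -> a -> Maybe a
roundtrip d = readK d . makeK d

main :: IO ()
main = print (roundtrip (dict :: KnownDict Int) 42) -- Just 42
```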
restrictEval amounts to restrictTypeScheme, which amounts to restrictKnownType, which amounts to embedEval, which amounts to embedTypeScheme, which amounts to embedKnownType, which amounts to restrictEval.
:scream_cat:
I can't hold this in my head, but I see what you're saying :/
I.e. instead of requiring each type to be liftable and unliftable we can require each type to be in the current universe.
Interesting. That does sound like it would simplify things a lot. I can see that quantified types are awkward. Maybe this is a place where thinking of the universe as a type context would be less weird - it could be sensible to add type variables to it. But this is all confusing me greatly tbh.
I don't know, I always find it a bit obscure. The nice thing about functions that take dictionaries is that it's back in the realm of Just Functions and Data. I don't look forward to trying to remember what the exciting
IsExtended
instance does.
I see your point and partially agree with it, but you'd still have to have a function that plays the role of the IsExtended
newtype
wrapper and its KnownType
instance.
On the other hand this:
metaCommaMeaning :: NameMeaning (Extend (a, b) uni)
metaCommaMeaning = NameMeaning sch def where
    sch =
        Proxy @(InExtended (a, b) uni a) `TypeSchemeArrow`
        Proxy @(InExtended (a, b) uni b) `TypeSchemeArrow`
        TypeSchemeResult (Proxy @(AsExtension uni (a, b)))
    def (InExtended x) (InExtended y) = AsExtension (x, y)
does look a little bit too much.
Maybe this is a place where thinking of the universe as a type context would be less weird - it could be sensible to add type variables to it.
Currently the all
binder does not change the indices of a TypeScheme
. You might think this is weird, but I would rather not mimic target language binders at the type-level in Haskell. That would work in Agda, but in Haskell it's just a huge pain. What we have currently seems to work pretty well, is convenient to use and is not too unsafe (you have to get indices of type variables right, but that's a really small cost to pay for not reflecting any binders at the type level).
But this is all confusing me greatly tbh.
Yeah, I've been working on this for quite a while and it's still intimidating.
I updated TypeScheme
to the version that we talked about in slack:
data TypeScheme uni as r where
    TypeSchemeResult :: KnownType a uni => Proxy a -> TypeScheme uni '[] a
    TypeSchemeArrow :: KnownType a uni => Proxy a -> TypeScheme uni as r -> TypeScheme uni (a ': as) r
type family FoldType as r where
    FoldType '[] r = r
    FoldType (a ': as) r = a -> FoldType as r
data NameMeaning uni = forall as r. NameMeaning (TypeScheme uni as r) (FoldType as r)
(except I decided to keep the r
type variable for several reasons. Hence as
is the list of types of arguments of a builtin). It doesn't really seem to affect anything, except it makes things slightly nicer.
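For illustration, here is a self-contained miniature of the same indexing idea, with the uni parameter dropped for brevity: a type-level list of argument types folded into a function type, plus a generic applicator (Scheme, Args, and applyScheme are illustrative names, not the real API):

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}
module Main where

import Data.Proxy (Proxy (..))

-- Argument types tracked as a type-level list, as in the new TypeScheme.
data Scheme as r where
  Result :: Scheme '[] r
  Arrow  :: Proxy a -> Scheme as r -> Scheme (a ': as) r

type family FoldType as r where
  FoldType '[]       r = r
  FoldType (a ': as) r = a -> FoldType as r

-- A heterogeneous argument list matching the scheme's index.
data Args as where
  Nil  :: Args '[]
  (:>) :: a -> Args as -> Args (a ': as)
infixr 5 :>

-- Saturate a function of type 'FoldType as r' with matching arguments.
applyScheme :: Scheme as r -> FoldType as r -> Args as -> r
applyScheme Result        r _           = r
applyScheme (Arrow _ sch) f (x :> args) = applyScheme sch (f x) args

-- The scheme of a two-argument integer builtin.
addScheme :: Scheme '[Int, Int] Int
addScheme = Arrow Proxy (Arrow Proxy Result)

main :: IO ()
main = print (applyScheme addScheme (+) (2 :> 3 :> Nil)) -- 5
```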
Renamed DynamicBuiltinNameMeaning
to just NameMeaning
. Probably will perform some more renaming later.
Solved the unsafeCoerce
problem by embedding the entire NameMeaning
instead of just its TypeScheme
. I.e. during the traversal of the TypeScheme
we also adapt the associated Haskell function by wrapping/unwrapping arguments/result. This piece of ASCII art does that:
shiftNameMeaning
    :: forall b uni. (Evaluable uni, Typeable b)
    => NameMeaning uni -> NameMeaning (Extend b uni)
shiftNameMeaning (NameMeaning sch x) =
    go sch $ \sch' inj -> NameMeaning sch' $ inj x where
        go
            :: TypeScheme uni as r
            -> (forall as' r'.
                   TypeScheme (Extend b uni) as' r' -> (FoldType as r -> FoldType as' r') -> c)
            -> c
        go (TypeSchemeResult _) k =
            k (TypeSchemeResult Proxy) $ InExtended @b
        go (TypeSchemeArrow _ schB) k =
            go schB $ \schB' inj -> k
                (TypeSchemeArrow Proxy schB')
                (\f -> inj . f . unInExtended @b)
        go (TypeSchemeAllType var schK) k =
            go (schK Proxy) $ \schB' inj -> k (TypeSchemeAllType var $ \_ -> schB') inj
So it seems we now have an MVP, because unlifting of products works and we do not unlift elements separately, but use Haskell's (,)
embedded into a PLC term instead.
So it seems we now have an MVP, because unlifting of products works and we do not unlift elements separately, but use Haskell's (,) embedded into a PLC term instead.
Amazing!
But what about polymorphic builtin types like
Sealed
, how do we put them into universes? First instantiate them, so that the type becomes monomorphic?
Well, we can do the usual thing:
data ExampleUni a where
    ExampleUniInt :: ExampleUni Int
    ExampleUniList :: ExampleUni a -> ExampleUni [a]
@michaelpj, note that with universes like that we can't simply use a list of types as an index of a data type like we do in Agda.
But defining instances over SomeOf ExampleUni
is not that trivial. I elaborated on the Everywhere
gadget above, but in our case we get:
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, ???)
    ...
What's in ???
? We are generic over constr
and thus we can't just hardcode whatever constraint might seem appropriate for a particular type class. E.g. for Eq
that would be forall a. Eq a => Eq [a]
and for an imaginary type class Length
, which allows getting the length of a data structure, it would be forall a. Length [a]
. So between distinct type classes and distinct data structures there are all kinds of possible behaviors and thus we need an additional type class. We can define
class HoldsAt f uni constr where
    bringAt :: Proxy f -> proxy constr -> uni a -> (constr (f a) => r) -> r
instance HoldsAt [] ExampleUni Eq where
    bringAt _ _ ExampleUniInt r = r
    bringAt p proxy (ExampleUniList uni) r = bringAt p proxy uni r
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, HoldsAt [] ExampleUni constr)
    bring _ ExampleUniInt r = r
    bring proxy (ExampleUniList uni) r = bringAt (Proxy @[]) proxy uni r
Here HoldsAt
specifies the behavior of a type class for a data structure in a particular universe, i.e. for every universe, for every type class, and for every data structure we need to define an instance. This is a cubic number of instances, which is clearly too much.
What we really want to say is something like that:
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, forall a. uni `Includes` a => constr [a])
    ...
which reads as "a constraint over a type from the ExampleUni
universe is satisfied whenever the Int
type satisfies it and for each type a
from the universe the [a]
type satisfies the constraint". However recall that we bring the instance explicitly into the scope using the bring
function and so
forall a. uni `Includes` a => constr [a]
is "morally" true, but we can't expect GHC to realize that constr [a]
holds just because a
is in some universe. So we need an additional type class:
class Through (uni :: * -> *) constr a where
    bringStruct :: proxy1 constr -> proxy2 (uni a) -> (constr a => c) -> c
instance (uni `Includes` a, Closed uni, uni `Everywhere` Eq) => Through uni Eq [a] where
    bringStruct pConstr (_ :: proxy2 (uni [a])) r = bring pConstr (knownUni @uni @a) r
Here we say that the Eq
instance for [a]
can be derived whenever a
is a type from some universe and all types from that universe implement Eq
(which is a pretty sensible thing to say). Here is another example:
instance (uni `Includes` a, uni `Includes` b, Closed uni, uni `Everywhere` Eq) =>
        Through uni Eq (a, b) where
    bringStruct pConstr (_ :: proxy2 (uni (a, b))) r =
        bring pConstr (knownUni @uni @a) $ bring pConstr (knownUni @uni @b) r
Now that we are generic over universes, we can return to the Closed
instance:
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, forall a. uni `Includes` a => Through uni constr [a])
    ...
which reads as before, but now we're not attempting to construct constr [a]
directly, but instead promise that an instance like that can be brought in scope anytime through the uni
universe. It almost works, but GHC is unhappy about the quantified constraint in the associated type family, which we can fix by making a class synonym:
class (forall a. uni `Includes` a => Through uni constr (f a)) => TypeCons uni constr f
instance (forall a. uni `Includes` a => Through uni constr (f a)) => TypeCons uni constr f
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, TypeCons ExampleUni constr [])
    bring _ ExampleUniInt r = r
    bring pConstr (ExampleUniList (uni :: ExampleUni a)) r =
        reflect uni $ bringStruct pConstr (Proxy @(ExampleUni [a])) r
This finally works. We only need to show what that reflect
is:
class Reflect uni where
    reflect :: uni a -> (uni `Includes` a => c) -> c
It's needed because we can't use uni a
directly in a quantified constraint and have to make it an actual constraint, so we do that. The type class essentially enforces that every type from a universe is included in that universe.
Tests work:
instance Reflect ExampleUni where
    reflect ExampleUniInt r = r
    reflect (ExampleUniList uni) r = reflect uni r
instance ExampleUni `Includes` Int where
    knownUni = ExampleUniInt
instance ExampleUni `Includes` a => ExampleUni `Includes` [a] where
    knownUni = ExampleUniList knownUni
instance GEq ExampleUni where
    ExampleUniInt `geq` ExampleUniInt = Just Refl
    ExampleUniList uni1 `geq` ExampleUniList uni2 = do
        Refl <- uni1 `geq` uni2
        Just Refl
    _ `geq` _ = Nothing
test1 = SomeOf ExampleUniInt 5 == SomeOf ExampleUniInt 6
test2 = SomeOf (ExampleUniList ExampleUniInt) [5, 5] == SomeOf (ExampleUniList ExampleUniInt) [5, 5]
This probably can be generalized even more, because those two instances
instance (uni `Includes` a, Closed uni, uni `Everywhere` Eq) => Through uni Eq [a]
instance (uni `Includes` a, Closed uni, uni `Everywhere` Show) => Through uni Show [a]
are basically the same. In fact, we can be sloppy and define
instance ( uni `Includes` a
         , Closed uni, uni `Everywhere` constr
         , forall b. constr b => constr [b]
         ) => Through uni constr [a] where
    bringStruct pConstr (_ :: proxy2 (uni [a])) r = bring pConstr (knownUni @uni @a) r
and that will cover all already existing constraints over [a]
, including Eq
and Show
.
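To see the shape of this machinery in isolation, here is a self-contained, simplified sketch: base's TestEquality plays the role of GEq, and a hardcoded bringEq stands in for the Everywhere/Through machinery (names like U and bringEq are illustrative, not the real API):

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes #-}

import Data.Type.Equality (TestEquality (..), (:~:) (..))

-- A tiny closed universe: Int and lists of types from the universe.
data U a where
    UInt  :: U Int
    UList :: U a -> U [a]

-- A value tagged with its type code (the SomeOf from above).
data SomeOf uni = forall a. SomeOf (uni a) a

-- A hardcoded stand-in for 'bring': recover an Eq instance for any
-- type encoded in the universe.
bringEq :: U a -> (Eq a => r) -> r
bringEq UInt      r = r
bringEq (UList u) r = bringEq u r

-- base's TestEquality plays the role of GEq here.
instance TestEquality U where
    testEquality UInt UInt = Just Refl
    testEquality (UList u1) (UList u2) = do
        Refl <- testEquality u1 u2
        Just Refl
    testEquality _ _ = Nothing

-- Eq for tagged values: first check that the type codes match, then
-- bring the Eq instance for the common type into scope and compare.
instance Eq (SomeOf U) where
    SomeOf u1 x1 == SomeOf u2 x2 =
        case testEquality u1 u2 of
            Just Refl -> bringEq u1 (x1 == x2)
            Nothing   -> False
```

The two tests from above go through with this sketch as well.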
But what about polymorphic builtin types like
Sealed
, how do we put them into universes?
As shown above the problems with polymorphic builtin types are mostly technical and related to getting constraints right. However for newtype
wrappers like Sealed
we can just ignore these problems. Whatever holds for any type a
from a universe, also holds for Sealed a
, because it's just a wrapper around a
. And therefore we can just ignore Sealed
when it comes to Everywhere
and friends.
We also have some quite specific KnownType
instances, like the ones for EvaluationResult
and OpaqueTerm
, and it's not immediately clear how to handle them with the external unlifting approach.
Here is how:
data TypeGround uni a where
    TypeGroundValue  :: uni a -> TypeGround uni a
    TypeGroundResult :: uni a -> TypeGround uni (EvaluationResult a)
    TypeGroundTerm   :: TypeGround uni (OpaqueTerm uni text uniq)

data TypeScheme uni as r where
    TypeSchemeResult :: TypeGround uni a -> TypeScheme uni '[] a
    TypeSchemeArrow  :: TypeGround uni a -> TypeScheme uni as r -> TypeScheme uni (a ': as) r
    TypeSchemeAllType
        :: (KnownSymbol text, KnownNat uniq)
        => Proxy '(text, uniq)
        -> (forall ot. ot ~ OpaqueTerm uni text uniq => TypeGround uni ot -> TypeScheme uni as r)
        -> TypeScheme uni as r
So we just hardcode those special types into ground values of TypeSchemes
. Some additional gymnastics is needed to get shifting/unshifting. Basically, we have to change TypeGroundTerm
to
TypeGroundTerm
    :: (forall a. uni' a -> uni a)
    -> (forall a. uni a -> uni' a)
    -> TypeGround uni (OpaqueTerm uni' text uniq)
and then we can write things like
unextend :: Extend b uni a -> uni a
unextend (Original uni) = uni
unextend Extension = error "Can't unshift an 'Extension'"
shiftTypeGround :: TypeGround uni a -> TypeGround (Extend b uni) a
shiftTypeGround (TypeGroundValue uni) = TypeGroundValue $ Original uni
shiftTypeGround (TypeGroundResult uni) = TypeGroundResult $ Original uni
shiftTypeGround (TypeGroundTerm inj ej) = TypeGroundTerm (Original . inj) (ej . unextend)
(I'll replace error
by Nothing
at some point).
And that plays well with the constant application machinery:
makeBuiltin :: TypeGround uni a -> a -> ConstAppResult uni (Term TyName Name uni ())
makeBuiltin (TypeGroundValue uni) x = pure . Constant () $ SomeOf uni x
makeBuiltin (TypeGroundResult _ ) EvaluationFailure = ConstAppFailure
makeBuiltin (TypeGroundResult uni) (EvaluationSuccess x) = pure . Constant () $ SomeOf uni x
makeBuiltin (TypeGroundTerm inj _) (OpaqueTerm term) = pure $ substConstantsTerm inj term
readBuiltin
:: GEq uni => TypeGround uni a -> Term TyName Name uni () -> ConstAppResult uni a
readBuiltin (TypeGroundTerm _ ej) value = pure . OpaqueTerm $ substConstantsTerm ej value
readBuiltin (TypeGroundResult _ ) Error{} = pure EvaluationFailure
readBuiltin (TypeGroundResult uni) value = EvaluationSuccess <$> unliftConstant uni value
readBuiltin (TypeGroundValue uni) value = unliftConstant uni value
The only tricky part here is handling embedding/unembedding opaque terms into/from universes (and I'm not confident I got it right). We no longer need to be parametric over some Monad
m
, because we no longer pass evaluators around and thus do not need to account for the fact that evaluators can be monadic. I.e. this design makes the constant application machinery perfectly pure.
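A stripped-down, self-contained model of this pure round trip may make the claim concrete. Here TypeGround is cut down to its Value/Result constructors, a toy Term type stands in for the real AST, and Maybe stands in for ConstAppResult; all names are illustrative:

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}

import Data.Type.Equality (TestEquality (..), (:~:) (..))

-- A tiny universe; TestEquality plays the role of GEq.
data U a where
    UInt  :: U Integer
    UBool :: U Bool

instance TestEquality U where
    testEquality UInt  UInt  = Just Refl
    testEquality UBool UBool = Just Refl
    testEquality _     _     = Nothing

data Result a = Failure | Success a deriving (Eq, Show)

-- TypeGround cut down to its Value/Result constructors.
data Ground a where
    GroundValue  :: U a -> Ground a
    GroundResult :: U a -> Ground (Result a)

-- A toy term type: constants and an error term.
data Term = forall a. Constant (U a) a | ErrorTerm

-- Both directions are plain pure functions: no evaluator, no monad.
makeBuiltin :: Ground a -> a -> Maybe Term
makeBuiltin (GroundValue uni)  x           = Just (Constant uni x)
makeBuiltin (GroundResult _)   Failure     = Nothing  -- the ConstAppFailure case
makeBuiltin (GroundResult uni) (Success x) = Just (Constant uni x)

-- Unlift a constant after checking that its type code matches.
unliftConstant :: U a -> Term -> Maybe a
unliftConstant uni (Constant uni' x) = do
    Refl <- testEquality uni' uni
    Just x
unliftConstant _ ErrorTerm = Nothing

readBuiltin :: Ground a -> Term -> Maybe a
readBuiltin (GroundResult _)   ErrorTerm = Just Failure
readBuiltin (GroundResult uni) term      = Success <$> unliftConstant uni term
readBuiltin (GroundValue uni)  term      = unliftConstant uni term
```

A round trip through makeBuiltin and readBuiltin is the identity, and an Error term reads back as EvaluationFailure, mirroring the real machinery above.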
In fact, we do not even need to have a closed data type (TypeGround
) and instead can just have a type class as before, but this time without passing an evaluator to readKnown
, because it's not needed to implement the necessary instances. But I'll stick to the data type for now, we can always add a type class on top of it to get a nice API where the term-level part of a TypeScheme
is derived from its type arguments.
I also realized that the previous approach with a single type class used for both internal and external purposes was a dangerous thing. It would be fine for someone to provide their own KnownType
instance (since the user is given access to this class), but then they can affect how evaluation of a Plutus Core term proceeds, which is a security hole. With a single internal data type not exported to the user, there is no such pitfall.
I haven't yet got something working, but things seem to make sense.
As shown above the problems with polymorphic builtin types are mostly technical and related to getting constraints right. However for newtype wrappers like Sealed
we can just ignore these problems. Whatever holds for any type a
from a universe, also holds for Sealed a
, because it's just a wrapper around a
. And therefore we can just ignore Sealed
when it comes to Everywhere
and friends.
Haven't finished my thought. So basically we can get away with hardcoding Sealed
without implementing a full machinery for handling polymorphic types. And they are not required for unlifting as we anyway only need to unlift monomorphic values (right?) (like [(Bool, Integer)]
). So I think to get something done it's best to avoid implementing full support for polymorphic values and to cut some corners here and there for the time being.
Dealing with polymorphic types seems to be more or less orthogonal to the core of the approach, so we can add polymorphism later.
I.e. instead of requiring each type to be liftable and unliftable we can require each type to be in the current universe. Then the burden of turning a term into an unliftable one is on the user, but the user can just use functions or a type class that we'll provide (i.e. we probably can make exactly the same API as with internal unlifting).
Unfortunately, this is troubling. Consider unlifting of products: we have PLC's pair integer bool
and need to get Haskell's (Integer, Bool)
from it. With the unlifting that we have currently, the entire structure is unlifted at once: we do not need to require either Integer
or Bool
to be in the current universe. We just extend the universe with a single type, (Integer, Bool)
, and unlift directly into it. But with explicit unlifting we'd have to unlift Integer
and Bool
separately and then combine the resulting values together into a tuple. But this means that we have to add not only (Integer, Bool)
to the universe, but also both Integer
and Bool
.
And from what I see this is not only a problem with unlifting into specialized polymorphic types. For a type of trees with integers in the leaves, we'll have to either require integers to be in the current universe or hardcode the logic of "unlift an integer and immediately wrap it as a leaf, so that there is no need to add integers to the current universe". This monomorphic case might be solvable, but the polymorphic one does look irritating.
And note that adding several new types to the current universe is not something that can be done easily. In (a, b)
the types a and b are independent of each other, and so we want to unlift them separately, but that means that we extend the universe in two distinct incompatible ways: with the a
type to unlift into a
and with the b
type to unlift into b
. And then we need to unify those two differently extended universes, which is hard technically. Constraints might sometimes help in cases like that, but see the problems with Includes
and Extend
described above.
In general, whenever the user wants to unlift a data structure, they do not really care about pieces of the data structure. "Just do the right thing".
Except, as described above, there are now two ways to embed Haskell values: deeply into a Scott-encoded data structure and shallowly as a Constant
wrapper around a Haskell value. And therefore we can unlift from a deeply or shallowly embedded value. So the user might actually want to specify which way to unlift values to use in each particular situation.
Also, how should we unlift, say, tuples with manual unlifting? Is it
unlift . PLC.mapTuple unlift unlift
or
Haskell.mapTuple unlift unlift . unlift
or something else? The more I think about manual unlifting, the less I like the idea.
I started implementing manual unlifting just to check it out, but then...
But with explicit unlifting we'd have to unlift Integer
and Bool
separately and then combine the resulting values together into a tuple. But this means that we have to add not only (Integer, Bool)
to the universe, but also both Integer
and Bool
.
... for unlifting ([(Integer, Bool)], ByteString)
we need to add
Integer
Bool
(Integer, Bool)
[(Integer, Bool)]
([(Integer, Bool)], ByteString)
to the universe. Automatically. While I believe this is doable, I don't think doing this is a good idea. I should just give up on the idea of manual unlifting in the form that I've been thinking of it and consider other opportunities.
The current plan is to implement an MVP (see the latest comment about it after which I started doing research on whether there is a better option). The PR is #1596.
Another problem we have is that with this machinery in place we can embed values in a shallow way (just wrap a Haskell value in the Pure
constructor) and a deep way (convert a Haskell list to a PLC list, for example -- we already do that), and so we need a way to distinguish between the two ways to embed values.
I also realized that the previous approach with a single type class used for both internal and external purposes was a dangerous thing. It would be fine for someone to provide their own KnownType instance (since the user is given access to this class), but then they can affect how evaluation of a Plutus Core term proceeds, which is a security hole. With a single internal data type not exported to the user, there is no such pitfall.
There seems to be a surprisingly convenient way of solving these problems: we extend the KnownType
class with a type family that returns a Visibility
:
data Visibility
    = Internal
    | External

class PrettyKnown a => KnownType a uni where
    type VisibilityOf a :: Visibility
    type VisibilityOf a = 'External
    <...>
and require in the constant application machinery that the Visibility
of a
must be Internal
(so that the user can't extend evaluation with bogus rules). Then every deep conversion between Haskell and PLC must be wrapped in the Deep
newtype
and every shallow embedding of Haskell into PLC must be wrapped in Shallow
. Looks like this:
newtype Shallow a = Shallow
    { unShallow :: a
    }

newtype Deep a = Deep
    { unDeep :: a
    }

instance (Evaluable uni, uni `Includes` a, PrettyKnown a) => KnownType (Shallow a) uni where
    type VisibilityOf (Shallow a) = 'Internal

    toTypeAst _ = constantType @a Proxy ()

    makeKnown (Shallow x) = constantTerm () x

    readKnown (Evaluator eval) term = do
        res <- makeRightReflectT $ eval id mempty term
        case extractValue res of
            Just x -> pure $ Shallow x
            _      -> throwError "Not an integer-encoded Char"

-- The @VisibilityOf a ~ 'External@ constraint is not really needed, I suppose,
-- but semantically it seems better to have it.
instance (KnownType a uni, VisibilityOf a ~ 'External) => KnownType (Deep a) uni where
    type VisibilityOf (Deep a) = 'Internal

    toTypeAst = <...>
    makeKnown = makeKnown . unDeep
    readKnown eval = fmap Deep . readKnown eval
So we have a single Shallow
instance once and for all: any Haskell value can be embedded into PLC as long as its type is in the universe. While the deep embedding is specific to a particular type, e.g.:
instance Evaluable uni => KnownType Bool uni where
    toTypeAst _ = bool
    makeKnown b = if b then true else false
    readKnown (Evaluator eval) term = <...>
Notice that this VisibilityOf
machinery is invisible to people writing "unlifting" instances like this one.
Name meanings then look like this:
typedLessThanInteger
    :: Evaluable uni => TypedBuiltinName uni '[Shallow Integer, Shallow Integer] (Deep Bool)
Here we explicitly specify how to embed values: shallowly or deeply. Now we don't need to guess how to unlift a type -- it's all in the type signatures.
And at the same time for unlifting Deep
is the default and only Shallow
needs to be mentioned explicitly, e.g.:
test :: EvaluationResult (Either Text ((Shallow Integer, ()), (Char, Bool)))
test = readKnownCk @_ @DefaultUni $ makeKnown ((Shallow (5 :: Integer), ()), ('a', True))
However if we ignore the security argument then shallow and deep unlifting can be distinguished by whether a type is wrapped in Shallow
or not. If not, then it's a deep embedding. And no Visibility
is needed (as well as the type family).
I've tried the former approach, it seems to work well, but I'll implement the latter, because it's easy to extend it to the former and I don't want to complicate right now what is already complicated.
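A minimal self-contained model of the shallow/deep distinction may help here. TmTrue/TmFalse stand in for the Scott-encoded booleans, and all the names below are illustrative rather than the real API:

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}

data U a where
    UInt :: U Integer

-- A toy term type: a shallow constant node, plus stand-ins for the
-- deeply (Scott-) encoded booleans.
data Term = forall a. Constant (U a) a | TmTrue | TmFalse

newtype Shallow a = Shallow { unShallow :: a }
newtype Deep    a = Deep    { unDeep    :: a }

class Known a where
    makeKnown :: a -> Term

-- Shallow: wrap the Haskell value in a Constant node directly.
instance Known (Shallow Integer) where
    makeKnown (Shallow n) = Constant UInt n

-- Deep: convert the Haskell value into a term of the object language.
instance Known (Deep Bool) where
    makeKnown (Deep b) = if b then TmTrue else TmFalse

isConstant :: Term -> Bool
isConstant Constant{} = True
isConstant _          = False
```

The wrapper chosen at the call site decides which embedding is used, which is exactly the "it's all in the type signatures" point above.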
I was bored, so I started thinking about this again. The MVP that I was going to implement was solving two problems at once: the extensible builtins problem and the unlifting one. But it occurred to me that there is a sweet spot between "no unlifting at all" (but then what do we do about types like Bool
-- just adding Bool
to the default universe?) and "any unlifting during evaluation" (which requires adding new types to the universe during evaluation, which is what causes all the problems), and that sweet spot is "only allow unlifting that doesn't require adding new types to the universe" (i.e. this is what we have now for the hardcoded universe). This way we still can lift and unlift things like ()
and Bool
(even though it's a bit hacky), and all the internal machinery (KnownType
instances for EvaluationResult
, OpaqueTerm
, etc) works the same way it works currently (which is good).
However it's not clear whether we can untangle evaluation from unlifting. I tried that previously and failed, but there might be alternative approaches. In any case, first adding extensible builtins without breaking anything and only then considering various approaches to the harder problem of unlifting seems like a good plan. It's conservative and a huge step forward at the same time.
that sweet spot is "only allow unlifting that doesn't require adding new types to the universe"
After several hours of thinking it struck me that we can extend universes in a different way. Instead of using
data Extend b uni a where
    Extension :: Extend b uni b
    Original  :: uni a -> Extend b uni a
we can make universes higher-order: uni :: (* -> *) -> * -> *
, where the first argument of uni
is an extension of it (i.e. each uni
has a constructor with the following type: ext a -> uni ext a
) and the last one remains the same. This way we can define all built-in names in the forall ext. uni ext
universe and that will allow us later to extend this universe by instantiating ext
to some particular subuniverse without disturbing any built-in names that are already in scope, because their types explicitly say that any ext
is fine.
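A sketch of what such a higher-order universe could look like (all names here are hypothetical):

```haskell
{-# LANGUAGE GADTs #-}

-- A universe parameterised by its own extension:
-- the UExt constructor embeds an arbitrary extension 'ext'.
data U ext a where
    UInt :: U ext Integer
    UExt :: ext a -> U ext a

-- A builtin's type code written against any 'ext' keeps working
-- no matter how 'ext' is later instantiated.
intCode :: U ext Integer
intCode = UInt

-- A one-type extension, picked later without disturbing intCode.
data BoolExt a where
    BoolExt :: BoolExt Bool

boolCode :: U BoolExt Bool
boolCode = UExt BoolExt

-- An interpretation over the extended universe.
name :: U BoolExt a -> String
name UInt           = "integer"
name (UExt BoolExt) = "bool"
```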
This means that we will be able to add new types to the universe during unlifting, but we won't be able to add new built-in names that reference those types. So we still have a restriction, but it's relaxed now and unlifting of things like ()
and Bool
becomes much nicer.
Having that ext
parameter is also good, because we can try to make it unordered (the unordered-effects
blog post that I wrote is a direct consequence of this research on dynamic builtins), since we always extend universes with a single type at a time (no type constructors, etc). Universes in general encode an infinite number of types (e.g. Integer
, [Integer]
, [[Integer]]
, ...), so trying to make a whole universe unordered is completely hopeless.
Anyway, we should start with first-order universes to get extensible builtins. Making universes higher-order later shouldn't pose any problems.
By the way, we currently do support dynamic extension of the set of built-in names during unlifting, but we don't actually need it. We needed it earlier and we will need it in future, but right now this is a feature that is not used anywhere, I think.
This is a sub-issue of #487 that is specifically about dynamic builtin types and their future implementation, which is meant to allow us to embed arbitrary Haskell values into the AST of Plutus Core in a well-controlled manner.
We want to extend the type of Term
with two additional variables: dyn
and tydyn
.
The idea to extend Term
with the dyn
type variable is a really old one. Quoting a YouTrack thread (CGP-374):
The idea to add tydyn
is a bit younger. Quoting an e-mail thread:
However if we want to be generic over dynamic builtins and package them together freely and package those packages together as well (i.e. have extensible dynamic builtins), then we can't have a single Universe
. Do we need to solve an existential version of the expression problem? Sounds fun.