Closed effectfully closed 3 years ago
It seems we need the following builtin:
convert : all a b. a -> maybe b
See this for details.
But we do not allow parametric polymorphism in dynamic builtins right now and adding it is not that straightforward. Here are the problems:
Constant application must not throw away types, which it does currently (because we thought we'd never need type-dependent computations). It's not a big deal to keep types; however, I've been willing to try a completely different constant application machinery since PlutusFest, where I realized that there probably exists a way to get the benefits of saturated builtins without their issues (which I've never described, because I didn't want to bring this huge and not super important topic up). So I'd rather completely rewrite the constant application machinery to what I think might be a better version rather than tweak the current version, which I hope is going to vanish.
Let's say we have this dynamic builtin name: monoconst : all a. a -> a -> a
which takes two arguments and returns the first one. How are we going to evaluate monoconst {list bool} nil nil
? Of course, in order to evaluate nil
we need to use the dynamic builtin types machinery, namely the readDynamicBuiltin
function. That's not a problem, we already do that. But how do we know which type we're expecting to get? Looking at the example we see that it's supposed to be the denotation of list bool
, i.e. List Bool
, but how do we program that? We need to go from a PLC type to a Haskell type. Right now we do the opposite via the dynamic builtin types machinery, i.e. go from a Haskell type to a PLC type. But the other direction looks complicated. Not because it requires us going from a weakly typed thing to a strongly typed thing (we do that all the time, see Language.PlutusCore.Constant.Name.withTypedBuiltinName
for example), but because how would you actually realize that list bool
is a list of booleans? The list bool
type in PLC is a rather big AST, do we need to parse ASTs in order to figure out what the hell they actually mean?
But there is another thing we might do: since we have parametricity in the metalanguage, we can just not use the dynamic builtins machinery for parametric things. We can pass the AST directly whenever it's the case that we can't do anything with the value it represents due to parametricity! This means that instead of evaluating the expression from above like this (pseudocode):
Plc.monoconst {list bool} nil nil
~ make (Haskell.monoconst @(List Bool) (read nil) (read nil))
~ make (Haskell.monoconst @(List Bool) [] [])
~ make []
~ nil
we can evaluate it like this:
Plc.monoconst {list bool} nil nil
~ Haskell.monoconst @Term nil nil
~ nil
The trade-off is that previously we always checked at runtime that types don't go wrong (some extra paranoia is always a good thing), while in the above we aren't checking that the type of nil
is list bool
. But maybe we can fix this as well by calling the actual type checker at runtime. Though, I guess we can live without type checking things at runtime for now.
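The pass-the-AST-through idea can be sketched in a few lines of Haskell (the `Term` type here is a toy stand-in for the real Plutus Core AST, and all names are illustrative): a fully parametric builtin like `monoconst : all a. a -> a -> a` cannot inspect its arguments, so its Haskell denotation can be instantiated at `Term` and the ASTs passed through unchanged, with no unlifting at all.

```haskell
-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Term Term
  deriving (Show, Eq)

-- Denotation of @monoconst : all a. a -> a -> a@: by parametricity it
-- cannot do anything with its arguments except return one of them, so
-- passing raw ASTs is safe.
monoconst :: a -> a -> a
monoconst x _ = x

-- Plc.monoconst {list bool} nil nil ~ Haskell.monoconst @Term nil nil ~ nil
main :: IO ()
main = print (monoconst Nil (Cons Nil Nil))
```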
Also we need to handle dynamic builtin names which are not completely parametric, like this one for example:
reverse : all a. list a -> list a
It's supposed to compute like this:
PLC.reverse {bool} (cons true (cons false nil))
~ make (Haskell.reverse @Term (read @(List Term) (cons true (cons false nil))))
~ make (Haskell.reverse @Term [true, false])
~ make [false, true]
~ cons false (cons true nil)
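This partially parametric case can be sketched in Haskell (toy `Term` type; `readSpine` and `makeSpine` are hypothetical helpers): only the list spine is unlifted, the elements stay as opaque `Term`s, and Haskell's `reverse` is instantiated at `Term`.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Term Term | B Bool
  deriving (Show, Eq)

-- Unlift only the spine of a list; elements remain opaque 'Term's.
readSpine :: Term -> Maybe [Term]
readSpine = \case
  Nil       -> Just []
  Cons x xs -> (x :) <$> readSpine xs
  _         -> Nothing

-- Lift a spine back into the AST.
makeSpine :: [Term] -> Term
makeSpine = foldr Cons Nil

-- PLC.reverse {bool} (cons true (cons false nil))
main :: IO ()
main = print (fmap (makeSpine . reverse)
                   (readSpine (Cons (B True) (Cons (B False) Nil))))
```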
I've recently (in #642) removed TypedBuiltinBool as a special case and realized that we can remove all special cases, even integers and bytestrings, like that. This is needed, because right now we can throw an error from a builtin name that normally returns a Bool, but not from one that returns an Integer, which is a silly limitation. The challenge however is in handling sizes, which I don't know how to approach at the moment.
So in the end I feel like it's easier and more sensible to just rewrite the whole dynamic builtin types and constant application machineries. This requires time, but certainly less time than handling all the described issues separately, because they affect big chunks of code and interfere.
Can you easily describe the changes you want to make, or would it be easier to just write it?
It is a bit awkward that the current mechanism for doing evaluation in Haskell doesn't work very well once we start wanting to pull types across.
Maybe this is a horrible idea, but we could have some builtins that don't use the typed builtin mechanism, but rather operate directly on the PLC terms. I think this makes particular sense for convert
, which is after all really a meta-language primitive, in that it's going to invoke the typechecker, which operates on terms anyway.
Can you easily describe the changes you want to make, or would it be easier to just write it?
I'm not yet sure what the changes should be.
It is a bit awkward that the current mechanism for doing evaluation in Haskell doesn't work very well once we start wanting to pull types across.
I really need to try implementing stuff before I can be sure it's awkward. What I described above makes sense to me and I'd be satisfied by a solution that handles those difficulties and doesn't add any weirdness.
Maybe this is a horrible idea, but we could have some builtins that don't use the typed builtin mechanism, but rather operate directly on the PLC terms. I think this makes particular sense for
convert
, which is after all really a meta-language primitive, in that it's going to invoke the typechecker, which operates on terms anyway.
I do not think it's a horrible idea at all and I agree that it makes perfect sense for convert
. However there can be other builtins that require parametric polymorphism, like the reverse
one mentioned above. So if we can make the machinery work once and for all builtins, that would be great. If we can't or it's too heavy/dirty, I'll consider other options, of course.
I'm considering right now changing the whole dynamic builtins story to something simpler.
This is a sub-issue of #487 that is specifically about dynamic builtin types and their future implementation, which is meant to allow us to embed arbitrary Haskell values into the AST of Plutus Core in a well-controlled manner.
This seems obsolete, because in #724 and #733 we managed to implement decoding of values without this heavy feature.
I've been willing to try a completely different constant application machinery since PlutusFest where I realized that there probably exists a way to get the benefits of saturated builtins without their issues
I tried that and it didn't work.
But there is another thing we might do: since we have parametricity in the metalanguage, we can just not use the dynamic builtins machinery for parametric things. We can pass the AST directly whenever it's the case that we can't do anything with the value it represents due to parametricity!
This is my next focus.
I realized that the current way I handle sizes in dynamic builtins and constant applications is outdated. I believe we can do much better than that.
Still believe that, but there are problems (that we discussed by email) and this development stream is suspended, because the cast
thing (previously called convert
) is more important.
This seems obsolete, because in #724 and #733 we managed to implement decoding of values without this heavy feature.
I think this is right. I think we probably still want dynamic builtin types so we can have e.g. efficient strings when we want them. And I still think it would be good to be able to have no static builtins at all.
I think we probably still want dynamic builtin types so we can have e.g. efficient strings when we want them. And I still think it would be good to be able to have no static builtins at all.
Good point, I agree.
Implemented parametric polymorphism in #751 the way it's described above.
An interesting thought has occurred to me. Right now we use some weirdness in order to unlift products and especially sums. And unlifting of lists is not super nice either. What if we add a separate evaluator (and maybe unify it with the CEK machine at some point) that looks roughly like this (pseudocode):
unliftEvaluate :: Term tyname name res () -> EvaluationResult res
(where res
specifies the expected resulting type, hence we need to extend the AST probably as well)? The idea here is that when we unlift a data type, we always know a value of exactly what type we want to get in the end. But there is also one very important thing that we know too: if we apply an Unwrap
ped Plutus Core data type to a bunch of Haskell constructors (wrapped somehow into the Plutus Core AST), then we know that any call to the constructor is a final call and we won't need to compute anything else (it's the same logic as with single pattern matches that allow to unlift recursive data structures, because recursion is implicit in type class resolution). This means that we do not need to lift the result of such an application back to Plutus Core like we do normally, instead we can just return it from the abstract machine. And this does not require any Typeable
nonsense, because when we unlift a data type, all supplied constructors are required to construct the same data type, so we exactly know the resulting type.
This technique would allow us to uniformly and very easily handle unlifting of arbitrary data structures. Need to unlift a Scott-encoded data type? Just apply it to its Haskell constructors and evaluate. The technique is a little bit heavy, but may be worth it.
On the other hand it might be redundant if we find an acceptable way to embed the entire Haskell into Plutus Core, because we then can just add a constructor ExpectedResult
and match on it in order to ensure that the result of a computation has the expected type.
I'm still not sure I understand this suggestion. I think you're saying something like: for a value that is of a datatype, we can not just pattern match on it, we can fold it. And a fold can be type-changing - we can have a different output type, so we can fold into a Haskell value without having to go back into PLC.
But I think this still sticks on the problem that brought us to embedding Haskell into PLC: how do we apply our PLC fold/pattern match to a Haskell constructor? So far we've discussed:
I'm not sure whether you're gesturing at another way of solving this or what?
I think you're saying something like: for a value that is of a datatype, we can not just pattern match on it, we can fold it
No. It's one of the key components: we literally need only to perform a single pattern match, we do not need any folds. You've seen this note, right? We can use the same trick for direct unlifting of data types that doesn't go through a generic representation.
Spoiler: I'm not sure that what I wrote above would work. But let me show how it could work, because in the end I'll implement something similar.
Imagine a PLC list in normal form (I will ignore type instantiations):
cons 1 (cons 2 (cons 3 nil))
in order to convert it to the corresponding Haskell list we need neither folds, nor to embed Haskell into PLC. Here is what we do: we unwrap and apply this term to two dynamic names: $nil
(standing for Haskell's []
) and $cons
(standing for Haskell's (:)
). We do not need to embed the two Haskell constructors into PLC as we can just use dynamic names the way we do it in other places. The embedding is only required if we want to construct a Haskell value inside PLC and manipulate it somehow inside PLC.
Hence
how do we apply our PLC fold/pattern match to a Haskell constructor?
that's not a problem at all, we can use dynamic names. The problem is how to deal with the result that the constructor returns once a constant application is computed. But we don't have this problem.
Okay, so we apply the PLC list to dynamic names standing for Haskell constructors and get something like this:
unwrap (cons 1 (cons 2 (cons 3 nil))) $nil $cons
In a few steps it reduces to
$cons 1 (cons 2 (cons 3 nil))
Note that the first $cons
stands for a meta-level (:)
now.
Then our evaluation engine unlifts 1
and cons 2 (cons 3 nil)
, transforms $cons
into its denotation and applies the denotation to the arguments. I.e. the expression reduces to
1 : [2, 3]
Note that the tail of the list gets unlifted as well even though we performed only a single pattern match! That's due to the nature of the constant application machinery that always unlifts all arguments passed to a builtin (modulo the recently implemented parametric polymorphism).
What I've described so far is how things already work. However instead of just returning [1, 2, 3]
we currently transform that list back to PLC. Again, due to the nature of the constant application machinery that always converts the result of a constant application back to PLC. But we don't want to do that! We just want to return the result. And we know what type it's supposed to have, so we don't need any Typeable
, unsafeCoerce
or anything of this kind.
Hence what I proposed above is to be able to communicate to the evaluator for what builtins the evaluator should not convert the result of a constant application back to PLC and instead should return it as is.
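As an illustration only (toy `Term` type, with the would-be denotations of `$nil` and `$cons` inlined): unlifting a normal-form list where each constructor application produces a Haskell value that is returned as is, never converted back to PLC.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Term Term | I Integer
  deriving (Show, Eq)

-- The crucial bit is the result type: a Haskell list, returned directly.
unliftIntList :: Term -> Maybe [Integer]
unliftIntList = \case
  Nil           -> Just []                      -- denotation of $nil
  Cons (I n) xs -> (n :) <$> unliftIntList xs   -- denotation of $cons
  _             -> Nothing

main :: IO ()
main = print (unliftIntList (Cons (I 1) (Cons (I 2) (Cons (I 3) Nil))))
```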
... which would work for normal-form terms, but with non-normal terms we may actually want to do something with the resulting Haskell list on the PLC side. For example we may have
(\x y -> y) (cons 1 nil) (cons 2 nil)
And in that case we do not want to blindly return [1]
as the firstly computed list, instead we want to actually evaluate the whole thing and get [2]
.
However, the terms that we get there are in fact in WHNF. And it looks like WHNF might be enough to make things work, but I'm not really sure. Need more thinking and I'll just try the idea out at some point, I guess.
Makes sense?
You've seen this note, right?
Yes, but there we're using typeclass dispatch on the Haskell side to decide what function to call on the results. Is the idea to do the same here? If so I guess I'm confused as to how it's an "evaluator" rather than just the unlifting we have already?
... which would work for normal-form terms, but with non-normal terms we may actually want to do something with the resulting Haskell list on the PLC side
Yes, this seems to be an issue. Aren't you proposing something like normalize-and-then-unlift? Isn't that what we already do?
Yes, but there we're using typeclass dispatch on the Haskell side to decide what function to call on the results. Is the idea to do the same here? If so I guess I'm confused as to how it's an "evaluator" rather than just the unlifting we have already?
What we have currently evaluates like this:
readDynamicBuiltinCek (cons 1 (cons 2 nil))
case Right (1, readDynamicBuiltinCek (cons 2 nil)) of
Left () -> []
Right (x, xs) -> x : xs
1 : readDynamicBuiltinCek (cons 2 nil)
1 : case Right (2, readDynamicBuiltinCek nil) of
Left () -> []
Right (x, xs) -> x : xs
1 : 2 : case Left () of
Left () -> []
Right (x, xs) -> x : xs
1 : 2 : []
[1, 2]
What we want is:
readDynamicBuiltinCek (cons 1 (cons 2 nil))
1 : readDynamicBuiltinCek (cons 2 nil)
1 : 2 : readDynamicBuiltinCek nil
1 : 2 : []
[1, 2]
I.e. basically the same, but without the Either
and (,)
indirections.
We can achieve that by simply interpreting each PLC's cons
and nil
as Haskell's (:)
and []
respectively. The only thing that we need is a way to convince an evaluator not to wrap constant application results back into PLC, but rather return Haskell values directly.
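The contrast can be sketched like this (toy `Term` type; both functions are illustrative stand-ins, not the actual readDynamicBuiltinCek):

```haskell
{-# LANGUAGE LambdaCase #-}

-- Toy stand-in for the Plutus Core AST.
data Term = Nil | Cons Integer Term
  deriving (Show, Eq)

-- Current style: a generic sum-of-products step introduces Either and
-- pair indirections at every node.
readIndirect :: Term -> [Integer]
readIndirect t = case step t of
    Left ()       -> []
    Right (x, xs) -> x : readIndirect xs
  where
    step = \case
      Nil       -> Left ()
      Cons x xs -> Right (x, xs)

-- Desired style: interpret 'cons' and 'nil' directly as '(:)' and '[]'.
readDirect :: Term -> [Integer]
readDirect = \case
  Nil       -> []
  Cons x xs -> x : readDirect xs

main :: IO ()
main = print (readIndirect ex, readDirect ex)
  where ex = Cons 1 (Cons 2 Nil)
```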
Aren't you proposing something like normalize-and-then-unlift? Isn't that what we already do?
Right now we only reduce up to WHNF inside the abstract machines. If we attempt to perform some reductions separately and then unlift a PLC value, bad things will happen. WHNF seems enough, but maybe it's not.
Thinking about it again, it's probably the case that embedding Haskell into PLC is worth a try. But regardless of what we do, I believe we don't need folds; it's an obsolete idea and we can have something much simpler.
Also, at some point I tried to define a tail-recursive function to map a tree:
data Tree a
= Leaf a
| Fork a (Tree a) (Tree a)
instance Functor Tree where
fmap f = go [] where
go stack (Leaf x) = roll stack (Leaf (f x))
go stack (Fork x l r) = go (Right (x, r) : stack) l
roll [] t = t
roll (Right (x, r) : stack) t = go (Left (t, x) : stack) r
roll (Left (l, x) : stack) t = roll stack (Fork (f x) l t)
This simulates construction of a tree using the left fold. It took me an entire evening (strangely, I was unable to find a tail-recursive map over trees in either the OCaml or the ML world) and it would be hard to generalize this to other interestingly shaped data types. Unless I missed something, we're lucky we can avoid constructing data types in accumulators.
(Update: tried googling again and found a nice approach to tail-recursive maps over trees in this Scala answer).
Here are some thoughts that I have for now.
First of all, we do not want to change extensible builtin names yet. Let them be Text
for now; we can change this later, as it does not allow us to do anything new and important, but requires amending a lot of stuff throughout the entire codebase.
So what we want to do currently is to extend Plutus Core with arbitrary Haskell types in a well-controlled manner. I envision this as follows: extend the Term
data type as
data Term tyname name ext ann
= ...
| Pure ext
type ExtTerm tyname name uni ann = Term tyname name (exists a. (uni a, a)) ann
Where uni
is a universe of additional types and exists
is pseudocode for existential quantification. This way we can put arbitrary a
into the AST of PLC provided a
is included in the universe of additional types. We can't just define ExtTerm
directly and avoid extending Term
with the generic Pure
altogether, because existentials often break deriving and it's better to preserve the ability to derive various instances, provided we do not pay much for this (I think we don't).
evaluateCk :: GEq uni => ExtTerm tyname name uni ann -> ExtTerm tyname name uni ann
where GEq
is this. We need this constraint in order to check whether a builtin name can be applied to an argument whose type lies in uni
(we also need to decide how to handle parametric polymorphism).
-- We probably want to use that together with `fastsum`.
-- But also allow @Either@ and use type families for computing the index of a type,
-- because we want to extend @uni@ in order to unlift values.
class uni `Includes` a where
uniVal :: uni a
data TypeScheme uni a r where
TypeSchemeResult :: uni `Includes` a => Proxy a -> TypeScheme uni a a
TypeSchemeArrow :: uni `Includes` a => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
...
For example,
typedTakeByteString
:: (uni `Includes` Integer, uni `Includes` ByteString)
=> TypedBuiltinName uni (Integer -> ByteString -> ByteString)
typedTakeByteString =
TypedBuiltinName TakeByteString $
Proxy `TypeSchemeArrow` Proxy `TypeSchemeArrow` TypeSchemeResult Proxy
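Here is a self-contained toy version of this TypeScheme idea (with the `uni` index spelled out, and String standing in for ByteString to avoid dependencies); for instance, the arity of a builtin falls out of its scheme:

```haskell
{-# LANGUAGE GADTs, MultiParamTypeClasses, FlexibleInstances #-}

import Data.Proxy

-- A toy universe of built-in types.
data ExampleUni a where
  UniInt    :: ExampleUni Integer
  UniString :: ExampleUni String

class uni `Includes` a where
  uniVal :: uni a

instance ExampleUni `Includes` Integer where uniVal = UniInt
instance ExampleUni `Includes` String  where uniVal = UniString

-- A type scheme for a builtin whose arguments all lie in the universe.
data TypeScheme uni a r where
  TypeSchemeResult :: uni `Includes` a => Proxy a -> TypeScheme uni a a
  TypeSchemeArrow  :: uni `Includes` a
                   => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r

-- The arity of a builtin can be read off its scheme.
arity :: TypeScheme uni a r -> Int
arity (TypeSchemeResult _)  = 0
arity (TypeSchemeArrow _ s) = 1 + arity s

-- A toy analogue of 'typedTakeByteString'.
typedTake :: TypeScheme ExampleUni (Integer -> String -> String) String
typedTake = Proxy `TypeSchemeArrow` (Proxy `TypeSchemeArrow` TypeSchemeResult Proxy)

main :: IO ()
main = print (arity typedTake)
```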
Another problem we have is that with this machinery in place we can embed values in a shallow way (just wrap a Haskell value in the Pure
constructor) and in a deep way (convert a Haskell list to a PLC list, for example -- we already do that), and so we need a way to distinguish between the two ways to embed values.
Not sure whether this is going to work, but at least we have something to start with.
Started implementing this stuff. I decided not to go with the approach described here:
A funny thought: we do not really need to change the AST in order to be able to embed arbitrary Haskell into Plutus Core. We can just carry an environment of Haskell values together with a Plutus Core term which references values in the environment by their indices (say, in the Var constructor, but it's better to have a separate MetaVar constructor for this). At the very least this should allow us to play with the ideas without needing to fix 10000 errors that adding additional parameters to Term would cause.
because it's an additional indirection and most of the 10000 errors can be fixed by find-and-replace.
Now that I'm looking at it closely, I feel like we'll have to rethink quite a lot of stuff. E.g. previously we needed to pass evaluators around for unlifting of values (and we currently normally do unlifting of values during regular evaluation of PLC expressions deep inside the constant application machinery). But with proper extensible builtins, we no longer need that! Extensible builtins will allow us to unlift a deeply embedded value (e.g. a PLC list) to a shallowly embedded value (a Haskell list wrapped as a PLC value) and then we can extract ("unembed") the result by means of simple pattern matching.
Here is how that would look for lists:
$nil : all a. metalist a
$cons : all a. a -> metalist a -> metalist a
unliftList : all a. list a -> metalist a
unliftList {a} xs = unwrap xs {metalist a} $nil (\x xs -> $cons x (unliftList xs))
The problem here however is that unliftList
unlifts only the spine of a list, but not the actual elements. In order to unlift the elements, we'll also need to map a function that unlifts a single element over the PLC list beforehand (or map such a function over the Haskell list afterwards). Or just directly write
unliftList : all a b. (a -> b) -> list a -> metalist b
unliftList {a} {b} unliftA xs = unwrap xs {metalist b} $nil (\x xs -> $cons (unliftA x) (unliftList unliftA xs))
Typeclasses are perfectly suitable for this and we use them currently, but if we decide to simplify things and not to pass evaluators around (which also allows to make the constant application machinery pure again) and keep most of the unlifting logic on the Plutus side, then we don't have type classes there and need to deal with that.
It is not clear to me at the moment whether we can get the benefits of the current approach, but do not inherit its complexity. But some unlifting of values seems to be doable without the complexity that we currently have.
I'll investigate further and probably I'll just implement things in two stages:
We can't just define
ExtTerm
directly and avoid extendingTerm
with the generic Pure altogether, because existentials often break deriving and it's better to preserve the ability to derive various instances, provided we do not pay much for this (I think we don't).
We actually do. The thing is, we want to parameterize both Type
and Term
by the same uni
like this:
data Type tyname uni ann
= forall a. TyMeta ann (uni a)
| ...
data Term tyname name uni ann
= forall a. Meta ann (uni a) a
| ...
This way we have representations for embedded types and embedded values in PLC.
For this, we can parameterize both Type
and Term
by distinct type variables (tydyn
and dyn
respectively) and instantiate them in different ways later to get extensible builtins (I elaborate on this in the first message in this thread), but we already have plenty of type variables. There is a sensible alternative:
data Some f = forall a. Some (f a)
data SomeOf f = forall a. SomeOf (f a) a
data Type tyname uni ann
= TyMeta ann (Some uni)
| ...
data Term tyname name uni ann
= Meta ann (SomeOf uni)
| ...
This way we hide all the existentiality into two dedicated data types (Some
and SomeOf
), thus deriving for Type
and Term
works perfectly well, provided you first define the relevant instances for Some
and SomeOf
(with the two additional type variables approach we can define instances for Some
and SomeOf
after deriving them for Type
and Term
, but this doesn't seem to be a huge benefit).
Hiding existentiality in bespoke data types is what I use in the other project and it works well, so I'm going to try that out with Plutus.
Unsurprisingly, we run into difficulties while trying to define various instances in this existential setting. For example, in order to derive NFData
for Term
, we need to define NFData
for SomeOf uni
, which means that we need to rnf
the a
from SomeOf uni
, but that a
is existentially quantified and so we can't just require it to have an NFData
instance. However, we can require each type from uni
to have that instance and then any particular a
will have it as well. Looks like this:
class KnownUni uni where
type Everywhere uni (constr :: * -> Constraint) :: Constraint
bring :: uni `Everywhere` constr => proxy constr -> uni a -> (constr a => r) -> r
bringApply
:: (KnownUni uni, uni `Everywhere` constr)
=> Proxy constr -> (forall a. constr a => a -> r) -> SomeOf uni -> r
bringApply proxy f (SomeOf uni x) = bring proxy uni $ f x
instance (KnownUni uni, uni `Everywhere` NFData) => NFData (SomeOf uni) where
rnf = bringApply (Proxy @NFData) rnf
And an example:
data ExampleUni a where
ExampleUniBool :: ExampleUni Bool
ExampleUniInt :: ExampleUni Int
instance KnownUni ExampleUni where
type Everywhere ExampleUni constr = (constr Bool, constr Int)
bring _ ExampleUniBool = id
bring _ ExampleUniInt = id
example = rnf $ SomeOf ExampleUniInt 42
In short:
- Everywhere applies a constraint to each of the types that a universe has and collects the constraints into a single final constraint
- bring ensures that each type from the universe has an appropriate instance, provided the entire universe satisfies the final constraint
- bringApply brings a particular instance into scope (relying on bring) and then applies a function that requires this instance to the value stored in SomeOf
Note that the KnownUni
machinery is a rather generic one and so it's suitable for other type classes, not just NFData
. This seems to be the most sensible approach regardless of whether we choose to use the parameterize-both-Type
-and-Term
-by-uni
approach or the parameterize-Type
-by-tycon
-and-Term
-by-con
one.
E.g. an Eq
instance for SomeOf uni
can be defined like this:
instance (GEq uni, KnownUni uni, uni `Everywhere` Eq) => Eq (SomeOf uni) where
SomeOf uni1 x1 == SomeOf uni2 x2 =
case uni1 `geq` uni2 of
Nothing -> False
Just Refl -> bring (Proxy @Eq) uni1 (x1 == x2)
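The same trick extends to other classes. Below is a self-contained sketch (toy universe) adding a Show instance for SomeOf, mirroring the NFData one above:

```haskell
{-# LANGUAGE ConstraintKinds, ExistentialQuantification, FlexibleContexts,
             GADTs, RankNTypes, TypeApplications, TypeFamilies,
             UndecidableInstances #-}

import Data.Kind (Constraint, Type)
import Data.Proxy

data SomeOf uni = forall a. SomeOf (uni a) a

class KnownUni uni where
  type Everywhere uni (constr :: Type -> Constraint) :: Constraint
  bring :: uni `Everywhere` constr => proxy constr -> uni a -> (constr a => r) -> r

bringApply
  :: (KnownUni uni, uni `Everywhere` constr)
  => Proxy constr -> (forall a. constr a => a -> r) -> SomeOf uni -> r
bringApply proxy f (SomeOf uni x) = bring proxy uni (f x)

-- Show works exactly like the NFData instance: bring the per-type
-- instance into scope, then apply 'show' to the wrapped value.
instance (KnownUni uni, uni `Everywhere` Show) => Show (SomeOf uni) where
  show = bringApply (Proxy @Show) show

data ExampleUni a where
  ExampleUniBool :: ExampleUni Bool
  ExampleUniInt  :: ExampleUni Int

instance KnownUni ExampleUni where
  type Everywhere ExampleUni constr = (constr Bool, constr Int)
  bring _ ExampleUniBool = id
  bring _ ExampleUniInt  = id

main :: IO ()
main = print (SomeOf ExampleUniInt 42)
```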
That sounds sensible, as usual I'm impressed you managed to come up with this! It's good that this seems generic, since we also need to handle things like Serialise
and friends.
Seems to work. Deriving for Type
& Term
works as well.
While looking at the sealed
builtin, I noticed that we need fancier constants too. A value of type sealed a
has to be some constant. So the sealed
constants need to be able to have integers, bytestrings, and arbitrary PLC types in them.
Roman tells me this should work with the approach he's developing, but just writing this down.
One of the things we want to do is extend the universe a term is defined in. Because that is how we're going to handle unlifting -- by adding the type that we're unlifting into to the current universe, computing on the PLC side and then extracting the Haskell result wrapped as a PLC value. However there is one annoyance: all the constraints that held in the original universe do not hold in the extended universe without additional trickery. Consider
-- | Convert a @nat@ to an @integer@.
--
-- > foldNat {integer} (addInteger 1) 1
natToInteger :: (TermLike term TyName Name uni, uni `Includes` Integer) => term ()
natToInteger = ...
Here we define a term that can include integers, which is specified by
uni `Includes` Integer
Then having
shiftConstants :: Term tyname name uni ann -> Term tyname name (uni :+: ext) ann
which extends the universe with ext
(:+:
comes from GHC.Generics
), we can embed natToInteger :: Term TyName Name uni ()
into Term TyName Name (uni :+: ext) ()
, but we won't be able to, say, apply this embedded term to some other term that contains integers, because
(uni :+: ext) `Includes` Integer
simply doesn't hold. It is of course morally true that the extended universe contains all the types from the original universe, but convincing GHC is not an easy thing here, because the extended universe contains not only those types, but also the types from the extension and this means we need an instance like
instance (uni `Includes` a) OR (ext `Includes` a) => (uni :+: ext) `Includes` a
and there is no OR
for constraints. It is probably possible to perform instance search manually by using type families or some other weirdness. Or something like Data.Reflection
might work, if we're willing just to generate instances on the fly:
withExtendedUniL :: uni `Includes` a => ((uni :+: ext) `Includes` a => c) -> c
withExtendedUniR :: ext `Includes` a => ((uni :+: ext) `Includes` a => c) -> c
But this is way too heavy and hacky.
One thing that we can try for now is to always extend universes with a single new type and don't let this type participate in instance search and instead always search in the original universe. Something like this:
data Extend b uni a where
Extension :: Extend b uni b
Original :: uni a -> Extend b uni a
instance uni `Includes` a => Extend b uni `Includes` a where
uniVal = Original uniVal
That might be enough for our particular use case, because we only want to dynamically extend universes for unlifting purposes, but that happens under the hood and is not visible to the user, who always deals with a single blockchain-specific universe of types.
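A self-contained sketch of this single-type extension (toy universe and Includes class; the real ones live elsewhere in the codebase). The key point is that Extension deliberately has no Includes instance of its own, so instance search only ever looks in the original universe:

```haskell
{-# LANGUAGE FlexibleInstances, GADTs, MultiParamTypeClasses #-}

-- A toy original universe.
data ExampleUni a where
  UniInt :: ExampleUni Integer

class uni `Includes` a where
  uniVal :: uni a

instance ExampleUni `Includes` Integer where
  uniVal = UniInt

-- Extend a universe with exactly one new type 'b'.
data Extend b uni a where
  Extension :: Extend b uni b
  Original  :: uni a -> Extend b uni a

-- Instance search always defers to the original universe.
instance uni `Includes` a => Extend b uni `Includes` a where
  uniVal = Original uniVal

-- The original universe's types remain available in the extended one.
extendedInt :: Extend Bool ExampleUni Integer
extendedInt = uniVal

main :: IO ()
main = case extendedInt of
  Original UniInt -> putStrLn "Integer found via the original universe"
```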
One thing that we can try for now is to always extend universes with a single new type and don't let this type participate in instance search and instead always search in the original universe.
Did that and it seems to work well.
While diving into the details of unlifting, I realized there is a small problem. Consider unlifting of tuples. We want to apply a PLC term representing a tuple to a dynamic builtin name representing Haskell's (,)
constructor. Note that since we know all the types, (,)
doesn't have to be polymorphic, it's just (,) :: A -> B -> (A, B)
for some particular A
and B
(in our generic case they're just free variables). Now since for stage 1 I'm implementing the old machinery in terms of the new one (to avoid breaking everything and because I'm not sure we'd no longer need the old machinery), the values of types A
and B
have to be converted from PLC to Haskell via the KnownType
machinery. And since we're constructing an actual Haskell tuple and then wrapping it as a PLC value, we have to add the type of tuples to our current universe of types. But this means that we convert from PLC to Haskell in one universe and then embed Haskell in PLC in another universe (the original universe extended with tuples). Which is a pretty annoying technical problem of how to unlift values in one universe and then wrap them in another one.
For now I'm just going to pretend we only have to deal with the extended universe and throw an error whenever something goes wrong (we already throw a lot of errors (purely) in that code, so it's not a big deal) and then we'll see whether this can be improved in stage 2.
Another stupid obstacle: currently KnownType.readKnown
has the following type signature:
readKnown :: Monad m => Evaluator Term m -> Term TyName Name () -> ReflectT m a
Once we've added the uni
type parameter there are two options:
readKnown :: Monad m => Evaluator Term m -> Term TyName Name uni () -> ReflectT m a
and
readKnown :: Monad m => Evaluator Term uni m -> Term TyName Name uni () -> ReflectT m a
I.e. the choice is between Evaluator
being able to handle any universe or the specific universe that the term we're unlifting is defined in. However the latter choice does not quite work, because we often want to extend the universe with an additional type (e.g. in order to unlift the Scott-encoded PLC's unit, we apply that term to Haskell's ()
and then match on the result in order to ensure we got ()
back).
So we're left with Evaluator
being universe-polymorphic (i.e. Evaluator Term m
). However we still have static builtins and that means we have to hardcode uni
to contain all the types that are currently static (but this time via constraints rather than on the AST level). This is fine as a first approximation, but we run into a more annoying problem; consider one definition from plutus-core-interpreter
:
evaluateInCekM :: EvaluateConstApp (Either CekMachineException) a -> CekM (ConstAppResult a)
evaluateInCekM a =
    ReaderT $ \cekEnv ->
        let eval means' = evaluateCekCatchIn $ cekEnv & cekEnvMeans %~ mappend means'
        in runEvaluateConstApp eval a
Here we wrap the CEK machine as an evaluator and use that for unlifting values (which we do during the execution of the CEK machine). But we also extend the environment used for unlifting (means'
) with the environment of dynamic builtin names that are currently in scope (cekEnv
). And we can't typify that when Evaluator
is polymorphic over uni
(i.e. there is forall uni. ...
inside Evaluator
), because it has to have the same uni
that we're currently in, otherwise it's impossible to merge the two environments, since they are defined in distinct universes.
This asks for something like Kripke semantics for evaluators. Instead of being parametric over environments or hardcoding environments we can say that any environment is fine as long as it's an extension of the original environment. But then it's not immediately clear whether it's possible to define the abstract machines in such an environment-extending way.
Not a showstopper, but requires more thinking.
However the latter choice does not quite work, because we often want to extend the universe with an additional type (e.g. in order to unlift the Scott-encoded PLC's unit, we apply that term to Haskell's () and then match on the result in order to ensure we got () back
Don't we always need just ()
? Or perhaps we could get away with integer
, if we wanted to use integers to e.g. tell us which branch we've got.
So could it be this?
readKnown :: Monad m => Evaluator Term (uni `with` integer) m -> Term TyName Name uni () -> ReflectT m a
Also, I'm a little confused since it sounds like you just discuss the case where we parameterize Evaluator
by universes. I'm not sure what goes wrong in the case where we have just Evaluator Term m
.
Don't we always need just ()? Or perhaps we could get away with integer, if we wanted to use integers to e.g. tell us which branch we've got.
So could it be this?
Yes, but that's what we were doing previously and it doesn't work for tuples for example, when you actually want to add tuples to the universe and add (,)
as a dynamic builtin name, so that you can construct a Haskell tuple on the PLC side and extract it later.
Also, I'm a little confused since it sounds like you just discuss the case where we parameterize
Evaluator
by universes. I'm not sure what goes wrong in the case where we have just Evaluator Term m
.
That's because I stupidly used "parametric over universe" instead of "universe-polymorphic". That evaluateInCekM
problem is about Evaluator Term m
. I'll edit the post.
Yes, but that's what we were doing previously and it doesn't work for tuples for example, when you actually want to add tuples to the universe and add (,) as a dynamic builtin name, so that you can construct a Haskell tuple on the PLC side and extract it later.
Okay, that makes sense - I would expect that we always want the type that we're unlifting to as well. Can we get away with just both of those?
My guess is not, because if we call makeKnown
recursively for subcomponents of the type, then we'll be wanting an evaluator with those types in its universe. So we need to be able to extend evaluators to handle a larger universe regardless.
That evaluateInCekM problem is about Evaluator Term m.
What confused me there was this sentence: "And we can't typify that when Evaluator is parametric over uni, because it has to have the same uni that we're currently in", which made me think that the problem was that we had different uni
parameters that we can't unify. So I don't see what the problem is if we don't have the uni
parameter.
I would expect that we always want the type that we're unlifting to as well.
It's rather more complicated than that. For example
instance KnownType a uni => KnownType (EvaluationResult a) uni where
    toTypeAst _ = toTypeAst @a Proxy
    makeKnown EvaluationFailure = Error () $ toTypeAst @a Proxy
    makeKnown (EvaluationSuccess x) = makeKnown x
    readKnown eval = mapDeepReflectT (fmap $ EvaluationSuccess . sequence) . readKnown eval
Here we do not add EvaluationResult a
to the universe.
Same for
instance (KnownSymbol text, KnownNat uniq, uni1 ~ uni2, Evaluable uni1) =>
        KnownType (OpaqueTerm uni2 text uniq) uni1 where
    toTypeAst _ =
        TyVar () . TyName $
            Name ()
                (Text.pack $ symbolVal @text Proxy)
                (Unique . fromIntegral $ natVal @uniq Proxy)
    makeKnown = unOpaqueTerm
    readKnown (Evaluator eval) = fmap OpaqueTerm . makeRightReflectT . eval mempty
And there are instances that do not extend the current universe, but instead require it to contain something. For example
data Meta a = Meta
    { unMeta :: a
    } deriving (Show, Generic, Typeable)
instance (Evaluable uni, uni `Includes` a) => KnownType (Meta a) uni where
    toTypeAst _ = constantType @a Proxy ()
    makeKnown (Meta x) = constantTerm () x
    readKnown (Evaluator eval) term = do
        res <- makeRightReflectT $ eval mempty term
        case extractValue res of
            Just x -> pure $ Meta x
            _      -> throwError "Can't read"
However if we get rid of the KnownType
machinery (which is what I'm planning to do at stage 2), we may have better luck with things being more uniform for the purposes of unlifting. Or not, I still want EvaluationResult
the way it works currently (and I'm also not sure how to get rid of this particular KnownType
instance, but I haven't thought about this yet).
What confused me there was this sentence: "And we can't typify that when Evaluator is parametric over uni, because it has to have the same uni that we're currently in", which made me think that the problem was that we had different uni parameters that we can't unify.
Another misuse of the parametric
term, sorry. Here is how it looks, when we have Evaluator Term m
:
newtype Evaluator f m = Evaluator
    { unEvaluator
        :: forall uni. Evaluable uni
        => DynamicBuiltinNameMeanings uni
        -> f TyName Name uni ()
        -> m (EvaluationResultDef uni)
    }
So we still have uni
, it's just bound inside Evaluator
. And when you're constructing a local Evaluator
to use it in constant application machinery for unlifting of constants, that inner universally quantified uni
, that you're expected to handle, can't unify with the particular global uni
.
This asks for something like Kripke semantics for evaluators. Instead of being parametric over environments or hardcoding environments we can say that any environment is fine as long as it's an extension of the original environment. But then it's not immediately clear whether it's possible to define the abstract machines in such an environment-extending way.
In fact, that seems to work. So we have
type uni <: uni' = forall a. uni a -> uni' a
newtype Evaluator f uni m = Evaluator
    { unEvaluator
        :: forall uni'. Evaluable uni'
        => uni <: uni'
        -> DynamicBuiltinNameMeanings uni'
        -> f TyName Name uni' ()
        -> m (EvaluationResultDef uni')
    }
class PrettyKnown a => KnownType a uni where
toTypeAst :: proxy a -> Type TyName uni ()
makeKnown :: a -> Term TyName Name uni ()
readKnown :: Monad m => Evaluator Term uni m -> Term TyName Name uni () -> ReflectT m a
and
evaluateInCkM :: forall uni a. EvaluateConstApp uni (CkM uni) a -> CkM uni (ConstAppResult uni a)
evaluateInCkM =
    runEvaluateConstApp $ Evaluator $ \emb meansExt term -> do
        let extend means = embedDynamicBuiltinNameMeanings emb means <> meansExt
        withReader extend $ [] |> term
So evaluators receive an additional argument that allows them to embed dynamic builtin names into a bigger universe than the one they are defined in. Never thought I would find some use for Kripke semantics in the real world! Works like a charm...
... except it doesn't. The type signature of embedDynamicBuiltinNameMeanings
is
embedDynamicBuiltinNameMeanings
    :: uni <: uni' -> DynamicBuiltinNameMeanings uni -> DynamicBuiltinNameMeanings uni'
Should be easy to embed dynamic builtin names in a bigger universe, right? No. It amounts to embedding TypeScheme
s:
embedTypeScheme :: uni <: uni' -> TypeScheme uni a r -> TypeScheme uni' a r
But TypeScheme
is defined as
data TypeScheme uni a r where
    TypeSchemeResult :: KnownType a uni => Proxy a -> TypeScheme uni a a
    TypeSchemeArrow :: KnownType a uni => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
    ...
and if we attempt to embed TypeScheme uni a r
into TypeScheme uni' a r
, we quickly realize that KnownType a uni'
doesn't follow from KnownType a uni
even though we know that uni'
contains all the same types as uni
(and possibly some additional ones). It surely doesn't hold definitionally, but even with some trickery we get
instance (Evaluable uni, KnownType a uni, euni ~ Extend b uni, Typeable b) =>
        KnownType (InExtended b uni a) euni where
    toTypeAst _ = shiftConstantsType $ toTypeAst @a @uni Proxy
    makeKnown (InExtended x) = shiftConstantsTerm $ makeKnown @a x
    readKnown (Evaluator eval) term = _
which allows us to apply makeKnown
to a term and inject the result into a bigger universe, but readKnown
is problematic, because we need to "uninject" from that bigger universe. Which most likely should work as per
For now I'm just going to pretend we only have to deal with the extended universe and throw an error whenever something goes wrong (we already throw a lot of errors (purely) in that code, so it's not a big deal) and then we'll see whether this can be improved in stage 2.
but this entire thing is getting too convoluted.
There are two things we can do here:
1. ...
2. get rid of the KnownType machinery, so that we can just directly inject the entire application into a bigger universe

I'll play with the current code a little bit more (because it's nice to get something working, even if it's hacky) and then move to 1 and if it doesn't solve everything, then I'll consider implementing 2 as well.
What happens if we split KnownType
into an encoding class and a decoding class? They should have opposite "variances" wrt the universe, and for TypeScheme
s we only need to be able to decode the arguments and encode the result, so that might make things easier.
Hm no, that doesn't help because we'd still be "growing" the universe in both cases.
I think your suggestion for what to do for readKnown
makes sense: you have a function read : a -> Maybe b
, and you want to turn it into a function read' : a' -> Maybe b
where a < a'
. It makes sense that you just say "in the set a' \ a it's Nothing", i.e. "throw an error".
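That "answer Nothing outside the original domain" idea can be sketched in miniature, with Either standing in for the extended type a' and hypothetical names throughout (none of this is the real codebase):

```haskell
module Main where

-- Hypothetical partial reader for a small type.
readInt :: String -> Maybe Int
readInt "zero" = Just 0
readInt "one"  = Just 1
readInt _      = Nothing

-- Extend a partial reader to a superset a' = Either a extra:
-- on the new part (a' \ a) the answer is always Nothing.
extendRead :: (a -> Maybe b) -> Either a extra -> Maybe b
extendRead f (Left x)  = f x
extendRead _ (Right _) = Nothing

main :: IO ()
main = do
  print (extendRead readInt (Left "one" :: Either String Bool)) -- Just 1
  print (extendRead readInt (Right True :: Either String Bool)) -- Nothing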
Hm no, that doesn't help because we'd still be "growing" the universe in both cases.
Exactly.
I think your suggestion for what to do for
readKnown
makes sense: you have a functionread : a -> Maybe b
, and you want to turn it into a functionread' : a' -> Maybe b
wherea < a'
.
Sure, but there are technical difficulties. With
type uni <: uni' = forall a. uni a -> uni' a
newtype Evaluator f uni m = Evaluator
    { unEvaluator
        :: forall uni'. Evaluable uni'
        => uni <: uni'
        -> DynamicBuiltinNameMeanings uni'
        -> f TyName Name uni' ()
        -> m (EvaluationResultDef uni')
    }
we have no chance of figuring out how to map TypeScheme uni a r
to TypeScheme uni' a r
, because both uni
and uni'
are just variables and we can't get KnownType a uni'
from KnownType a uni
. Therefore our best bet here is to make evaluators receive
(forall a r. TypeScheme uni a r -> TypeScheme uni' a r)
rather than uni <: uni'
. Then every time we extend the current universe, we also make sure that KnownType
holds for each of the arguments and the result in the extended universe. Hence we define
shiftTypeScheme
    :: forall b uni a r. (Evaluable uni, Typeable b)
    => TypeScheme uni a r -> TypeScheme (Extend b uni) a r
However we can't get KnownType a (Extend b uni)
from KnownType a uni
, because normally a
drives instance resolution for KnownType
and thus we'd get a terrible amount of overlapping instances. So we use the instance from my previous message:
instance (Evaluable uni, KnownType a uni, euni ~ Extend b uni, Typeable b) =>
        KnownType (InExtended b uni a) euni where
where InExtended
is a wrapper around a
. This solves the instance search problem, but now we get KnownType (InExtended b uni a) (Extend b uni)
from KnownType a uni
, i.e. we have also changed a
, so we can't directly define shiftTypeScheme
, because all types from the TypeScheme
are now wrapped in InExtended
. At this point I'm too tired to think of this nonsense, so here is the definition of shiftTypeScheme
:
shiftTypeScheme (TypeSchemeResult res) =
    unsafeCoerce $ TypeSchemeResult $ mapProxy @(InExtended b uni) res
shiftTypeScheme (TypeSchemeArrow arg schB) =
    unsafeCoerce $ TypeSchemeArrow (mapProxy @(InExtended b uni) arg) $ shiftTypeScheme schB
shiftTypeScheme (TypeSchemeAllType var schK) =
    TypeSchemeAllType var $ \_ -> shiftTypeScheme $ schK Proxy
i.e. we simply use unsafeCoerce
to ignore the newtype
wrapper problem.
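An aside on the newtype-wrapper part specifically: for plain wrapper erasure, Data.Coerce.coerce gives the same zero-cost cast as unsafeCoerce, but GHC checks representational equality. Whether it can replace unsafeCoerce in shiftTypeScheme depends on the roles of TypeScheme's parameters (GADT indices are nominal), so this is only a sketch with a stand-in wrapper, not the real InExtended:

```haskell
module Main where

import Data.Coerce (coerce)

-- Stand-in for a wrapper like InExtended; representationally equal to 'a'.
newtype Wrapped a = Wrapped { unWrapped :: a }

-- 'coerce' removes (or adds) the wrapper at zero cost, and unlike
-- unsafeCoerce GHC verifies that the representations really do match.
unwrapAll :: [Wrapped a] -> [a]
unwrapAll = coerce

-- It also works under function arrows and other representational contexts.
wrapFn :: (a -> b) -> Wrapped a -> Wrapped b
wrapFn = coerce

main :: IO ()
main = do
  print (unwrapAll [Wrapped (1 :: Int), Wrapped 2])      -- [1,2]
  print (unWrapped (wrapFn (+ (1 :: Int)) (Wrapped 41))) -- 42
```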
We also need the unshiftEvaluator
function in order to define readKnown
over InExtended b uni a
, which in turn requires unshiftTypeScheme
, which analogously to shiftTypeScheme
requires another newtype
wrapper and a KnownType
instance for it (dual to the one of InExtended
), which in turn requires the shiftEvaluator
function.
After everything is defined, we run the test
test :: EvaluationResult (Either Text ((Integer, ()), (Char, Bool)))
test = readKnownCk @_ @DefaultUni $ makeKnown ((5 :: Integer, ()), ('a', True))
and... it works! The result:
EvaluationSuccess (Right ((5,()),('a',True)))
This is awful.
Is the problem perhaps that our universe has too little structure? If we considered it more like a variable context in the way that e.g. James does in the metatheory, then we just want to do something like weakening, which should be more manageable because we can sensibly talk about "removing" things from our context and so on. This is very vague, though.
I'm still not sure I completely see the problem. Can't we write something like this:
restrict :: uni <: uni' -> Term TyName Name uni' a -> Maybe (Term TyName Name uni a)
which then we can then post-compose with readKnown
?
Is the problem perhaps that our universe has too little structure? If we considered it more like a variable context in the way that e.g. James does in the metatheory, then we just want to do something like weakening, which should be more manageable because we can sensibly talk about "removing" things from our context and so on. This is very vague, though.
It's a good thought, but I'm not sure what we can solve by this.
Can't we write something like this:
restrict :: uni <: uni' -> Term TyName Name uni' a -> Maybe (Term TyName Name uni a)
We do have a function of type
unshiftConstantsTerm :: Term tyname name (Extend b uni) ann -> Term tyname name uni ann
(with Nothing
being error
), however the problem is not with terms -- it's with dynamic builtin names. In order to unlift a term we sometimes need to add new types to the current universe and new dynamic builtin names (see an example in this issue). And these new dynamic builtin names are allowed to reference new types. We then pass those new builtins to the evaluator, which is expected to add them to the set of builtins that were already in scope. But they were defined in a different universe! So we need to inject old dynamic builtin names into the extended universe in order to merge them with the new builtins defined in the extended universe. And since a part of the meaning of each dynamic builtin name is a TypeScheme
parameterized by a universe, this injecting amounts to embedding a TypeScheme
in a bigger universe. Which is possible, but not trivial (see the previous post).
Okay, reading this more closely, a couple more thoughts:
Re the instance resolution issues, I wonder whether we'd be better off having KnownType
be a datatype rather than a class. We're already reifying the dictionary into our data constructors a lot! Then there isn't an overlapping instances issue - we just have to explicitly construct the dictionaries we want, rather than relying on instance resolution to do the right sequence of things. I think it might be more boilerplate but fewer hacks.
we have no chance of figuring out how to map TypeScheme uni a r to TypeScheme uni' a r, because both uni and uni' are just variables and we can't get KnownType a uni' from KnownType a uni.
But we have a uni <: uni'
. I think we should be able to define
embedTerm :: uni <: uni' -> Term TyName Name uni a -> Term TyName Name uni' a
restrictTerm :: uni <: uni' -> Term TyName Name uni' a -> Maybe (Term TyName Name uni a)
The second one is problematic, as you said. But I think we should be able to define <:
as a "natural retract" rather than just a natural transformation, something like type f <: g = (forall a . f a -> g a, forall a . g a -> Maybe (f a))
.
Doesn't this let us get from KnownType a uni
to KnownType a uni'
? Imagining we had explicit dictionaries (in quasi-Haskell):
extendKnownType :: uni <: uni' -> KnownType uni a -> KnownType uni' a
extendKnownType ev KnownType{makeKnown1, readKnown1} = KnownType{
    makeKnown a = embedTerm ev $ makeKnown1 a,
    readKnown eval t = case restrictTerm ev t of
        Nothing -> fail
        Just t  -> readKnown1 (restrictEval ev eval) t
}
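The "natural retract" version of <: can be packaged as a record (avoiding an impredicative tuple of foralls) and demonstrated on a toy Extend-style universe; all names below are illustrative, not the real codebase:

```haskell
{-# LANGUAGE GADTs, RankNTypes #-}
module Main where

-- A retraction between universes: an embedding plus a partial inverse.
data Retract uni uni' = Retract
  { embedU    :: forall a. uni a -> uni' a
  , restrictU :: forall a. uni' a -> Maybe (uni a)
  }

-- A toy universe with one type, and its extension with one extra type.
data U a where
  UInt :: U Int

data Extend b uni a where
  Original  :: uni a -> Extend b uni a
  Extension :: Extend b uni b

-- Extending a universe always yields a retract: the new tag has no preimage.
extendRetract :: Retract uni (Extend b uni)
extendRetract = Retract
  { embedU    = Original
  , restrictU = \u -> case u of
      Original x -> Just x
      Extension  -> Nothing
  }

main :: IO ()
main = do
  -- embedding then restricting an old tag succeeds
  print (case restrictU extendRetract (embedU extendRetract UInt) of
           Just UInt -> True
           Nothing   -> False)
  -- restricting the new tag fails, as expected
  print (case restrictU extendRetract (Extension :: Extend Bool U Bool) of
           Nothing -> True
           Just _  -> False)
```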
Re the instance resolution issues, I wonder whether we'd be better off having KnownType be a datatype rather than a class. We're already reifying the dictionary into our data constructors a lot! Then there isn't an overlapping instances issue - we just have to explicitly construct the dictionaries we want, rather than relying on instance resolution to do the right sequence of things. I think it might be more boilerplate but fewer hacks.
It's not a hack to use a newtype
to drive instance resolution. unsafeCoerce
is a terrible hack, but it might be fixable, I just wanted to get something done and didn't attempt to come up with a more sensible approach (and I have an idea). But giving up on having the type class has a lot of consequences: right now we use the class only to assign meanings to dynamic builtin names and internally in evaluation engines as a way to unlift values, but once unlifting is available to the user, the user surely won't like writing explicit dictionaries instead of just specifying the type of an expression. We can define a user-facing type class, but then we have two slightly distinct type classes that do nearly the same thing, which is weird.
The second one is problematic, as you said. But I think we should be able to define <: as a "natural retract" rather than just a natural transformation, something like type
f <: g = (forall a . f a -> g a, forall a . g a -> Maybe (f a))
.
That would work for terms, I agree.
Doesn't this let us get from
KnownType a uni
to KnownType a uni'
? Imagining we had explicit dictionaries (in quasi-Haskell):
extendKnownType :: uni <: uni' -> KnownType uni a -> KnownType uni' a
extendKnownType ev KnownType{makeKnown1, readKnown1} = KnownType{
    makeKnown a = embedTerm ev $ makeKnown1 a,
    readKnown eval t = case restrictTerm ev t of
        Nothing -> fail
        Just t  -> readKnown1 (restrictEval ev eval) t
}
restrictEval
amounts to restrictTypeScheme
, which amounts to restrictKnownType
, which amounts to embedEval
, which amounts to embedTypeScheme
, which amounts to embedKnownType
, which amounts to restrictEval
. I'm not sure this is a true loop, but it seems to be a loop to me. I break out of this loop by specifying how to embed
/restrict
eval
/TypeScheme
/KnownType
each time the universe gets extended, i.e. I use a non-looping bottom-up approach.
If that was the only problem, I would just go ahead and try to fix the unsafeCoerce
issue. Other things I've described above are basically fine. But we have a bigger task of figuring out whether we want to implicitly unlift values during evaluation or not. Right now we do that and we had to do that previously, but with extensible universes we have an alternative. Instead of defining TypeScheme
as
data TypeScheme uni a r where
    TypeSchemeResult :: KnownType a uni => Proxy a -> TypeScheme uni a a
    TypeSchemeArrow :: KnownType a uni => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
we can define it as
data TypeScheme uni a r where
    TypeSchemeResult :: uni `Includes` a => Proxy a -> TypeScheme uni a a
    TypeSchemeArrow :: uni `Includes` a => Proxy a -> TypeScheme uni b r -> TypeScheme uni (a -> b) r
I.e. instead of requiring each type to be liftable and unliftable, we can require each type to be in the current universe. Then the burden of turning a term into an unliftable one is on the user, but the user can just use functions or a type class that we'll provide (i.e. we probably can make exactly the same API as with internal unlifting). This allows us to disentangle evaluation from unlifting, which greatly simplifies the former (e.g. we no longer need to get KnownType a uni'
from KnownType
). But what about polymorphic builtin types like Sealed
, how do we put them into universes? First instantiate them, so that the type becomes monomorphic? But is the situation with internal unlifting any better in this regard? We also have some quite specific KnownType
instances like the ones for EvaluationResult
and OpaqueTerm
and it's not immediately clear how to handle them with the external unlifting approach. Do we maybe need implicit conversions between Haskell and PLC types (which is what we have currently)? I don't think so, but it's another thing to consider.
So it's just the case that the design space is huge and I'm exploring the options.
It's not a hack to use a newtype to drive instance resolution.
I don't know, I always find it a bit obscure. The nice thing about functions that take dictionaries is that it's back in the realm of Just Functions and Data. I don't look forward to trying to remember what the exciting IsExtended
instance does.
We can define a user-facing type class, but then we have two slightly distinct type classes that do nearly the same thing, which is weird.
I actually think this could be fine.
data KnownTypeDict uni = ...
class KnownType uni a where
    dict :: KnownTypeDict uni a
    -- other methods defined in terms of dict if we want
A bit of boilerplate yes, but if it makes the internals easier to understand I'll take it.
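A runnable miniature of that dictionary-plus-thin-class pattern; the method names are made up for the demo, not the real KnownType API:

```haskell
module Main where

-- The dictionary as plain data: Just Functions and Data.
data KnownDict a = KnownDict
  { makeK :: a -> String        -- stand-in for makeKnown
  , readK :: String -> Maybe a  -- stand-in for readKnown
  }

-- A thin user-facing class whose only method returns the dictionary.
class Known a where
  dict :: KnownDict a

instance Known Int where
  dict = KnownDict show parse
    where
      parse s = case reads s of
        [(n, "")] -> Just n
        _         -> Nothing

-- Internals take dictionaries explicitly; users go through the class.
roundtrip :: KnownDict a -> a -> Maybe a
roundtrip d = readK d . makeK d

main :: IO ()
main = print (roundtrip (dict :: KnownDict Int) 42) -- Just 42
```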
restrictEval amounts to restrictTypeScheme, which amounts to restrictKnownType, which amounts to embedEval, which amounts to embedTypeScheme, which amounts to embedKnownType, which amounts to restrictEval.
:scream_cat:
I can't hold this in my head, but I see what you're saying :/
I.e. instead of requiring each type to be liftable and unliftable we can require each type to be in the current universe.
Interesting. That does sound like it would simplify things a lot. I can see that quantified types are awkward. Maybe this is a place where thinking of the universe as a type context would be less weird - it could be sensible to add type variables to it. But this is all confusing me greatly tbh.
I don't know, I always find it a bit obscure. The nice thing about functions that take dictionaries is that it's back in the realm of Just Functions and Data. I don't look forward to trying to remember what the exciting
IsExtended
instance does.
I see your point and partially agree with it, but you'd still have to have a function that plays the role of the IsExtended
newtype
wrapper and its KnownType
instance.
On the other hand this:
metaCommaMeaning :: NameMeaning (Extend (a, b) uni)
metaCommaMeaning = NameMeaning sch def where
    sch =
        Proxy @(InExtended (a, b) uni a) `TypeSchemeArrow`
        Proxy @(InExtended (a, b) uni b) `TypeSchemeArrow`
        TypeSchemeResult (Proxy @(AsExtension uni (a, b)))
    def (InExtended x) (InExtended y) = AsExtension (x, y)
does look a little bit too much.
Maybe this is a place where thinking of the universe as a type context would be less weird - it could be sensible to add type variables to it.
Currently the all
binder does not change the indices of a TypeScheme
. You might think this is weird, but I would rather not mimic target language binders at the type-level in Haskell. That would work in Agda, but in Haskell it's just a huge pain. What we have currently seems to work pretty well, is convenient to use and is not too unsafe (you have to get indices of type variables right, but that's a really small cost to pay for not reflecting any binders at the type level).
But this is all confusing me greatly tbh.
Yeah, I've been working on this for quite a while and it's still intimidating.
I updated TypeScheme
to the version that we talked about in slack:
data TypeScheme uni as r where
    TypeSchemeResult :: KnownType a uni => Proxy a -> TypeScheme uni '[] a
    TypeSchemeArrow :: KnownType a uni => Proxy a -> TypeScheme uni as r -> TypeScheme uni (a ': as) r
type family FoldType as r where
    FoldType '[] r = r
    FoldType (a ': as) r = a -> FoldType as r
data NameMeaning uni = forall as r. NameMeaning (TypeScheme uni as r) (FoldType as r)
(except I decided to keep the r
type variable for several reasons. Hence as
is the list of types of arguments of a builtin). It doesn't really seem to affect anything, except it makes things slightly nicer.
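For illustration, here is a self-contained miniature of the same indexing idea, with the uni parameter dropped for brevity: a type-level list of argument types folded into a function type, plus a generic applicator (Scheme, Args, and applyScheme are illustrative names, not the real API):

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}
module Main where

import Data.Proxy (Proxy (..))

-- Argument types tracked as a type-level list, as in the new TypeScheme.
data Scheme as r where
  Result :: Scheme '[] r
  Arrow  :: Proxy a -> Scheme as r -> Scheme (a ': as) r

type family FoldType as r where
  FoldType '[]       r = r
  FoldType (a ': as) r = a -> FoldType as r

-- A heterogeneous argument list matching the scheme's index.
data Args as where
  Nil  :: Args '[]
  (:>) :: a -> Args as -> Args (a ': as)
infixr 5 :>

-- Saturate a function of type 'FoldType as r' with matching arguments.
applyScheme :: Scheme as r -> FoldType as r -> Args as -> r
applyScheme Result        r _           = r
applyScheme (Arrow _ sch) f (x :> args) = applyScheme sch (f x) args

-- The scheme of a two-argument integer builtin.
addScheme :: Scheme '[Int, Int] Int
addScheme = Arrow Proxy (Arrow Proxy Result)

main :: IO ()
main = print (applyScheme addScheme (+) (2 :> 3 :> Nil)) -- 5
```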
Renamed DynamicBuiltinNameMeaning
to just NameMeaning
. Probably will perform some more renaming later.
Solved the unsafeCoerce
problem by embedding the entire NameMeaning
instead of just its TypeScheme
. I.e. during the traversal of the TypeScheme
we also adapt the associated Haskell function by wrapping/unwrapping arguments/result. This piece of ASCII art does that:
shiftNameMeaning
    :: forall b uni. (Evaluable uni, Typeable b)
    => NameMeaning uni -> NameMeaning (Extend b uni)
shiftNameMeaning (NameMeaning sch x) =
    go sch $ \sch' inj -> NameMeaning sch' $ inj x where
        go
            :: TypeScheme uni as r
            -> (forall as' r'.
                   TypeScheme (Extend b uni) as' r' -> (FoldType as r -> FoldType as' r') -> c)
            -> c
        go (TypeSchemeResult _) k =
            k (TypeSchemeResult Proxy) $ InExtended @b
        go (TypeSchemeArrow _ schB) k =
            go schB $ \schB' inj -> k
                (TypeSchemeArrow Proxy schB')
                (\f -> inj . f . unInExtended @b)
        go (TypeSchemeAllType var schK) k =
            go (schK Proxy) $ \schB' inj -> k (TypeSchemeAllType var $ \_ -> schB') inj
So it seems we now have an MVP, because unlifting of products works and we do not unlift elements separately, but use Haskell's (,)
embedded into a PLC term instead.
So it seems we now have an MVP, because unlifting of products works and we do not unlift elements separately, but use Haskell's (,) embedded into a PLC term instead.
Amazing!
But what about polymorphic builtin types like
Sealed
, how do we put them into universes? First instantiate them, so that the type becomes monomorphic?
Well, we can do the usual thing:
data ExampleUni a where
    ExampleUniInt :: ExampleUni Int
    ExampleUniList :: ExampleUni a -> ExampleUni [a]
@michaelpj, note that with universes like that we can't simply use a list of types as an index of a data type like we do in Agda.
But defining instances over SomeOf ExampleUni
is not that trivial. I elaborated on the Everywhere
gadget above, but in our case we get:
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, ???)
    ...
What's in ???
? We are generic over constr
and thus we can't just hardcode whatever constraint might seem appropriate for a particular type class. E.g. for Eq
that would be forall a. Eq a => Eq [a]
and for an imaginary type class Length
, which allows getting the length of a data structure, it would be forall a. Length [a]
. So between distinct type classes and distinct data structures there are all kinds of possible behaviors and thus we need an additional type class. We can define
class HoldsAt f uni constr where
    bringAt :: Proxy f -> proxy constr -> uni a -> (constr (f a) => r) -> r
instance HoldsAt [] ExampleUni Eq where
    bringAt _ _ ExampleUniInt r = r
    bringAt p proxy (ExampleUniList uni) r = bringAt p proxy uni r
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, HoldsAt [] ExampleUni constr)
    bring _ ExampleUniInt r = r
    bring proxy (ExampleUniList uni) r = bringAt (Proxy @[]) proxy uni r
Here HoldsAt
specifies the behavior of a type class for a data structure in a particular universe, i.e. for every universe, for every type class, and for every data structure we need to define an instance. This is a cubic number of instances, which is clearly too much.
What we really want to say is something like that:
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, forall a. uni `Includes` a => constr [a])
    ...
which reads as "a constraint over a type from the ExampleUni
universe is satisfied whenever the Int
type satisfies it and for each type a
from the universe the [a]
type satisfies the constraint". However recall that we bring the instance explicitly into the scope using the bring
function and so
forall a. uni `Includes` a => constr [a]
is "morally" true, but we can't expect GHC to realize that constr [a]
holds just because a
is in some universe. So we need an additional type class:
class Through (uni :: * -> *) constr a where
    bringStruct :: proxy1 constr -> proxy2 (uni a) -> (constr a => c) -> c
instance (uni `Includes` a, Closed uni, uni `Everywhere` Eq) => Through uni Eq [a] where
    bringStruct pConstr (_ :: proxy2 (uni [a])) r = bring pConstr (knownUni @uni @a) r
Here we say that the Eq
instance for [a]
can be derived whenever a
is a type from some universe and all types from that universe implement Eq
(which is a pretty sensible thing to say). Here is another example:
instance (uni `Includes` a, uni `Includes` b, Closed uni, uni `Everywhere` Eq) =>
        Through uni Eq (a, b) where
    bringStruct pConstr (_ :: proxy2 (uni (a, b))) r =
        bring pConstr (knownUni @uni @a) $ bring pConstr (knownUni @uni @b) r
Now that we are generic over universes, we can return to the Closed
instance:
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, forall a. uni `Includes` a => Through uni constr [a])
    ...
which reads as before, but now we're not attempting to construct constr [a]
directly, but instead promise that an instance like that can be brought in scope anytime through the uni
universe. It almost works, but GHC is unhappy about the quantified constraint in the associated type family, which we can fix by making a class synonym:
class (forall a. uni `Includes` a => Through uni constr (f a)) => TypeCons uni constr f
instance (forall a. uni `Includes` a => Through uni constr (f a)) => TypeCons uni constr f
instance Closed ExampleUni where
    type ExampleUni `Everywhere` constr = (constr Int, TypeCons ExampleUni constr [])
    bring _ ExampleUniInt r = r
    bring pConstr (ExampleUniList (uni :: ExampleUni a)) r =
        reflect uni $ bringStruct pConstr (Proxy @(ExampleUni [a])) r
This finally works. We only need to show what that reflect
is:
class Reflect uni where
    reflect :: uni a -> (uni `Includes` a => c) -> c
It's needed because we can't use uni a
directly in a quantified constraint and have to make it an actual constraint, so we do that. The type class essentially enforces that every type from a universe is included in that universe.
Tests work:
instance Reflect ExampleUni where
    reflect ExampleUniInt r = r
    reflect (ExampleUniList uni) r = reflect uni r
instance ExampleUni `Includes` Int where
    knownUni = ExampleUniInt
instance ExampleUni `Includes` a => ExampleUni `Includes` [a] where
    knownUni = ExampleUniList knownUni
instance GEq ExampleUni where
    ExampleUniInt `geq` ExampleUniInt = Just Refl
    ExampleUniList uni1 `geq` ExampleUniList uni2 = do
        Refl <- uni1 `geq` uni2
        Just Refl
    _ `geq` _ = Nothing
test1 = SomeOf ExampleUniInt 5 == SomeOf ExampleUniInt 6
test2 = SomeOf (ExampleUniList ExampleUniInt) [5, 5] == SomeOf (ExampleUniList ExampleUniInt) [5, 5]
This probably can be generalized even more, because those two instances
instance (uni `Includes` a, Closed uni, uni `Everywhere` Eq) => Through uni Eq [a]
instance (uni `Includes` a, Closed uni, uni `Everywhere` Show) => Through uni Show [a]
are basically the same. In fact, we can be sloppy and define
instance ( uni `Includes` a
         , Closed uni, uni `Everywhere` constr
         , forall b. constr b => constr [b]
         ) => Through uni constr [a] where
    bringStruct pConstr (_ :: proxy2 (uni [a])) r = bring pConstr (knownUni @uni @a) r
and that will cover all already existing constraints over [a]
, including Eq
and Show
.
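To see the shape of this machinery in isolation, here is a self-contained, simplified sketch: base's TestEquality plays the role of GEq, and a hardcoded bringEq stands in for the Everywhere/Through machinery (names like U and bringEq are illustrative, not the real API):

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes #-}

import Data.Type.Equality (TestEquality (..), (:~:) (..))

-- A tiny closed universe: Int and lists of types from the universe.
data U a where
    UInt  :: U Int
    UList :: U a -> U [a]

-- A value tagged with its type code (the SomeOf from above).
data SomeOf uni = forall a. SomeOf (uni a) a

-- A hardcoded stand-in for 'bring': recover an Eq instance for any
-- type encoded in the universe.
bringEq :: U a -> (Eq a => r) -> r
bringEq UInt      r = r
bringEq (UList u) r = bringEq u r

-- base's TestEquality plays the role of GEq here.
instance TestEquality U where
    testEquality UInt UInt = Just Refl
    testEquality (UList u1) (UList u2) = do
        Refl <- testEquality u1 u2
        Just Refl
    testEquality _ _ = Nothing

-- Eq for tagged values: first check that the type codes match, then
-- bring the Eq instance for the common type into scope and compare.
instance Eq (SomeOf U) where
    SomeOf u1 x1 == SomeOf u2 x2 =
        case testEquality u1 u2 of
            Just Refl -> bringEq u1 (x1 == x2)
            Nothing   -> False
```

The two tests from above go through with this sketch as well.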
But what about polymorphic builtin types like
Sealed
, how do we put them into universes?
As shown above the problems with polymorphic builtin types are mostly technical and related to getting constraints right. However for newtype
wrappers like Sealed
we can just ignore these problems. Whatever holds for any type a
from a universe, also holds for Sealed a
, because it's just a wrapper around a
. And therefore we can just ignore Sealed
when it comes to Everywhere
and friends.
We also have some quite specific KnownType
instances, like the ones for EvaluationResult
and OpaqueTerm
, and it's not immediately clear how to handle them with the external unlifting approach.
Here is how:
data TypeGround uni a where
    TypeGroundValue  :: uni a -> TypeGround uni a
    TypeGroundResult :: uni a -> TypeGround uni (EvaluationResult a)
    TypeGroundTerm   :: TypeGround uni (OpaqueTerm uni text uniq)

data TypeScheme uni as r where
    TypeSchemeResult :: TypeGround uni a -> TypeScheme uni '[] a
    TypeSchemeArrow  :: TypeGround uni a -> TypeScheme uni as r -> TypeScheme uni (a ': as) r
    TypeSchemeAllType
        :: (KnownSymbol text, KnownNat uniq)
        => Proxy '(text, uniq)
        -> (forall ot. ot ~ OpaqueTerm uni text uniq => TypeGround uni ot -> TypeScheme uni as r)
        -> TypeScheme uni as r
So we just hardcode those special types into ground values of TypeSchemes
. Some additional gymnastics is needed to get shifting/unshifting. Basically, we have to change TypeGroundTerm
to
TypeGroundTerm
    :: (forall a. uni' a -> uni a)
    -> (forall a. uni a -> uni' a)
    -> TypeGround uni (OpaqueTerm uni' text uniq)
and then we can write things like
unextend :: Extend b uni a -> uni a
unextend (Original uni) = uni
unextend Extension = error "Can't unshift an 'Extension'"
shiftTypeGround :: TypeGround uni a -> TypeGround (Extend b uni) a
shiftTypeGround (TypeGroundValue uni) = TypeGroundValue $ Original uni
shiftTypeGround (TypeGroundResult uni) = TypeGroundResult $ Original uni
shiftTypeGround (TypeGroundTerm inj ej) = TypeGroundTerm (Original . inj) (ej . unextend)
(I'll replace error
by Nothing
at some point).
And that plays well with the constant application machinery:
makeBuiltin :: TypeGround uni a -> a -> ConstAppResult uni (Term TyName Name uni ())
makeBuiltin (TypeGroundValue uni) x = pure . Constant () $ SomeOf uni x
makeBuiltin (TypeGroundResult _ ) EvaluationFailure = ConstAppFailure
makeBuiltin (TypeGroundResult uni) (EvaluationSuccess x) = pure . Constant () $ SomeOf uni x
makeBuiltin (TypeGroundTerm inj _) (OpaqueTerm term) = pure $ substConstantsTerm inj term
readBuiltin
:: GEq uni => TypeGround uni a -> Term TyName Name uni () -> ConstAppResult uni a
readBuiltin (TypeGroundTerm _ ej) value = pure . OpaqueTerm $ substConstantsTerm ej value
readBuiltin (TypeGroundResult _ ) Error{} = pure EvaluationFailure
readBuiltin (TypeGroundResult uni) value = EvaluationSuccess <$> unliftConstant uni value
readBuiltin (TypeGroundValue uni) value = unliftConstant uni value
The only tricky part here is handling embedding/unembedding opaque terms into/from universes (and I'm not confident I got it right). We no longer need to be parametric over some Monad
m
, because we no longer pass evaluators around and thus do not need to account for the fact that evaluators can be monadic. I.e. this design makes the constant application machinery perfectly pure.
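A stripped-down, self-contained model of this pure round trip may make the claim concrete. Here TypeGround is cut down to its Value/Result constructors, a toy Term type stands in for the real AST, and Maybe stands in for ConstAppResult; all names are illustrative:

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}

import Data.Type.Equality (TestEquality (..), (:~:) (..))

-- A tiny universe; TestEquality plays the role of GEq.
data U a where
    UInt  :: U Integer
    UBool :: U Bool

instance TestEquality U where
    testEquality UInt  UInt  = Just Refl
    testEquality UBool UBool = Just Refl
    testEquality _     _     = Nothing

data Result a = Failure | Success a deriving (Eq, Show)

-- TypeGround cut down to its Value/Result constructors.
data Ground a where
    GroundValue  :: U a -> Ground a
    GroundResult :: U a -> Ground (Result a)

-- A toy term type: constants and an error term.
data Term = forall a. Constant (U a) a | ErrorTerm

-- Both directions are plain pure functions: no evaluator, no monad.
makeBuiltin :: Ground a -> a -> Maybe Term
makeBuiltin (GroundValue uni)  x           = Just (Constant uni x)
makeBuiltin (GroundResult _)   Failure     = Nothing  -- the ConstAppFailure case
makeBuiltin (GroundResult uni) (Success x) = Just (Constant uni x)

-- Unlift a constant after checking that its type code matches.
unliftConstant :: U a -> Term -> Maybe a
unliftConstant uni (Constant uni' x) = do
    Refl <- testEquality uni' uni
    Just x
unliftConstant _ ErrorTerm = Nothing

readBuiltin :: Ground a -> Term -> Maybe a
readBuiltin (GroundResult _)   ErrorTerm = Just Failure
readBuiltin (GroundResult uni) term      = Success <$> unliftConstant uni term
readBuiltin (GroundValue uni)  term      = unliftConstant uni term
```

A round trip through makeBuiltin and readBuiltin is the identity, and an Error term reads back as EvaluationFailure, mirroring the real machinery above.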
In fact, we do not even need to have a closed data type (TypeGround
) and instead can just have a type class as before, but this time without passing an evaluator to readKnown
, because it's not needed to implement the necessary instances. But I'll stick to the data type for now, we can always add a type class on top of it to get a nice API where the term-level part of a TypeScheme
is derived from its type arguments.
I also realized that the previous approach with a single type class used for both internal and external purposes was a dangerous thing. It would be fine for someone to provide their own KnownType
instance (since the user is given access to this class), but then they can affect how evaluation of a Plutus Core term proceeds, which is a security hole. With a single internal data type not exported to the user, there is no such pitfall.
I haven't yet got something working, but things seem to make sense.
As shown above the problems with polymorphic builtin types are mostly technical and related to getting constraints right. However for newtype wrappers like Sealed
we can just ignore these problems. Whatever holds for any type a
from a universe, also holds for Sealed a
, because it's just a wrapper around a
. And therefore we can just ignore Sealed
when it comes to Everywhere
and friends.
Haven't finished my thought. So basically we can get away with hardcoding Sealed
without implementing a full machinery for handling polymorphic types. And they are not required for unlifting as we anyway only need to unlift monomorphic values (right?) (like [(Bool, Integer)]
). So I think to get something done it's best to avoid implementing full support for polymorphic values and to cut some corners here and there for the time being.
Dealing with polymorphic types seems to be more or less orthogonal to the core of the approach, so we can add polymorphism later.
I.e. instead of requiring each type to be liftable and unliftable we can require each type to be in the current universe. Then the burden of turning a term into an unliftable one is on the user, but the user can just use functions or a type class that we'll provide (i.e. we probably can make exactly the same API as with internal unlifting).
Unfortunately, this is troubling. Consider unlifting of products: we have PLC's pair integer bool
and need to get Haskell's (Integer, Bool)
from it. With the unlifting that we have currently, the entire structure is unlifted at once: we do not need to require either Integer
or Bool
to be in the current universe. We just extend the universe with a single type, (Integer, Bool)
, and unlift directly into it. But with explicit unlifting we'd have to unlift Integer
and Bool
separately and then combine the resulting values together into a tuple. But this means that we have to add not only (Integer, Bool)
to the universe, but also both Integer
and Bool
.
And from what I see this is not only a problem with unlifting into specialized polymorphic types. For a type of trees with integers in the leaves, we'll have to either require integers to be in the current universe or hardcode the logic of "unlift an integer and immediately wrap it as a leaf, so that there is no need to add integers to the current universe". This monomorphic case might be solvable, but the polymorphic one does look irritating.
And note that adding several new types to the current universe is not something that can be done easily. In (a, b)
the types a and b are independent of each other, and so we want to unlift them separately, but that means that we extend the universe in two distinct incompatible ways: with the a
type to unlift into a
and with the b
type to unlift into b
. And then we need to unify those two differently extended universes, which is hard technically. Constraints might sometimes help in cases like that, but see the problems with Includes
and Extend
described above.
In general, whenever the user wants to unlift a data structure, they do not really care about pieces of the data structure. "Just do the right thing".
Except, as described above, there are now two ways to embed Haskell values: deeply into a Scott-encoded data structure and shallowly as a Constant
wrapper around a Haskell value. And therefore we can unlift from a deeply or shallowly embedded value. So the user might actually want to specify which way to unlift values to use in each particular situation.
Also, how should we unlift, say, tuples with manual unlifting? Is it
unlift . PLC.mapTuple unlift unlift
or
Haskell.mapTuple unlift unlift . unlift
or something else? The more I think about manual unlifting, the less I like the idea.
I started implementing manual unlifting just to check it out, but then...
But with explicit unlifting we'd have to unlift Integer
and Bool
separately and then combine the resulting values together into a tuple. But this means that we have to add not only (Integer, Bool)
to the universe, but also both Integer
and Bool
.
... for unlifting ([(Integer, Bool)], ByteString)
we need to add
Integer
Bool
(Integer, Bool)
[(Integer, Bool)]
([(Integer, Bool)], ByteString)
to the universe. Automatically. While I believe this is doable, I don't think doing this is a good idea. I should just give up on the idea of manual unlifting in the form that I've been thinking of it and consider other opportunities.
The current plan is to implement an MVP (see the latest comment about it after which I started doing research on whether there is a better option). The PR is #1596.
Another problem we have is that with this machinery in place we can embed values in a shallow way (just wrap a Haskell value in the Pure
constructor) and a deep way (convert a Haskell list to a PLC list, for example -- we already do that), and so we need a way to distinguish between the two ways to embed values.
I also realized that the previous approach with a single type class used for both internal and external purposes was a dangerous thing. It would be fine for someone to provide their own KnownType instance (since the user is given access to this class), but then they can affect how evaluation of a Plutus Core term proceeds, which is a security hole. With a single internal data type not exported to the user, there is no such pitfall.
There seems to be a surprisingly convenient way of solving these problems: we extend the KnownType
class with a type family that returns a Visibility
:
data Visibility
    = Internal
    | External

class PrettyKnown a => KnownType a uni where
    type VisibilityOf a :: Visibility
    type VisibilityOf a = 'External
    <...>
and require in the constant application machinery that the Visibility
of a
must be Internal
(so that the user can't extend evaluation with bogus rules). Then every deep conversion between Haskell and PLC must be wrapped in the Deep
newtype
and every shallow embedding of Haskell into PLC must be wrapped in Shallow
. Looks like this:
newtype Shallow a = Shallow
    { unShallow :: a
    }

newtype Deep a = Deep
    { unDeep :: a
    }

instance (Evaluable uni, uni `Includes` a, PrettyKnown a) => KnownType (Shallow a) uni where
    type VisibilityOf (Shallow a) = 'Internal

    toTypeAst _ = constantType @a Proxy ()

    makeKnown (Shallow x) = constantTerm () x

    readKnown (Evaluator eval) term = do
        res <- makeRightReflectT $ eval id mempty term
        case extractValue res of
            Just x -> pure $ Shallow x
            _      -> throwError "Not an integer-encoded Char"

-- The @VisibilityOf a ~ 'External@ constraint is not really needed, I suppose,
-- but semantically it seems better to have it.
instance (KnownType a uni, VisibilityOf a ~ 'External) => KnownType (Deep a) uni where
    type VisibilityOf (Deep a) = 'Internal

    toTypeAst = <...>
    makeKnown = makeKnown . unDeep
    readKnown eval = fmap Deep . readKnown eval
So we have a single Shallow
instance once and for all: any Haskell value can be embedded into PLC as long as its type is in the universe. While the deep embedding is specific to a particular type, e.g.:
instance Evaluable uni => KnownType Bool uni where
    toTypeAst _ = bool
    makeKnown b = if b then true else false
    readKnown (Evaluator eval) term = <...>
Notice that this VisibilityOf
machinery is invisible to people writing "unlifting" instances like this one.
Name meanings then look like this:
typedLessThanInteger
    :: Evaluable uni => TypedBuiltinName uni '[Shallow Integer, Shallow Integer] (Deep Bool)
Here we explicitly specify how to embed values: shallowly or deeply. Now we don't need to guess how to unlift a type -- it's all in the type signatures.
And at the same time for unlifting Deep
is the default and only Shallow
needs to be mentioned explicitly, e.g.:
test :: EvaluationResult (Either Text ((Shallow Integer, ()), (Char, Bool)))
test = readKnownCk @_ @DefaultUni $ makeKnown ((Shallow (5 :: Integer), ()), ('a', True))
However if we ignore the security argument then shallow and deep unlifting can be distinguished by whether a type is wrapped in Shallow
or not. If not, then it's a deep embedding. And no Visibility
is needed (as well as the type family).
I've tried the former approach, it seems to work well, but I'll implement the latter, because it's easy to extend it to the former and I don't want to complicate right now what is already complicated.
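A minimal self-contained model of the shallow/deep distinction may help here. TmTrue/TmFalse stand in for the Scott-encoded booleans, and all the names below are illustrative rather than the real API:

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}

data U a where
    UInt :: U Integer

-- A toy term type: a shallow constant node, plus stand-ins for the
-- deeply (Scott-) encoded booleans.
data Term = forall a. Constant (U a) a | TmTrue | TmFalse

newtype Shallow a = Shallow { unShallow :: a }
newtype Deep    a = Deep    { unDeep    :: a }

class Known a where
    makeKnown :: a -> Term

-- Shallow: wrap the Haskell value in a Constant node directly.
instance Known (Shallow Integer) where
    makeKnown (Shallow n) = Constant UInt n

-- Deep: convert the Haskell value into a term of the object language.
instance Known (Deep Bool) where
    makeKnown (Deep b) = if b then TmTrue else TmFalse

isConstant :: Term -> Bool
isConstant Constant{} = True
isConstant _          = False
```

The wrapper chosen at the call site decides which embedding is used, which is exactly the "it's all in the type signatures" point above.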
I was bored, so I started thinking about this again. The MVP that I was going to implement was solving two problems at once: the extensible builtins problem and the unlifting one. But it occurred to me that there is a sweet spot between "no unlifting at all" (but then what do we do about types like Bool
-- just adding Bool
to the default universe?) and "any unlifting during evaluation" (which requires adding new types to the universe during evaluation, which is what causes all the problems), and that sweet spot is "only allow unlifting that doesn't require adding new types to the universe" (i.e. this is what we have now for the hardcoded universe). This way we still can lift and unlift things like ()
and Bool
(even though it's a bit hacky), and all the internal machinery (KnownType
instances for EvaluationResult
, OpaqueTerm
, etc) works the same way it works currently (which is good).
However it's not clear whether we can untangle evaluation from unlifting. I tried that previously and failed, but there might be alternative approaches. In any case, first adding extensible builtins without breaking anything and only then considering various approaches to the harder problem of unlifting seems like a good plan. It's conservative and a huge step forward at the same time.
that sweet spot is "only allow unlifting that doesn't require adding new types to the universe"
After several hours of thinking it struck me that we can extend universes in a different way. Instead of using
data Extend b uni a where
    Extension :: Extend b uni b
    Original  :: uni a -> Extend b uni a
we can make universes higher-order: uni :: (* -> *) -> * -> *
, where the first argument of uni
is an extension of it (i.e. each uni
has a constructor with the following type: ext a -> uni ext a
) and the last one remains the same. This way we can define all built-in names in the forall ext. uni ext
universe and that will allow us later to extend this universe by instantiating ext
to some particular subuniverse without disturbing any built-in names that are already in scope, because their types explicitly say that any ext
is fine.
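A sketch of what such a higher-order universe could look like (all names here are hypothetical):

```haskell
{-# LANGUAGE GADTs #-}

-- A universe parameterised by its own extension:
-- the UExt constructor embeds an arbitrary extension 'ext'.
data U ext a where
    UInt :: U ext Integer
    UExt :: ext a -> U ext a

-- A builtin's type code written against any 'ext' keeps working
-- no matter how 'ext' is later instantiated.
intCode :: U ext Integer
intCode = UInt

-- A one-type extension, picked later without disturbing intCode.
data BoolExt a where
    BoolExt :: BoolExt Bool

boolCode :: U BoolExt Bool
boolCode = UExt BoolExt

-- An interpretation over the extended universe.
name :: U BoolExt a -> String
name UInt           = "integer"
name (UExt BoolExt) = "bool"
```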
This means that we will be able to add new types to the universe during unlifting, but we won't be able to add new built-in names that reference those types. So we still have a restriction, but it's relaxed now and unlifting of things like ()
and Bool
becomes much nicer.
Having that ext
parameter is also good, because we can try to make it unordered (the unordered-effects
blog post that I wrote is a direct consequence of this research on dynamic builtins), since we always extend universes with a single type at a time (no type constructors, etc). Universes in general encode an infinite number of types (e.g. Integer
, [Integer]
, [[Integer]]
, ...), so trying to make a whole universe unordered is completely hopeless.
Anyway, we should start with first-order universes to get extensible builtins. Making universes higher-order later shouldn't pose any problems.
By the way, we currently do support dynamic extension of the set of built-in names during unlifting, but we don't actually need it. We needed it earlier and we will need it in future, but right now this is a feature that is not used anywhere, I think.
This is a sub-issue of #487 that is specifically about dynamic builtin types and their future implementation, which is meant to allow us to embed arbitrary Haskell values into the AST of Plutus Core in a well-controlled manner.
We want to extend the type of Term
with two additional variables: dyn
and tydyn
.
The idea to extend Term
with the dyn
type variable is a really old one. Quoting a YouTrack thread (CGP-374):
The idea to add tydyn
is a bit younger. Quoting an e-mail thread:
However if we want to be generic over dynamic builtins and package them together freely and package those packages together as well (i.e. have extensible dynamic builtins), then we can't have a single Universe
. Do we need to solve an existential version of the expression problem? Sounds fun.