Closed dougalm closed 1 year ago
Is it worth code-genning code that writes to an in-memory string buffer rather than literally prints? Pro: more flexible on the user side. Con: Presumably requires O(1) vector push. Or would it be easy to refactor to that after?
Changing away from AtomSubstVal will presumably be pretty big, since those substitutions are used all over place, aren't they? There are presumably also other substitution types, like the one in the inliner (which carries expressions, that either have or have not been inlined into themselves). And I want to introduce a "suspended lambda" constructor to substitutions that can be carried by the simplifier, to make its performance not be quadratic. We should make sure the new plan is generic enough to allow all of those use-cases.
+1 to the small changes. Though I've also been reading about the uniqueness type system in Futhark, and thinking that maybe we want to borrow from there. Such borrowing may have interesting consequences for references and types thereof.
Is it worth code-genning code that writes to an in-memory string buffer rather than literally prints?
Yes, I agree that would be better but we need the O(1) stack as you say. I'm planning to directly write for now but I think it'll be easy enough to change the implementation underneath once we have a stack.
Changing away from AtomSubstVal will presumably be pretty big, since those substitutions are used all over place, aren't they?
Yes. It's a big change, but a very mechanical one at least. Good point about making sure that we keep enough flexibility for these other use cases.
Superseded by #1209
The basic approach seems to work but I ran into two big problems with this patch. It looks like it's going to be a long slog to work around them and get the tests passing. Instead, I'm going to fix them on main and then apply this patch on top of that.
Problem 1: printing
Printing tables from the Haskell runtime has always been a mess. We rely on our half-baked interpreter to figure out the index set sizes and we call into Imp to calculate offsets and interpret the low-level representation of data. But with this patch, calling back into Imp becomes harder because Imp doesn't handle the rich types that we want to be able to print. Getting the printer working as before will mean reimplementing a lot of the logic in Imp, including Algebra!
But there's a better solution: codegen printing and print from within the Dex runtime instead of Haskell. That also gives us debug-printing for free, which is a badly missing feature. Eventually we want to use
Show
or similar so that we can have user-defined formatting. But I propose that initially we use a built-in code generator that can print any Dex value (with fallback strings like<function>
for some cases). I don't think it'll even be especially hard to implement. It's just writing the equivalent of thePretty (Atom r n)
instance using theBuilder
monad, on top of a couple of primitives likeprintStringLit :: String -> Builder CoreIR n ()
andprintBaseTypeValue :: String -> Atom n -> Builder CoreIR n ()
.Doing this will cause one regression: we'll have to print lambda terms like
<function: Nat -> Nat>
rather than\x:Nat. x
. But that's how it works in most languages anyway. If you want to see the internals of a function for debugging, there's always%passes
.Printing was the main reason we've had to keep Interpreter and Serialize around. They've always been buggy and hard to maintain. Once we do printing from the Dex runtime we can delete them both! (This will regress one feature: we won't be able to have table literals with non-Fin index sets. While it's a nice feature to have, I don't think it's worth the cost of maintaining the interpreter. You can always achieve the same thing with a runtime-checked coercion.)
2. Problem 2: ProjectElt
We introduced
ProjectElt
to allow projections to appear in types. That lets us gives a type to the second component of a dependent pair without unpacking the whole pair into two binders at once. We made it n-ary and made its rhs aVar
instead of anAtom
in order to syntactically enforce normalization -- we don't want something unreduced likefst (1,2)
to appear in a type. But there are two problems withProjectElt
as it is today:It's too inflexible to allow new language features. In the current patch, we needed a way to project
SimpInCore
atoms, but they're not variables so they can't appear inProjectElt
as it stands. So I had to giveSimpInCore
its own list of projections. Actually, it needed two of them! (One before the coercion and one after.) When we allow application in types, which we want for associated types, we'll have the same problem.Naive substitution can denormalize so we need to re-normalize when we substitute. But our current substitution framework doesn't carry the environment we need to query types. So we're forced to do a purely syntactic normalization. We've scraped by so far but with these latest features it's no longer possible. Our
RepVal
atom constructor that we use in Imp translation had to have its own list of projections because there wasn't enough information to unwrap the newtype.Eventually I think we want to have a different ANF constraint for atoms in type position. ANF makes it easy to share work but hard to test equality. But for types we want the opposite: "Anti-ANF" -- no decls but arbitrarily nested expressions -- would forbid work-sharing but make equality checking easy.
Before we go that far I think we can manage if we (1) make the RHS of
ProjectElt
anAtom
(andProjectElt
can then be unary) and (2) implement substitution-with-atoms in a way that lets us carry the environment. Purely-renaming substitution can still be done syntactically. So instead of the parametricSubstE v
withv = Name
andv = AtomSubstVal
, we'll just have two classes,SubstE
andAtomSubstE
, where only the latter requires an environment.Other small changes that will make this patch easier to land
AppTopFun (TopFunName n) [Atom r n]
case toAtom
so that we can avoid havingApp
and function-typedVar
in SimpIR.HeapType
type andHeapVal
value for the heap parameter instead of havingh:Type
(leftover from ST monad inspiration) so that SimpIR doesn't have to have variables of typeType
. (This patch already does this but we can separate it out into an earlier patch.)