Use a binding library to implement scopes in Eval.hs

jwiegley commented 6 years ago

Right now we do name lookups in a hash maps, which means that for x: x + x + x, there will be three lookups of the same argument. With unbound-generics we can perform the substitution by using a subst function on expressions, which will probably mean introducing two phases of evaluation: one that uses an F-algebra NExprF NExpr -> NExpr simply to reduce the underlying lambda calculus by performing binds and applications, and then the current evaluator for performing the remaining computations.

jwiegley commented 6 years ago

A nice introductory article to the concept: https://byorgey.wordpress.com/2011/03/28/binders-unbound/

The reimplementation of that idea using GHC Generics: http://hackage.haskell.org/package/unbound-generics

Pinging @lambdageek, whose presence is missed when Theo and I are discussing lambda calculii. :)

lambdageek commented 6 years ago

I think you can make an instance like
```
instance Alpha r => Alpha (NExprF r)
```
I don't have an Alpha instance for Fix, so you'll need an orphan (possibly a bit intricate because of the higher kind).
You will need to decide what your variables stand for - part of the design of Unbound is that you have to specify a phantom type to Unbound.Generics.LocallyNameless.Name a:
```
type Var = Unbound.Generics.LocallyNameless.Name (NExprF NExpr)
```
That's maybe unsatisfying. I've never tried Var a = Name (NExprF a), so if that seems more appealing, I can't guarantee it'll work but I'd love to hear if you can make it work.
General advice: Unbound has a distinction between terms and patterns (in Bind p t, p is a pattern and contains structures where occurrences of Name represent binding positions for variables; t is a term - occurrences of Name represent variable uses) - in general I would make most of my AST structures as patterns (so use Rebind, Embed, Shift and Rec generously) - it's somehow a better structure. Possibly because unbind :: Fresh m => Bind p t -> m (p, t) is effectful, whereas taking patterns apart is pure.
Bug reports welcome! I intend to keep working on unbound-generics.

lambdageek commented 6 years ago

Also if your binding situation is simple, I wouldn't discount bound. It's pretty great. Especially if you love polymorphic recursion and you only have a single sort of variable.

In a lot of ways unbound-generics is more appropriate for languages with exotic binding structure:

either multiple different sorts of variables, or
variables that stand for different syntactic categories than where they can occur, or
variables that are more like abstract symbols which are subject to alpha conversion but don't necessarily have a notion of substitution. (Alpha-renaming is mandatory and free for Unbound Names, but substitution is optional!)
Really involved scoping rules (the original motivation for Weirich and Yorgey was telescopes in dependent type systems).

jwiegley commented 6 years ago

@lambdageek Is there any performance difference between them? Will bound let us easily handle recursive bindings?

There are also cases in Nix where you can take an "attribute set" (basically just a Map), and convert the value into a binding scope using a with keyword. I'm wondering how to do that with these binding libraries. Here's an example:

let f = pkgs: 
    let y = 10; 
    in
        with pkgs; 
            [ y x ]; 
in
    f { x = 100; }

What Nix requires here is that the attribute set that's passed as an argument to f, and becomes the value bound to pkgs for that call, is non-strictly transformed into the dynamic scope made visible by with only if lookup fails, meaning that when the lookup of y is done, pkgs is left unevaluated, but when the lookup for x is performed and fails, then and only then does pkgs get forced in order to establish a scope in which x can be resolved.

lambdageek commented 6 years ago

@jwiegley

bound pays more attention to performance.

Unbound becomes quadratic in the binding depth if you repeatedly unbind terms and descened under binders: https://github.com/lambdageek/unbound-generics/blob/master/benchmarks/BenchLam.hs

The solution is to implement explicit substitutions (which is essentially bound's trick), but I never did that work and it would require explicit cooperation from user ASTs.
I don't know how recursive bindings work in bound. I think you have to do something adhoc.
with is a problem. It's basically like open in Standard ML or import Foo; in Haskell, but worse because of your crazy dynamic scoping semantics. As an author of a variable binding package I can tell you authoritatively that scoping constructs that bind some names but don't tell you which ones suuuuuuuuuuuuuck.

What Leroy mentions in "A modular module system" is that they do a separate "scoping" pass before they move from their surface syntax to their workaday AST in order to figure out what names mean.

Here's an outline where the internal AST only has to work with statically determined scoping (ie import Foo (a,b,..., z) instead of import Foo;)
```
module AST where
   -- uses unbound-generics for variables and binding scopes

  -- pattern
  data Decl = ... 
     | FunDecl (Name Exp) (Embed Exp) -- ^  fun f = e
     | SimpleImport (Embed ModuleSpec) ![Name Exp]
       -- ^  import Foo (a, b, ... z)
  -- term
  data Module = Module (Bind (Rec [Decl]) ())
      -- ^ module where all the declarations are mutually recursive 
```
```
module SurfaceSyntax where
 -- uses Text for variables and leaves binding scope implicit
  data Decl = ...
     | FunDecl Text Exp
     | SimpleImport ...
     | EvilImport ModuleSpec  -- ^ import Foo (*)
  data Module = Module [Decl]
```
You basically go parse :: Text -> MyParsingMonad SurfaceSyntax.Module and resolveNames :: SurfaceSyntax.Module -> P m AST.Module where P is something like
```
  type ImportResult = Either ImportFileError AST.Module
  type P m r = Pipes.Proxy ImportRequest ImportResult () Pipes.X m r
```
In other words: while you're resolving the names in some surface syntax, you can pause with some ImportRequest and wait to be fed an AST (corresponding to the content of that import) so you can pull out the names that it will provide and then resume resolving. (This example is from https://github.com/lambdageek/insomnia-dev/blob/master/src/Insomnia/SurfaceSyntax/ToastMonad.hs#L112)

In the case of with it sounds like you will actually need to suspend all the way down in evaluation just to figure out what your syntax is.

jwiegley commented 6 years ago

I’m no longer convinced that this issue is necessary, for two reason: We want NExpr to be a “raw” expression tree, for purposes of pretty printing, where it makes no sense to do name resolution. And due to the dynamic nature of Nix, some name lookups will have to be deferred until evaluation anyway. Bound would give us faster substitution when calling into lambdas, but we have the case of attribute sets to deal with, where we can’t use Nats for indexing.

So I’m closing until a compelling reason arises.

jwiegley commented 6 years ago

Nope, it's necessary, otherwise we re-evaluate expensive function bodies with each new set of arguments.

Ericson2314 commented 5 years ago

I saw https://github.com/clash-lang/clash-compiler/pull/361 which is disappointing. I desperately want one of these libraries to win, and end the duplication we have today.

christiaanb commented 5 years ago

@Ericson2314 Clash might've been one of the worst use-cases with regards to performance of unbound because it does sooooo many traversals of an expression, going under and over binders all the time.

Given that all of my options to improve Clash' run-time situation basically meant I had to (probably) change nearly every module in the compiler, I went with the option that seemed to be the safest performance-wise: implement the system GHC uses. My particular use case was:

A LOT of traversals
Relatively few substitutions (compared to the number of traversals)
Very few alpha equivalence
Names and locations of the the user source code need to be preserved

Doing my own implementation has been causing loads of shadowing/capture/free variable issues though! And implementing it has taken me over 5 weeks now; and even though it passes the test suite, I'm quite sure there are still bugs.

So if you're not constantly/repeatedly traversing deeply-nested binding structures, I don't think there's much of a performance penalty when using unbound-generics; I would definitely recommend it.

haskell-nix / hnix

Use a binding library to implement scopes in Eval.hs #106