Open jwiegley opened 6 years ago
A nice introductory article to the concept: https://byorgey.wordpress.com/2011/03/28/binders-unbound/
The reimplementation of that idea using GHC Generics: http://hackage.haskell.org/package/unbound-generics
Pinging @lambdageek, whose presence is missed when Theo and I are discussing lambda calculii. :)
I think you can make an instance like
instance Alpha r => Alpha (NExprF r)
I don't have an Alpha
instance for Fix
, so you'll need an orphan (possibly a bit intricate because of the higher kind).
You will need to decide what your variables stand for - part of the design of Unbound is that you have to specify a phantom type to Unbound.Generics.LocallyNameless.Name a
:
type Var = Unbound.Generics.LocallyNameless.Name (NExprF NExpr)
That's maybe unsatisfying. I've never tried Var a = Name (NExprF a)
, so if that seems more appealing, I can't guarantee it'll work but I'd love to hear if you can make it work.
General advice: Unbound has a distinction between terms and patterns (in Bind p t
, p
is a pattern and contains structures where occurrences of Name
represent binding positions for variables; t
is a term - occurrences of Name
represent variable uses) - in general I would make most of my AST structures as patterns (so use Rebind
, Embed
, Shift
and Rec
generously) - it's somehow a better structure. Possibly because unbind :: Fresh m => Bind p t -> m (p, t)
is effectful, whereas taking patterns apart is pure.
Bug reports welcome! I intend to keep working on unbound-generics
.
Also if your binding situation is simple, I wouldn't discount bound
. It's pretty great. Especially if you love polymorphic recursion and you only have a single sort of variable.
In a lot of ways unbound-generics
is more appropriate for languages with exotic binding structure:
Name
s, but substitution is optional!)@lambdageek Is there any performance difference between them? Will bound let us easily handle recursive bindings?
There are also cases in Nix where you can take an "attribute set" (basically just a Map), and convert the value into a binding scope using a with
keyword. I'm wondering how to do that with these binding libraries. Here's an example:
let f = pkgs:
let y = 10;
in
with pkgs;
[ y x ];
in
f { x = 100; }
What Nix requires here is that the attribute set that's passed as an argument to f
, and becomes the value bound to pkgs
for that call, is non-strictly transformed into the dynamic scope made visible by with
only if lookup fails, meaning that when the lookup of y
is done, pkgs
is left unevaluated, but when the lookup for x
is performed and fails, then and only then does pkgs
get forced in order to establish a scope in which x
can be resolved.
@jwiegley
bound pays more attention to performance.
Unbound becomes quadratic in the binding depth if you repeatedly unbind terms and descened under binders: https://github.com/lambdageek/unbound-generics/blob/master/benchmarks/BenchLam.hs
The solution is to implement explicit substitutions (which is essentially bound's trick), but I never did that work and it would require explicit cooperation from user ASTs.
I don't know how recursive bindings work in bound. I think you have to do something adhoc.
with
is a problem. It's basically like open
in Standard ML or import Foo;
in Haskell, but worse because of your crazy dynamic scoping semantics. As an author of a variable binding package I can tell you authoritatively that scoping constructs that bind some names but don't tell you which ones suuuuuuuuuuuuuck.
What Leroy mentions in "A modular module system" is that they do a separate "scoping" pass before they move from their surface syntax to their workaday AST in order to figure out what names mean.
Here's an outline where the internal AST only has to work with statically determined scoping (ie import Foo (a,b,..., z)
instead of import Foo;
)
module AST where
-- uses unbound-generics for variables and binding scopes
-- pattern
data Decl = ...
| FunDecl (Name Exp) (Embed Exp) -- ^ fun f = e
| SimpleImport (Embed ModuleSpec) ![Name Exp]
-- ^ import Foo (a, b, ... z)
-- term
data Module = Module (Bind (Rec [Decl]) ())
-- ^ module where all the declarations are mutually recursive
module SurfaceSyntax where
-- uses Text for variables and leaves binding scope implicit
data Decl = ...
| FunDecl Text Exp
| SimpleImport ...
| EvilImport ModuleSpec -- ^ import Foo (*)
data Module = Module [Decl]
You basically go parse :: Text -> MyParsingMonad SurfaceSyntax.Module
and resolveNames :: SurfaceSyntax.Module -> P m AST.Module
where P
is something like
type ImportResult = Either ImportFileError AST.Module
type P m r = Pipes.Proxy ImportRequest ImportResult () Pipes.X m r
In other words: while you're resolving the names in some surface syntax, you can pause with some ImportRequest
and wait to be fed an AST (corresponding to the content of that import) so you can pull out the names that it will provide and then resume resolving. (This example is from https://github.com/lambdageek/insomnia-dev/blob/master/src/Insomnia/SurfaceSyntax/ToastMonad.hs#L112)
In the case of with
it sounds like you will actually need to suspend all the way down in evaluation just to figure out what your syntax is.
I’m no longer convinced that this issue is necessary, for two reason: We want NExpr to be a “raw” expression tree, for purposes of pretty printing, where it makes no sense to do name resolution. And due to the dynamic nature of Nix, some name lookups will have to be deferred until evaluation anyway. Bound would give us faster substitution when calling into lambdas, but we have the case of attribute sets to deal with, where we can’t use Nats for indexing.
So I’m closing until a compelling reason arises.
Nope, it's necessary, otherwise we re-evaluate expensive function bodies with each new set of arguments.
I saw https://github.com/clash-lang/clash-compiler/pull/361 which is disappointing. I desperately want one of these libraries to win, and end the duplication we have today.
@Ericson2314 Clash might've been one of the worst use-cases with regards to performance of unbound
because it does sooooo many traversals of an expression, going under and over binders all the time.
Given that all of my options to improve Clash' run-time situation basically meant I had to (probably) change nearly every module in the compiler, I went with the option that seemed to be the safest performance-wise: implement the system GHC uses. My particular use case was:
Doing my own implementation has been causing loads of shadowing/capture/free variable issues though! And implementing it has taken me over 5 weeks now; and even though it passes the test suite, I'm quite sure there are still bugs.
So if you're not constantly/repeatedly traversing deeply-nested binding structures, I don't think there's much of a performance penalty when using unbound-generics
; I would definitely recommend it.
Right now we do name lookups in a hash maps, which means that for x: x + x + x, there will be three lookups of the same argument. With unbound-generics we can perform the substitution by using a
subst
function on expressions, which will probably mean introducing two phases of evaluation: one that uses an F-algebraNExprF NExpr -> NExpr
simply to reduce the underlying lambda calculus by performing binds and applications, and then the current evaluator for performing the remaining computations.