Closed namin closed 3 years ago
This is really cool. I will look closer and play with it.
Meanwhile, I think this could use:
The last might need some general mechanism we can reuse, as we start writing more library code.
I need to add support for ?
in symbols in the Rust implementation. I really want to see if this runs there (as it should).
You could always replace ? with p :) I am curious if it runs as well.
I replace ?
with -p
and $
with x
.
Unfortunately, I got an error:
:load ../lurk/example/micro-tests.lurk
Reading from ../lurk/example/micro-tests.lurk.
thread 'main' panicked at 'Unbound variable: Sym("UNIFY")', src/eval.rs:236:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
I suspect this is related to what you fixed in api.lisp
(#4), although that example did not cause a problem. I'll try to see how to apply the changes.
I think getting lang.lisp
fully compatible with eval.rs
(and aspirationally equivalent to api.lisp
) will help us keep up.
Then a test suite that checks all this automatically so we can keep everything in sync once we get there.
I may divert to this next week.
A few things:
unify
is defined after its use in ==
, but I still get similar errors (with other unbound variables) when I change everything to be defined before use.
'Unbound variable: Sym("U")', src/eval.rs:236:17
lang.lisp
, I get the right thing if I re-order all letrec*
bindings to be defined before use. (I also had to define atom
.)lang.lisp
, besides adding a big test there? Can we have a lang.lisp
REPL?So:
lang.lisp
version.lang.lisp
version to support the same language as the other versions.Thanks.
A few things:
unify
is defined after its use in==
, but I still get similar errors (with other unbound variables) when I change everything to be defined before use.
Oh… then maybe the work in #3 and #4 changed letrec*
into letrec
in api.lisp
. I think for now, we want bindings sequential to keep the surface area of initial implementation small.
'Unbound variable: Sym("U")', src/eval.rs:236:17
- in
lang.lisp
, I get the right thing if I re-order allletrec*
bindings to be defined before use. (I also had to defineatom
.)
Okay, good. That might mean that lang.lisp
is correct. We should check #2 against lang.lisp
.
- What's the easiest way to test with
lang.lisp
, besides adding a big test there? Can we have alang.lisp
REPL?
We should have REPL, yes.
So:
- It seems like the Rust version has bigger problems with bindings than the
lang.lisp
version.- We should bring the
lang.lisp
version to support the same language as the other versions.
Yes. I agree 100%. We need to get all three in sync — with eval.rs
and lang.lisp
becoming mechanical translations of each other, with same data structures and control logic. Then it should be reasonably easy to:
api.lisp
,lang.lisp
,eval.rs
,circuit.rs
.
- Ideally, we want to have a test suite of lurk programs that we can test in all versions. I guess that's what you want to diver to next week.
Yes, I think maybe the way to do this is through the REPL interface. We can define some scripting directives and make sure all three REPLs support a similar-enough interface that they can be used non-interactively to run tests. Then we can write test suites and run them against all three standalone engines.
It's probably worth doing that first, since it will simplify the process of bringing everything together.
Longer term, we can probably bring the existing unit tests (which also check iteration counts) into that framework. This will be a good way of further ensuring lang.lisp
and eval.rs
are implementing the same execution semantics.
As a side note, we should add support for comments in the Rust REPL, so we can add them to the .lurk
files.
Also, is there a reason not to make the language a subset of Scheme?
Also, is there a reason not to make the language a subset of Scheme?
Not necessarily. I've considered exactly that. I think there are some design decisions that should be deferred, though.
One rhetorical answer to the question is to note that there is also not a strong reason not to make it a subset of Common Lisp.
It's currently a sort of hybrid, though I'm open to most things changing once we get a first minimal version working end-to-end (meaning all the way to working proofs).
Given that there are good arguments for Scheme-like and CL-like features, and given that some things will need to be dictated by the domain (support for field elements and things that are economical in constraints, etc.) — the safest assumption is still that the 'right thing' might continue to be a (carefully considered) hybrid.
An example of something I'm currently leaning toward being Scheme-like: explicit booleans #t
and #f
. Not only will that simplify things like microKanren (as you noted), but it fits well with the needed type system for the underlying constraint logic. So we could have a boolean type tag, with the requirement that associated values be either 0 or 1. This constraint can be enforced when boolean types are used.
But… from a circuit programming perspective, we will want to be able to perform arithmetic operations on these 'bits' — for example to decompose field elements (or constrained integers) into their constituents and vice versa. We can't exactly do that with Scheme booleans. We could define the functions that perform these operations to use booleans, but it's not entirely obvious that we won't want to be more general in defining how arithmetic works with booleans. Likewise with numbers more generally: the way to handle types will need to fit into what makes sense for the underlying circuits/fields; and it's not obviously the case that this will be quite either a Scheme or Common Lisp 'subset'.
Another consideration is macros. I think we will want low-level support for macros, and CL-style defmacro
will be much simpler to support. But I don't think this plays well with a Lisp-1 (especially) without package namespaces.
Anyway, that's a medium-length answer to the question. I'm open to the possibility but not yet convinced we won't paint ourselves into a corner if we commit to it. Therefore, for now, I'm explicitly seeking something simple, minimal, and likely subject to change so we can defer design decisions we will be better equipped to answer farther down the road.
On the rust side, I managed to reduce the unbound variable u
to this example:
(letrec*
(
(not
(lambda (x) (if x nil t)))
(pairp
(lambda (x) (not (atom x))))
(unify
(lambda (u v s)
(if (if (pairp v) nil nil)
v
u)))
)
(unify 2 3 0))
Inlining pairp
does not cause an error.
Oh good, that should help.
Even simpler
(letrec*
(
(id
(lambda (x) x))
(id2
(lambda (x) (id x)))
(foo
(lambda (u)
(if (id2 0)
u
0)))
)
(foo 0))
I "fixed" it by commenting out line 859: //Continuation::Tail(env, c) => *c.clone(),
That does look like it might be at least one of the divergences that crept in. I'll try to reconcile everything but might not be until early next week. Thank you for digging.
I fixed it for real! Will send a pull request.
Now, microKanren almost runs in lurk-rs. We need to implement zero-arg functions.
Fantastic. I think zero-arg functions are already implemented in lang.lisp
— so if you want to take a crack at putting in eval.rs
, feel free to PR that also.
OK, I'll take on the challenge of implementing zero-arg functions in eval.rs
-- not sure exactly when I'll find the time, but let me reserve the problem.
With the thunk change in rust, the microKanren tests now run. It's slow though :P
> ((== 5 5) empty-state)
((== 5 5) empty-state)
[469 iterations] => (((((VAR DUMMY))) . Fr(0x)))
> ((call/fresh (lambda (q) (== q 5))) empty-state)
((call/fresh (lambda (q) (== q 5))) empty-state)
[690 iterations] => (((((VAR . Fr(0x)) . Fr(0x5)) ((VAR DUMMY))) . Fr(0x1)))
> (take-all (a-and-b empty-state))
(take-all (a-and-b empty-state))
[3111 iterations] => (((((VAR . Fr(0x1)) . Fr(0x5)) ((VAR . Fr(0x)) . Fr(0x7)) ((VAR DUMMY))) . Fr(0x2)) ((((VAR . Fr(0x1)) . Fr(0x6)) ((VAR . Fr(0x)) . Fr(0x7)) ((VAR DUMMY))) . Fr(0x2)))
> (take 1 ((call/fresh (lambda (q) (fives q))) empty-state))
(take 1 ((call/fresh (lambda (q) (fives q))) empty-state))
[1066 iterations] => (((((VAR . Fr(0x)) . Fr(0x5)) ((VAR DUMMY))) . Fr(0x1)))
> (car (((appendo '(a) '(b) '(a b)) empty-state)))
(car (((appendo '(a) '(b) '(a b)) empty-state)))
[9031 iterations] => ((((VAR . Fr(0x2)) B) ((VAR . Fr(0x1))) ((VAR . Fr(0x)) . A) ((VAR DUMMY))) . Fr(0x3))
> (car (((appendo2 '(a) '(b) '(a b)) empty-state)))
(car (((appendo2 '(a) '(b) '(a b)) empty-state)))
[9050 iterations] => ((((VAR . Fr(0x2)) B) ((VAR . Fr(0x1))) ((VAR . Fr(0x)) . A) ((VAR DUMMY))) . Fr(0x3))
> (take 2 (call-appendo empty-state))
(take 2 (call-appendo empty-state))
[12913 iterations] => (((((VAR . Fr(0x)) (VAR . Fr(0x1)) (VAR . Fr(0x2)) (VAR . Fr(0x3))) ((VAR . Fr(0x2)) VAR . Fr(0x3)) ((VAR . Fr(0x1))) ((VAR DUMMY))) . Fr(0x4)) ((((VAR . Fr(0x)) (VAR . Fr(0x1)) (VAR . Fr(0x2)) (VAR . Fr(0x3))) ((VAR . Fr(0x2)) VAR . Fr(0x6)) ((VAR . Fr(0x5))) ((VAR . Fr(0x3)) (VAR . Fr(0x4)) VAR . Fr(0x6)) ((VAR . Fr(0x1)) (VAR . Fr(0x4)) VAR . Fr(0x5)) ((VAR DUMMY))) . Fr(0x7)))
> (take 2 (call-appendo2 empty-state))
(take 2 (call-appendo2 empty-state))
[11842 iterations] => (((((VAR . Fr(0x)) (VAR . Fr(0x1)) (VAR . Fr(0x2)) (VAR . Fr(0x3))) ((VAR . Fr(0x2)) VAR . Fr(0x3)) ((VAR . Fr(0x1))) ((VAR DUMMY))) . Fr(0x4)) ((((VAR . Fr(0x)) (VAR . Fr(0x1)) (VAR . Fr(0x2)) (VAR . Fr(0x3))) ((VAR . Fr(0x3)) (VAR . Fr(0x4)) VAR . Fr(0x6)) ((VAR . Fr(0x2)) VAR . Fr(0x6)) ((VAR . Fr(0x5))) ((VAR . Fr(0x1)) (VAR . Fr(0x4)) VAR . Fr(0x5)) ((VAR DUMMY))) . Fr(0x7)))
>
Yeah, the Rust implementation is deceptively slow — mostly, I think, because it does a lot of unnecessary expensive hashing. The high iteration counts… yes, that will be something to work on. Compilation should help.
@namin I think now that this works end-to-end, we can merge. Would you update to a version (with changed symbol names) that runs in lurk-rs
— and rebase on master? I see you have a working example, so I assume it's just not yet committed.
updated & rebased
Two work-arounds:
The representation for variables is
(var <id>)
. Not ideal in the long term.Because of the mixup between nil and #f, the substitution map has a default dummy entry.