minchingtonak / lambster

Lambda calculus interpreter written in TypeScript
https://lambster.dev
25 stars 0 forks source link

Error: Maximum recursion depth exceeded while reducing term. #17

Open tromp opened 1 year ago

tromp commented 1 year ago

I'd like to reduce

(\b0.(\b1.(\cons.(\cons0.(\fix.(\s0.(\ssucc.(\sieve.(\primes.(\2.(\3.(\4.(\64.(\tk.(\someprimes.(\main.main) someprimes) (primes (64 tk (b0 (b0 b1))))) (\cont.\x.\xs.\z.z x (xs cont))) 4) (2 2)) (\f.\x.f (f (f x)))) (\f.\x.f (f x))) (cons0 (cons0 (sieve s0)))) ((\f.(\x.x x) (\x.f (x x))) (\sieve.\sn.cons b1 ((\ssn.sieve ssn (fix ssn)) (ssucc sn))))) (\sn.\c.\x.\xs.cons x (xs (sn c)))) (\cont.\x.\xs.cons0 (xs cont))) (\f.(\x.x x) (\x.f (x x)))) (\y.\z.z b0 y)) (\x.\y.\z.z x y)) (\x.\y.y)) (\x.\y.x)

which results in normal form \z.z (\x\y.x) (\z.z (\x\y.x) (\z.z (\x\y.y) (\z.z (\x\y.y) (\x\y.y))))

What recursion depth do you need to reduce that and how much larger is it than the limit you set?

RockBrentwood commented 7 months ago

I checked it with Combo (on my page), which uses optimal sub-term sharing. Even with sharing, it's still 269 steps with up to 16 shared sub-terms per expression - the actual expression size can go exponential in the number of shared sub-terms. It's probably going to break any design that doesn't use optimal sub-term sharing.

That means the constructors in the software will have to do term-hashing lookup, to avoid being overloaded.

Combo does so and compiles everything into combinators. Both your lambda-terms evaluate to _0 (_0 (_2 (_2 _1))), where _0 = C (T K), _1 = K I and _2 = C (T _1), with your "normal form" lambda term compiling directly into it.

Another example worth looking at is M M x, where M = S (S S) S and S = \x y z.x z (y z). Its reduction is finite. On the other hand, M M C, where C = \x y z.x z y, is neither finite nor rational, i.e. infinite number of distinct sub-terms will eventually appear in the normal order reduction sequence.

tromp commented 7 months ago

The expression is the prime number sieve from https://github.com/tromp/AIT/blob/master/characteristic_sequences/primes.lam limited to only the first 4 bits of output "0011", since 0 and 1 are not prime while 2 and 3 are. This string is represented as cons 0 (cons 0 (cons 1 (cons 1 nil))), with 0 = K = \x0\x1.x0, and 1 = nil = SK = \x0\x1.x1 Removing the limit on output length yields

(\B0.(\B1.(\cons.(\cons0.(\Y.(\S0.(\Ssucc.(\sieve.(\primes.(\2.(\3.(\4.(\12.(\4k.(\tk.(\someprimes.(\main.main) primes) (primes (4k tk (B0 (B0 B1))))) (\cont.\x.\xs.\z.z x (xs cont))) (12 2)) (\f.3 (4 f))) (2 2)) (\f.\x.f (f (f x)))) (\f.\x.f (f x))) (cons0 (cons0 (sieve S0)))) ((\f.(\x.x x) (\x.f (x x))) (\sieve.\Sn.cons B1 ((\Ssn.sieve Ssn (Y Ssn)) (Ssucc Sn))))) (\Sn.\c.\x.\xs.cons x (xs (Sn c)))) (\cont.\x.\xs.cons0 (xs cont))) (\f.(\x.x x) (\x.f (x x)))) (\y.\z.z B0 y)) (\x.\y.\z.z x y)) (\x.\y.y)) (\x.\y.x)

which computes an infinitely long list of bits 00110101000101000101000100000101000001000101000100000100...

I don't think term-hashing lookup is needed to evaluate this efficiently; my own combinator based BLC interpreter at https://github.com/tromp/AIT/blob/master/uni.c doesn't hash but can output 65536 bits in a few minutes. The other implementations at

https://rosettacode.org/wiki/Universal_Lambda_Machine

don't use hashing either and output hundreds or thousands of bits. They do all use sharing, either in a Krivine machine implementation, or by the built-in evaluation of anonymous lambda functions.

RockBrentwood commented 7 months ago

Sharing ... whatever way. Combo sorts them out with hashing ... actually a kind of homomorphic hashing. That doesn't exclude other ways.

An probable error in your expression: "... (\someprimes.(\main.main) primes) ..." should be "... (\someprimes.(\main.main) someprimes) ...". otherwise "someprimes", "3", "4", "12", "4k" and "tk" would all become superfluous in your expression.

Also, you may not have noticed that you can also factor out the embedded "Y": "... (sieve S0)))) ((\f.(\x.x x) (\x.f (x x))) (\sieve. ..." can be rewritten as "... (sieve S0)))) (Y (\sieve. ...".

An equivalent expression in a cleaner syntax that Combo handles (and that helps you better see and correct the errors and catch the oversights) would be: _0 = \x y.x, _1 = \x y.y, _2 = \f x.f (f x), _3 = \f x.f (f (f x)), _4 = _2 _2, Join = \x y z.z x y, Join0 = \y z.z _0 y, Y = \f.(\x.x x) (\x.f (x x)), s0 = \cont x xs.Join0 (xs cont), ssucc = \sn c x xs.Join x (xs (sn c)), sieve = Y (\sieve sn.ssn = ssucc sn, Join _1 (sieve ssn (Y ssn))), primes = Join0 (Join0 (sieve s0)), tk = \cont x xs z.z x (xs cont), _12 = \f._3 (_4 f), _4096 = _12 _2, Limit = _4096, primes (Limit tk (_0 (_0 _1)))

tromp commented 7 months ago

The expression in https://github.com/tromp/AIT/blob/master/characteristic_sequences/primes.lam

does have some unused bindings, but they get automatically filtered out by the optimizer in my blc tool (as built by "make blc"). They were there for the convenience of truncating the final output.

The optimizer stage comes after the stage that translates to de Bruijn indices, so it wasn't performed when I asked it to print the term with original identifiers. Sorry for the confusion.

RockBrentwood commented 7 months ago

Do you have statistics (time, step count) for different settings of the "Limit" variable I alluded to, above? And, perhaps, also of the term constructor counts? This will provide a better perspective on the issue you raised here. For step count, on Combo, I get a 4-fold increase with each increment to the power of 2 in "Limit". It will begin to thrash, on my machines, at about 8th or 9th power of 2, since Combo uses no GC but keeps everything for perpetual reuse in order to fulfill the promise that "nothing is ever reduced twice".

tromp commented 7 months ago

I get these rewrite counts for various output sizes:

16 steps 1117 time 0ms steps/s 666M #GC 0 HP 2554 64 steps 15104 time 0ms steps/s 666M #GC 0 HP 25962 256 steps 232358 time 2ms steps/s 116M #GC 0 HP 394890 1024 steps 3681537 time 26ms steps/s 141M #GC 6 HP 47808 4096 steps 58780372 time 339ms steps/s 173M #GC 98 HP 727428 16384 steps 940874900 time 5689ms steps/s 165M #GC 1727 HP 923312 65536 steps 2228173302 time 125937ms steps/s 17M #GC 31309 HP 1096424

The other fields are real time, rewrites per second, number of garbage collection passes, and heap space in use at the end.