kevinlawler / kona

Open-source implementation of the K programming language
ISC License
1.36k stars 138 forks source link

early binding #537

Closed tavmem closed 4 years ago

tavmem commented 5 years ago

This issue was documented by @bakul as part of issue #521 and in some other issue. Making it a separate issue so we don't lose track of it again:

The following snippet will behave differently in k3 and kona:

f:{:[x>0;2*f[x-1];1]} g:f f:{0} g 4

In processing lambda expression on RHS, kona replaces the inner f with self reference. This changes the definition of g to a recursive one so redefining f later has no effect on g and g 4 yields 16. Not so in k3, where g 4 will yield 0. I alluded to this early binding issue in an old bug report (don’t recall which one now). To see the difference, compare the contents of .k for both interpreters after each definition. Do the same for mutually recursive functions.

tavmem commented 5 years ago
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  f:{:[x>0;2*f[x-1];1]}
  g:f
  f:{0}
  g 4
16
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems 
\ for help. \\ to exit.

  f:{:[x>0;2*f[x-1];1]}
  g:f
  f:{0}
  g 4
0
bakul commented 5 years ago

Easier to what happens this way:

  foo:{:[x>0;2*foo[x-1];1]}
  g:foo
  g

k3:

{:[x>0;2+foo[x-1];1]}

kona:

{:[x>0;2+_f[x-1];1]}

So instead of g calling foo within its body, here g calls itself.

tavmem commented 5 years ago

Thanks ... Kona uses the "_f" strategy as a workaround for some difficult memory leak issues. It is easy enough to disable the workaround.

When disabling the workaround and only running these 6 tests in src/test.c

Z I tests02()
{
  TC(16,foo:{:[x>0;2*foo[x-1];1]}; f 4)        //passes
  TC(16,foo:{:[x>0;2*_f[x-1];1]}; f 4)         //passes
  TC(6, fac:{[n] :[n=0;1;n*fac n-1]}; fac 3)    //fails - memory leak
  TC(6, fac:{[n] :[n=0;1;n*_f n-1]}; fac 3)     //passes
  TC(6, fac:{:[x>1;x*fac[x-1];1]}; fac 3)       //passes
  TC(6, fac:{:[x>1;x*fac x-1;1]}; fac 3)        //fails - memory leak
  R 0;

get the following resuts:

Failed: Memory Leak - 6, fac:{[n] :[n=0;1;n*fac n-1]}; fac 3 
Allocated K: 319
Unfreed K  : 23
Leak %     : 0.072100
...
Failed: Memory Leak - 6, fac:{:[x>1;x*fac x-1;1]}; fac 3 
Allocated K: 304
Unfreed K  : 11
Leak %     : 0.036184
...
Test pass rate: 0.6667, Total: 6, Passed: 4, Skipped: 0, Failed: 2, Time: 0.003503s
fail

Looks like it's time to "bite the bullet" and figure out where the memory leaks arise, and get rid of the "_f" workaround.

Maybe, install my own copy of valgrind. (I'm using Fedora now, instead of Gentoo. It should be a lot easier to install.)

tavmem commented 5 years ago

Valgrind was not that helprul:

$ valgrind --leak-check=full ./k_test </dev/null
==3596== Memcheck, a memory error detector
==3596== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3596== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==3596== Command: ./k_test
==3596== 
...
Test pass rate: 0.6667, Total: 6, Passed: 4, Skipped: 0, Failed: 2, Time: 0.182103s
fail
  ==3806== 
==3806== HEAP SUMMARY:
==3806==     in use at exit: 272 bytes in 1 blocks
==3806==   total heap usage: 688 allocs, 687 frees, 31,916 bytes allocated
==3806== 
==3806== 272 bytes in 1 blocks are possibly lost in loss record 1 of 1
==3806==    at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==3806==    by 0x40122C1: allocate_dtv (in /usr/lib64/ld-2.29.so)
==3806==    by 0x4012C31: _dl_allocate_tls (in /usr/lib64/ld-2.29.so)
==3806==    by 0x49B413E: pthread_create@@GLIBC_2.2.5 (in /usr/lib64/libpthread-2.29.so)
==3806==    by 0x41FCF4: attend (kc.c:414)
==3806==    by 0x4486D0: main (main.c:8)
==3806== 
==3806== LEAK SUMMARY:
==3806==    definitely lost: 0 bytes in 0 blocks
==3806==    indirectly lost: 0 bytes in 0 blocks
==3806==      possibly lost: 272 bytes in 1 blocks
==3806==    still reachable: 0 bytes in 0 blocks
==3806==         suppressed: 0 bytes in 0 blocks
==3806== 
==3806== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
tavmem commented 5 years ago

Just for documentation: To disable the "_f" workaround:

diff --git a/src/k.c b/src/k.c
index dade916..f0ed0ed 100644
--- a/src/k.c
+++ b/src/k.c
@@ -49,9 +49,10 @@ I miN(I a,I b){R a<b?a:b;}

 K X(S s){kerr("(nil)"); fnci=0; R XN(s,strlen(s));}
 Z K XN(S s,I n){  //asserts ex(x) has first-line U(x)
-  S newS=recur(s);
-  if(newS) {K r=ex(wd(newS,strlen(newS))); free(newS); R r;}
-  else  R ex(wd(s,n));
+  //S newS=recur(s);
+  //if(newS) {K r=ex(wd(newS,strlen(newS))); free(newS); R r;}
+  //else  R ex(wd(s,n));
+  R ex(wd(s,n));
 }
 K KX(K x){R XN(CSK(x),xn);}  //assumes 3==ABS(xt)

diff --git a/src/kc.c b/src/kc.c
index b9b2d00..ff1a3ae 100644
--- a/src/kc.c
+++ b/src/kc.c
@@ -266,7 +266,7 @@ I line(FILE*f, S*a, I*n, PDA*p) {  //just starting or just executed: *a=*n=*p=0,
   if(n && '\n'==(*a)[*n-1]) (*a)[--*n]=0;   //chop for getline

   trim(*a); //remove leading blanks
-  S newA=recur(*a); if(newA){ free(*a); *a=newA; }  //check & fix 'Named Recursion' (issue #288)
+  //S newA=recur(*a); if(newA){ free(*a); *a=newA; }  //check & fix 'Named Recursion' (issue #288)
   *n=strlen(*a); //strlen might have been changed in 'trim' or in 'recur'
   if((*a)[0]=='\\')fbs=1; else fbs=0;
tavmem commented 5 years ago

I'm surprised that the "_f" workaround lasted as long as it did. It dates back to Jan 30, 2015 as a fix for crash issue #288 in commit 328648398577fc18246e8c65f5bb799aaf7da2b5.

tavmem commented 4 years ago

The original case for crash issue #288, now works fine in Kona when the "_f" workaround is disabled, and it does not cause a memory leak.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  r:{:[x;x,r[x-1];0]}; r[4]
4 3 2 1 0
tavmem commented 4 years ago

However, both of these do cause a memory leak in this form (no bracket around x-1):

  TC(4 3 2 1 0, r:{:[x;x,r x-1;0]}; r[4] )
  TC(16, foo:{:[x>0;2*foo x-1;1]}; foo 4)

but, this one does not cause a memory leak

  TC(16, foo:{:[x>0;2* _f x-1 ;1]}; foo 4)
bakul commented 4 years ago

I tried running Ackerman's function to see if there are any leaks (and to see function call overhead without very deep nesting)...

  c:0
  ack:{c+:1;:[x;:[y;_f[x-1;_f[x;y-1]];_f[x-1;1]];y+1]}
  ack[3;5]

This is significantly slower than k3 and uses much more memory. The ps command reveals RSS of 2240KB & 77ms for k3 and 37348KB & 25281ms for kona on FreeBSD-x86. For ack[3;6] k3 RS is 2336KB and 314 ms, but kona ran out of stack. Note that stack depth is only 510 frames for ack[3;6] as can be measured by the following code under k3. Note that even on an amd64 machine kona runs out of stack (on amd64 machine \w shows max used of 206MB).

  c:0;d:0
  ack:{:[d<z;d+:1];c+:1;:[x;:[y;_f[x-1;_f[x;y-1;z+1];z+1];_f[x-1;1;z+1]];y+1]}
  \t ack[3;6;0]

Looks like the stack error may be a separate issue but there is either a memory leak or the function call overhead (in time as well as space) is quite high.