kevinlawler / kona

Open-source implementation of the K programming language
ISC License

Strange interaction of a local variable with the way a function is called #575

Closed tavmem closed 4 years ago

tavmem commented 4 years ago

This is quite a contrived example, but it illustrates a bug. Consider the following 4 commands:

l:!12                           /a list of 12 integers
s:{l[(4*x)+!4]}                 /a subset of 4 of them
m:{l[(4*x)+!4]:10*a:s[x]}       /multiply by 10 (use local "a" for no good reason)
r:{i:0; while[i<3; m i; i+:1]}  /run that for all 3 subsets

If you execute "r":

  r 0
8 9 10 11

8 9 10 11

8 9 10 11

you get the last value of the local variable "a". Every time you hit "Enter", you get it again.

Changing the way "m" is called

r:{i:0; while[i<3; m[i]; i+:1]}    /run that for all 3 subsets

eliminates the problem. In both cases, local "a" is not in the symbol table:

  \v
`l `s `m `r

Also, in both cases, "r" does modify the global list correctly:

  l
0 10 20 30 40 50 60 70 80 90 100 110
tavmem commented 4 years ago

A simpler case that exhibits the same problem:

kona      \ for help. \\ to exit.

  m:{a:(!12)[x]}
  r:{m x}
  r 5
5

5

5

After executing "r", you get the result again each time you hit "enter".

tavmem commented 4 years ago

Even simpler (remove the "enumerate"):

kona      \ for help. \\ to exit.

  m:{a:(0 1 2 3 4)[x]}
  r:{m x}
  r 2
2

2

2
bakul commented 4 years ago

Note that in k3 assignment doesn't produce a value at the top level in a function, unless it is used in an expression:

  m:{a:12}
  m[]
  n:{+a:12}
  n[]
12

In case this has a bearing on what you're observing.

tavmem commented 4 years ago

Thanks! You've just documented another case where kona differs from k3 (and k2.8):

kona      \ for help. \\ to exit.

  m:{a:12}
  m[]
12
tavmem commented 4 years ago

Getting back to the first case: you can simplify further, eliminating the parentheses and brackets, and still get the same error.

kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}
  r 2
2

2

2
tavmem commented 4 years ago

The first case is a regression.

It works fine in commit 566c9e1f48584821741b27d9ad8ff06b8239ebf3 titled "better for ARM processors and 32-bit Android", made on Jan 20, 2016, and is broken (gives no result at all) in commit 43b5962366f215038d012cdaf665c44fcafec01c titled "fix #353: trial implementation of print suppression", also made on Jan 20, 2016.

tavmem commented 4 years ago

However, the current behavior comes later.

Beginning with commit 1440f28bd769cc4492960da2b22a1989f56d0a04 titled "handle 'x_' case" made on Mar 30, 2016, we get

  m:{a:0 1 2 3 4 x}
  r:{m x}
  r 2
value error

Then with commit aac5b17bc2763104169a389dc5806a4b26e3a86b titled "fix #423: 'value error' for any defined variable" made on Apr 3, 2016 we get no result again.

Beginning with commit 86f82d39ed17d25ea1782c195d372f545c4a3cbb titled "fix #427 and fix #428", made on Apr 11, 2016, you still get no result at first, but hitting "Enter" once more produces it, and each further "Enter" prints the result again:

  m:{a:0 1 2 3 4 x}
  r:{m x}
  r 2

2

2

2

Finally, beginning with commit abc4b58dc61a950d3b8b6b9a48362aa9a711c788 titled "fix #433: _ssr no longer produces output", made on Apr 13, 2016, we get the current behavior: you get the result, and you get it again each time you hit "Enter". But you also get a display of each function as it is defined.

  m:{a:0 1 2 3 4 x}
{a:0 1 2 3 4 x}
  r:{m x}
{m x}
  r 2
2

2

2
tavmem commented 4 years ago

I found out what happens. Add the following line:

$ git diff
diff --git a/src/kc.c b/src/kc.c
index 1e2e00f..c5f67e4 100644
--- a/src/kc.c
+++ b/src/kc.c
@@ -278,6 +278,7 @@ I line(FILE*f, S*a, I*n, PDA*p)     //just starting or just executed: *a=*n=*p=0
     if(o&&k)O("Elapsed: %.7f\n",d);
   #endif
  next:
+  O("check NIL: ");show(NIL);O("\n");
   if(o && fam && !feci)show(k);
   cd(k);
  cleanup:
$ 

Then you get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
check NIL: 
  r:{m x}
check NIL: 
  r 2
check NIL: 2

2
                           // I hit "Enter" here
check NIL: 2

2

Somewhere, the value of NIL is being changed.

Now, we "only" have to find out "where" and "why".

tavmem commented 4 years ago

Found where it happens. Add the following 2 lines:

$ git diff
diff --git a/src/vd.c b/src/vd.c
index b35a4db..6bce6ef 100644
--- a/src/vd.c
+++ b/src/vd.c
@@ -154,7 +154,9 @@ K dot_ref(K *p, K *x, K *z, I s, K c, K y)
       cd(r);
     }
     else
+    { O("before-show(NIL):");show(NIL);O("\n");
     *p=r;
+    O("after-show(NIL):");show(NIL);O("\n"); }
     R NULL;
   }
   //these may turn out to be the "ELSE" case
$ 

Then you get

kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
before-show(NIL):
after-show(NIL):
  r:{m x}
before-show(NIL):
after-show(NIL):
  r 2
before-show(NIL):
after-show(NIL):2

2
                           // I hit "Enter" here  
2

Still need: "Why" does that cause the problem & "How" do we fix it?

tavmem commented 4 years ago

Beginning to address "Why". Add the following 2 lines:

$ git diff
diff --git a/src/vd.c b/src/vd.c
index b35a4db..294d496 100644
--- a/src/vd.c
+++ b/src/vd.c
@@ -9,6 +9,8 @@

 /* dot monadic, dyadic, triadic, tetradic */ 
+extern K sd(K x);

 Z K dot_ref(K *p,K *x,K *z,I s,K c,K y);
 Z K makeable(K a);

@@ -128,6 +130,7 @@ K dot(K a, K b) //NB: b can be a cheating 0-type with NULLs .. ?
 //TODO: catch oom errors etc.
 K dot_ref(K *p, K *x, K *z, I s, K c, K y)
 {
+  O("p: %p     &NIL: %p\n",p,&NIL);  O("sd(*p): ");sd(*p);O("\n");
   K d=*p, f=x?*x:0;
   I dt=d->t, dn=countI(d), ft=999, fn, yn0=0;

You get

kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
p: 0x7f1ede0a0760     &NIL: 0x4516f0
sd(*p):      0x7f1ede0a0040 0x7f1ede0a0058            12-6 6 0   

  r:{m[x]}                                                 // with brackets
p: 0x7f1ede0a06a0     &NIL: 0x4516f0
sd(*p):      0x7f1ede0a0040 0x7f1ede0a0058            15-6 6 0   

  r 2
p: 0x7f1ede074520     &NIL: 0x4516f0
sd(*p):      0x7f1ede074580 0x7f1ede074598            1-6 6 0   

2
  \\
$  
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
p: 0x7f0cd2a32760     &NIL: 0x4516f0
sd(*p):      0x7f0cd2a32040 0x7f0cd2a32058            12-6 6 0   

  r:{m x}                                                  // without brackets
p: 0x7f0cd2a326a0     &NIL: 0x4516f0
sd(*p):      0x7f0cd2a32040 0x7f0cd2a32058            15-6 6 0   

  r 2
p: 0x4516f0     &NIL: 0x4516f0
sd(*p):      0x7f0cd2a32040 0x7f0cd2a32058            16-6 6 0   

2

The parameter (K *p) passed to the function dot_ref always points to a "NULL" value in both cases. But in the "without brackets" case, p is identical to &NIL.
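
Why it matters that p equals &NIL can be illustrated with a minimal C sketch. This is not kona's code: Obj, nil_obj, store, store_guarded and nil_is_intact are hypothetical names, and the struct is a simplified stand-in for kona's K type. The only thing taken from the thread is that dot_ref performs a write of the form *p=r.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified object; kona's real K struct is different. */
typedef struct { int type; long value; } Obj;

static Obj nil_obj = { 6, 0 };  /* plays the role of the shared null object */
static Obj *NIL = &nil_obj;     /* global slot that should never be reassigned */

/* Returns 1 while the global NIL slot still refers to the original null. */
int nil_is_intact(void) { return NIL == &nil_obj; }

/* Models the write `*p=r`: whatever slot p names now refers to r. */
void store(Obj **p, Obj *r) {
  assert(p != NULL);
  *p = r;
}

/* Guarded variant: refuses to overwrite the global NIL slot.
   Returns 1 if the write happened, 0 if it was rejected. */
int store_guarded(Obj **p, Obj *r) {
  if (p == &NIL) return 0;
  *p = r;
  return 1;
}
```

Writing through a private slot leaves NIL alone, while store(&NIL, r) changes what every later reader of NIL sees. That would be consistent with show(NIL) printing 2 in the traces above. The guard is only one possible diagnostic, not a fix taken from the kona repository.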

tavmem commented 4 years ago

The function dot_ref is called by dot_tetradic_2. Adding these 2 lines:

$ git diff
diff --git a/src/vd.c b/src/vd.c
index b35a4db..58f29ec 100644
--- a/src/vd.c
+++ b/src/vd.c
@@ -9,6 +9,8 @@

 /* dot monadic, dyadic, triadic, tetradic */

+extern K sd(K x);

 Z K dot_ref(K *p,K *x,K *z,I s,K c,K y);
 Z K makeable(K a);

@@ -223,6 +225,7 @@ K dot_ref(K *p, K *x, K *z, I s, K c, K y)

 K dot_tetradic_2(K *g, K b, K c, K y)
 {
+  O("g: %p     &NIL: %p\n",g,&NIL);  O("sd(*g): ");sd(*g);O("\n");
   if(c->t==7 && kK(c)[CODE]->t==-4)
   { V q=kV(kS(c)[CODE])[0];
     if(q>(V)500)R SYE;

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
g: 0x7fa4aa20e760     &NIL: 0x4516f0
sd(*g):      0x7fa4aa20e040 0x7fa4aa20e058            12-6 6 0   

  r:{m[x]}                                                       // with brackets
g: 0x7fa4aa20e6a0     &NIL: 0x4516f0
sd(*g):      0x7fa4aa20e040 0x7fa4aa20e058            15-6 6 0   

  r 2
g: 0x7fa4aa1e2520     &NIL: 0x4516f0
sd(*g):      0x7fa4aa1e2580 0x7fa4aa1e2598            1-6 6 0   

2
  \\
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
g: 0x7f0097826760     &NIL: 0x4516f0
sd(*g):      0x7f0097826040 0x7f0097826058            12-6 6 0   

  r:{m x}                                                        // without brackets
g: 0x7f00978266a0     &NIL: 0x4516f0
sd(*g):      0x7f0097826040 0x7f0097826058            15-6 6 0   

  r 2
g: 0x4516f0     &NIL: 0x4516f0
sd(*g):      0x7f0097826040 0x7f0097826058            16-6 6 0   

2

Similarly, in the "without brackets" case, dot_tetradic_2 is handed a pointer g to a NULL value, and g is identical to &NIL.

tavmem commented 4 years ago

The function dot_tetradic_2 is called by ex2. Adding this single line:

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..dd1bebd 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -799,6 +799,7 @@ K ex1(V*w,K k,I*i,I n,I f)//convert verb pieces (eg 1+/) to seven-types,

 Z K ex2(V*v, K k)  //execute words --- all returns must be Ks. v: word list, k: conjunction?
 { K t0,t2,t3,e,u; I i=0; ft3=0;
+  O("*v: %p    &NIL: %p\n",*v,&NIL);
   //TODO: is this messed up ......we can't index like this for (|-+) ?? what about 0-NULL []
   //ci(k) was R 0; ...  put this here for f/[x;y;z]
   if(!v || !*v) R k?(1==k->n)?ci(kK(k)[0]):ci(k):(K)(L)DT_END_OFFSET;

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
*v: 0x7f8a2357e760    &NIL: 0x4516f0
*v: 0x7f8a2357e920    &NIL: 0x4516f0
  r:{m[x]}                                                 // with brackets
*v: 0x7f8a2357e6a0    &NIL: 0x4516f0
*v: 0x7f8a2357ef20    &NIL: 0x4516f0
  r 2
*v: 0x7f8a2357e6a0    &NIL: 0x4516f0
*v: 0x7f8a2357eda0    &NIL: 0x4516f0
*v: 0x7f8a235521e0    &NIL: 0x4516f0
*v: 0x7f8a2357ece0    &NIL: 0x4516f0
*v: 0x7f8a2357e760    &NIL: 0x4516f0
*v: 0x7f8a23552520    &NIL: 0x4516f0
*v: 0x7f8a23552760    &NIL: 0x4516f0
*v: 0x7f8a23552420    &NIL: 0x4516f0
2
  \\
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
*v: 0x7faf5f667760    &NIL: 0x4516f0
*v: 0x7faf5f667920    &NIL: 0x4516f0
  r:{m x}                                                  // without brackets
*v: 0x7faf5f6676a0    &NIL: 0x4516f0
*v: 0x7faf5f667ce0    &NIL: 0x4516f0
  r 2
*v: 0x7faf5f6676a0    &NIL: 0x4516f0
*v: 0x7faf5f667b20    &NIL: 0x4516f0
*v: 0x7faf5f667760    &NIL: 0x4516f0
*v: 0x7faf5f667c60    &NIL: 0x4516f0
*v: 0x4516f0    &NIL: 0x4516f0                  // this is the problem call
*v: 0x7faf5f63b6a0    &NIL: 0x4516f0
*v: 0x7faf5f63b360    &NIL: 0x4516f0
2

There is one call of ex2 (in the without brackets case) where *v is identical to &NIL.
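
The manual comparison of *v against &NIL can also be written as a small check over a whole NULL-terminated word list. The sketch below is an illustration only: V mirrors kona's use of a generic pointer type, but first_alias and the idea of scanning the list in one call are assumptions, not something found in kona's source.

```c
#include <assert.h>
#include <stddef.h>

typedef void *V;   /* generic pointer, in the spirit of kona's V */

/* Returns the index of the first word equal to `target`,
   or -1 if no word in the NULL-terminated list aliases it.
   `target` must not be NULL, because NULL terminates the list. */
long first_alias(V *words, V target) {
  long i;
  assert(target != NULL);
  if (!words) return -1;
  for (i = 0; words[i]; i++)
    if (words[i] == target) return i;
  return -1;
}
```

A probe built this way reports which word aliases the sentinel rather than printing every address, which keeps the output short when the function is called many times.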

tavmem commented 4 years ago

The problematic call to ex2 is made by ex1, which is called by ex0, which is called by ex_, which is called by ex, which is called by vf_ex in the line: ci(fw); stk1++; z=ex(fw); stk1--;

Making these changes to the codebase (comment out 1 line, add 1 line):

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..915d0a7 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -63,7 +63,7 @@ K sd_(K x,I f)
            calf--; )
     CS(-4, if(1)       //(f>-1)
            { V* v=(kV(x));
-             if((v[0]>(V)0x10) & (v[0]<(V)0x5000000)) R 0; //stop, if have string of interned symbols
+             //if((v[0]>(V)0x10) & (v[0]<(V)0x5000000)) R 0; //stop, if have string of interned symbols
              I ii; for(ii=0;v[ii];ii++)
                    { O("     .2%c[%lld]: %p",alf[calf],ii,v[ii]);
                      if(v[ii]>(V)DT_SIZE){ if(calf<1)sd_(*(K*)v[ii],2); else sd_(*(K*)v[ii],1); }
@@ -534,6 +534,7 @@ K vf_ex(V q, K g)
             else{ tc=kclone(tree); fw=wd_(kC(o),o->n,&tc,fc); }
             kV(f)[CACHE_WD]=fw; cd(fc); }
           if(stk1>1e3) { cd(g); kerr("stack"); R _n(); }
+          O("&NIL: %p\n",&NIL); O("sd_(fw,2):");sd_(fw,2);O("\n");
           ci(fw); stk1++; z=ex(fw); stk1--;
           DO(p->n, e=EVP(DI(tree,i)); cd(*e); *e=0; )
           stk--; ) }

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m[x]}                                       // with brackets
  r 2
&NIL: 0x4516f0
sd_(fw,2):     0x7f7184b04380 0x7f7184b04398            1-7 7 0   
     a0:    0x7f7184b04398     .k
     a1:    0x7f7184b043a0     (nil)
     a2:    0x7f7184b043a8     0x7f7184b036c0 0x7f7184b036d8            1-6 -4 5   `"�5��q" 0x3c ` `"�4��q" (nil)  
     .2a[0]: 0x7f7184b03520     0x7f7184b03580 0x7f7184b03598            1-6 6 0   
     .2a[1]: 0x3c
     .2a[2]: 0x7f7184b03760     0x7f7184b03700 0x7f7184b03718            1-6 -1 5   0 1 2 3 4
     .2a[3]: 0x7f7184b03420     0x7f7184b03480 0x7f7184b03498            1-6 1 1   2
     a3:    0x7f7184b043b0     0x7f7184b03600 0x7f7184b03618            1-6 5 1   
.,(`
   0 1 2 3 4
   )
 0x7f7184b03618     0x7f7184b03740 0x7f7184b03758            1-6 0 3   
(`
 0 1 2 3 4
 )
 0x7f7184b03768     0x7f7184b03780 0x7f7184b03798 0x15c7290  1-6 4 1   `
 0x7f7184b03760     0x7f7184b03700 0x7f7184b03718            1-6 -1 5   0 1 2 3 4
 0x7f7184b03758     0x7f7184b2f040 0x7f7184b2f058            20-6 6 0   
     a4:    0x7f7184b043b8     0x7f7184b03640 0x7f7184b03658            1-6 5 0   
.()
     a5:    0x7f7184b043c0     
     a6:    0x7f7184b043c8     
     a7:    0x7f7184b043d0 
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}                                        // without brackets
  r 2
&NIL: 0x4516f0
sd_(fw,2):     0x7f75c41da280 0x7f75c41da298            1-7 7 0   
     a0:    0x7f75c41da298     .k
     a1:    0x7f75c41da2a0     (nil)
     a2:    0x7f75c41da2a8     0x7f75c41d9600 0x7f75c41d9618            1-6 -4 5   `"@P �u" 0x3c `"@��u" `"���u" (nil)  
     .2a[0]: 0x4516f0     0x7f75c4205040 0x7f75c4205058            16-6 6 0   
     .2a[1]: 0x3c
     .2a[2]: 0x7f75c41d96a0     0x7f75c41d9640 0x7f75c41d9658            1-6 -1 5   0 1 2 3 4
     .2a[3]: 0x7f75c41d9360     0x7f75c41d93c0 0x7f75c41d93d8            1-6 1 1   2
     a3:    0x7f75c41da2b0     0x7f75c41d9540 0x7f75c41d9558            1-6 5 1   
.,(`
   0 1 2 3 4
   )
 0x7f75c41d9558     0x7f75c41d9680 0x7f75c41d9698            1-6 0 3   
(`
 0 1 2 3 4
 )
 0x7f75c41d96a8     0x7f75c41d96c0 0x7f75c41d96d8 0xde5290  1-6 4 1   `
 0x7f75c41d96a0     0x7f75c41d9640 0x7f75c41d9658            1-6 -1 5   0 1 2 3 4
 0x7f75c41d9698     0x7f75c4205040 0x7f75c4205058            16-6 6 0   
     a4:    0x7f75c41da2b8     0x7f75c41d9580 0x7f75c41d9598            1-6 5 0   
.()
     a5:    0x7f75c41da2c0     
     a6:    0x7f75c41da2c8     
     a7:    0x7f75c41da2d0 

The important part is the 2 lines labelled .2a[0]. In the "without brackets" case, the address of the NULL is identical to &NIL. Since fw is the parameter for the call to ex, which is the beginning of the execution module, the problem is caused earlier, and is probably in the parser module.

tavmem commented 4 years ago

So ... where is the parser module called to create fw? Make 2 more changes to the codebase in addition to the ones made just above:

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..21e1af0 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -63,7 +63,7 @@ K sd_(K x,I f)
            calf--; )
     CS(-4, if(1)       //(f>-1)
            { V* v=(kV(x));
-             if((v[0]>(V)0x10) & (v[0]<(V)0x5000000)) R 0; //stop, if have string of interned symbols
+             //if((v[0]>(V)0x10) & (v[0]<(V)0x5000000)) R 0; //stop, if have string of interned symbols
              I ii; for(ii=0;v[ii];ii++)
                    { O("     .2%c[%lld]: %p",alf[calf],ii,v[ii]);
                      if(v[ii]>(V)DT_SIZE){ if(calf<1)sd_(*(K*)v[ii],2); else sd_(*(K*)v[ii],1); }
@@ -530,10 +530,11 @@ K vf_ex(V q, K g)
             { if(kC(o)[i]=='{')
               { tt=1;
                 if(kC(o)[i+1]==':'){ ttt=1; break; } } }
-            if(!ttt && (!grnt || tt || kC(o)[0]=='[')) fw=wd_(kC(o),o->n,&tree,fc);
-            else{ tc=kclone(tree); fw=wd_(kC(o),o->n,&tc,fc); }
+            if(!ttt && (!grnt || tt || kC(o)[0]=='[')) { O("if\n"); fw=wd_(kC(o),o->n,&tree,fc) ;}
+            else{ O("else\n"); tc=kclone(tree); fw=wd_(kC(o),o->n,&tc,fc); }
             kV(f)[CACHE_WD]=fw; cd(fc); }
           if(stk1>1e3) { cd(g); kerr("stack"); R _n(); }
+          O("&NIL: %p\n",&NIL); O("sd_(fw,2):");sd_(fw,2);O("\n");
           ci(fw); stk1++; z=ex(fw); stk1--;
           DO(p->n, e=EVP(DI(tree,i)); cd(*e); *e=0; )
           stk--; ) }

fw is created in either the if line or the else line. Running this we get

else
&NIL: 0x4516f0
sd_(fw,2):     0x7f612f2cb280 0x7f612f2cb298            1-7 7 0   
     a0:    0x7f612f2cb298     .k
     a1:    0x7f612f2cb2a0     (nil)
     a2:    0x7f612f2cb2a8     0x7f612f2ca600 0x7f612f2ca618            1-6 -4 5   `"@`//a" 0x3c `"@�,/a" `"��,/a" (nil)  
     .2a[0]: 0x4516f0     0x7f612f2f6040 0x7f612f2f6058            16-6 6 0   
     .2a[1]: 0x3c
     .2a[2]: 0x7f612f2ca6a0     0x7f612f2ca640 0x7f612f2ca658            1-6 -1 5   0 1 2 3 4
     .2a[3]: 0x7f612f2ca360     0x7f612f2ca3c0 0x7f612f2ca3d8            1-6 1 1   2
     a3:    0x7f612f2cb2b0     0x7f612f2ca540 0x7f612f2ca558            1-6 5 1   
.,(`
   0 1 2 3 4
   )
 0x7f612f2ca558     0x7f612f2ca680 0x7f612f2ca698            1-6 0 3   
(`
 0 1 2 3 4
 )
 0x7f612f2ca6a8     0x7f612f2ca6c0 0x7f612f2ca6d8 0x7bc290  1-6 4 1   `
 0x7f612f2ca6a0     0x7f612f2ca640 0x7f612f2ca658            1-6 -1 5   0 1 2 3 4
 0x7f612f2ca698     0x7f612f2f6040 0x7f612f2f6058            16-6 6 0   
     a4:    0x7f612f2cb2b8     0x7f612f2ca580 0x7f612f2ca598            1-6 5 0   
.()
     a5:    0x7f612f2cb2c0     
     a6:    0x7f612f2cb2c8     
     a7:    0x7f612f2cb2d0 

The else line does it. wd_ is the parser module, and the call is fw=wd_(kC(o),o->n,&tc,fc); where (from src/p.c) the signature is K wd_(S s, int n, K*dict, K func) //parse: s input string, n length;

tavmem commented 4 years ago

We believe that the problem is in the parser. If we add this line to the codebase

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..3736eb3 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -530,6 +530,7 @@ K vf_ex(V q, K g)
             { if(kC(o)[i]=='{')
               { tt=1;
                 if(kC(o)[i+1]==':'){ ttt=1; break; } } }
+            O("kC(o): %s    o->n: %lld\n",kC(o),o->n);
             if(!ttt && (!grnt || tt || kC(o)[0]=='[')) fw=wd_(kC(o),o->n,&tree,fc);
             else{ tc=kclone(tree); fw=wd_(kC(o),o->n,&tc,fc); }
             kV(f)[CACHE_WD]=fw; cd(fc); }

then, we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}
  r 2
kC(o): m x    o->n: 3
kC(o): a:0 1 2 3 4 x    o->n: 13
2

We know from the prior tests that we are interested in the 2nd time this code is executed. We know that o->n is 13 in both cases. So, if we (instead) add these 2 lines to the codebase, we can check if the parser inputs differ.

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..ae9b860 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -530,6 +530,8 @@ K vf_ex(V q, K g)
             { if(kC(o)[i]=='{')
               { tt=1;
                 if(kC(o)[i+1]==':'){ ttt=1; break; } } }
+            if(13==o->n){ O("kC(o): %s    o->n: %lld    &tree: %p\n",kC(o),o->n,&tree);
+                          O("sd(tree):");sd(tree); O("sd(fc):");sd(fc); }
             if(!ttt && (!grnt || tt || kC(o)[0]=='[')) fw=wd_(kC(o),o->n,&tree,fc);
             else{ tc=kclone(tree); fw=wd_(kC(o),o->n,&tc,fc); }
             kV(f)[CACHE_WD]=fw; cd(fc); }

Then, we can test "with brackets" and "without brackets".

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m[x]}                                      // with brackets
  r 2
kC(o): a:0 1 2 3 4 x    o->n: 13    &tree: 0x7ffe73697850
sd(tree):     0x7f1c69960e80 0x7f1c69960e98            1-6 5 2   
.((`x;2;)
  (`a;;))
sd(fc):     0x7f1c69935000 0x7f1c69935018            1-7 7 3   {a:0 1 2 3 4 x}
2
  \\
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}                                       // without brackets
  r 2
kC(o): a:0 1 2 3 4 x    o->n: 13    &tree: 0x7ffea147b370
sd(tree):     0x7fb4506ec0c0 0x7fb4506ec0d8            1-6 5 2   
.((`x;2;)
  (`x;2;))
sd(fc):     0x7fb4506ed000 0x7fb4506ed018            1-7 7 3   {a:0 1 2 3 4 x}
2

The contents of the tree differ.

tavmem commented 4 years ago

The problem may or may not be the parser, but one of the parser inputs (tree) differs. tree initially comes from kV(f)[CACHE_TREE]. Adding the following line to the codebase:

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..3cfa8d6 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -508,6 +508,7 @@ K vf_ex(V q, K g)
           if(stk > 2e6) { kerr("stack"); GC; }
           stk++;
           I j=0; K*e; K fw;
+          O("sd(kV(f)[CACHE_TREE]):");sd(kV(f)[CACHE_TREE]);
           if(!(tree=kV(f)[CACHE_TREE]))   //could merge this and and CACHE_WD check by duplicating the arg merge DO
           { tree=newK(5,p->n+s->n); if(!tree) { stk--; GC; } //note: cleanup is unusual -- could turn into double labels
             DO(tree->n, if(!(kK(tree)[i]=newK(0,3))) { cd(tree); stk--; GC;} ) //shallow dict copy -- dictionary entry pool?

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m[x]}                         // with brackets
  r 2
sd(kV(f)[CACHE_TREE]):     
sd(kV(f)[CACHE_TREE]):     
2
  \\
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}                         // without brackets
  r 2
sd(kV(f)[CACHE_TREE]):     
sd(kV(f)[CACHE_TREE]):     0x7f904bbc50c0 0x7f904bbc50d8            1-6 5 2   
.((`x;;)
  (`x;2;))
2

That line gets executed twice. kV(f)[CACHE_TREE] is null both times in the "with brackets" case. kV(f)[CACHE_TREE] is NOT null the 2nd time in the "without brackets" case. Why?
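
The question hinges on a lazily filled cache slot: the first lookup finds NULL and builds a tree, and later lookups reuse whatever is stored. Here is a minimal sketch of that pattern, assuming nothing about kona beyond the NULL test visible in the diff. Func, get_tree and the int payload are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical function object with one cache slot. */
typedef struct { const char *body; int *cache_tree; } Func;

/* Fetches the tree for f, building it on first use.
   Returns 1 if built now, 0 if reused, -1 on allocation failure. */
int get_tree(Func *f, int **out) {
  assert(f != NULL && out != NULL);
  if (f->cache_tree) { *out = f->cache_tree; return 0; }
  f->cache_tree = calloc(2, sizeof *f->cache_tree);
  if (!f->cache_tree) return -1;
  *out = f->cache_tree;
  return 1;
}
```

With this pattern, any other code path that fills the slot before the expected first use makes the callee silently take the "reuse" branch. That would correspond to the non-NULL CACHE_TREE seen on the 2nd execution in the "without brackets" case, although the sketch does not show who fills it.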

tavmem commented 4 years ago

f is created in vf_ex in the line K f=(K)(*(V*)q); where q was a parameter in K vf_ex(V q, K g)

Adding these 3 lines at the beginning of vf_ex

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..3f0b120 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -402,6 +402,9 @@ K dv_ex(K a, V *p, K b)

 K vf_ex(V q, K g)
 { K tc=0;
+  O("\nsd( (K)(*(V*)q) ):");sd( (K)(*(V*)q) );
+  if(7==((K)(*(V*)q))->t)
+  { O("sd(kV((K)(*(V*)q))[CACHE_TREE]):");sd(kV((K)(*(V*)q))[CACHE_TREE]); }
   if(interrupted){ interrupted=0; R BE; }
   if(!g) R 0; //??? R w converted to type7...or ?
   K z=0; U(g=promote(g))  I gn=g->n,k=sva(q),n=-1,j=0;

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m[x]}                 // with brackets
  r 2

sd( (K)(*(V*)q) ):     0x7fb89c759180 0x7fb89c759198            3-7 7 3   {m[x]}
sd(kV((K)(*(V*)q))[CACHE_TREE]):     

sd( (K)(*(V*)q) ):     0x7fb89c759080 0x7fb89c759098            3-7 7 3   {a:0 1 2 3 4 x}
sd(kV((K)(*(V*)q))[CACHE_TREE]):     

sd( (K)(*(V*)q) ):     0x7fb89c758700 0x7fb89c758718            2-6 -1 5   0 1 2 3 4
2
  \\
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}                  // without brackets
  r 2

sd( (K)(*(V*)q) ):     0x7f14fc759180 0x7f14fc759198            3-7 7 3   {m x}
sd(kV((K)(*(V*)q))[CACHE_TREE]):     

sd( (K)(*(V*)q) ):     0x7f14fc759080 0x7f14fc759098            3-7 7 3   {a:0 1 2 3 4 x}
sd(kV((K)(*(V*)q))[CACHE_TREE]):     0x7f14fc7580c0 0x7f14fc7580d8            1-6 5 2   
.((`x;;)
  (`x;2;))

sd( (K)(*(V*)q) ):     0x7f14fc758640 0x7f14fc758658            2-6 -1 5   0 1 2 3 4
2

This shows vf_ex is called 3 times. The aberrant CACHE_TREE is passed to vf_ex in the parameter q on the 2nd call.

tavmem commented 4 years ago

vf_ex is called by dv_ex. Its signature is K dv_ex(K a, V *p, K b). Adding these 3 lines to the codebase:

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..3487bb1 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -310,6 +310,9 @@ Z K eachpair2(K a, V *p, K b)  //2==k necessary?
 //TODO: consider merging dv_ex with vf_ex
 K dv_ex(K a, V *p, K b)
 { if(!p || !*p) R 0; //TODO: ???
+  O("\nsd( *(K*)p[0] ):");sd( *(K*)p[0] );
+  if(7==( *(K*)p[0] )->t)
+  { O("sd( *(K*)p[0] )[CACHE_TREE]):"); sd( kV(*(K*)p[0])[CACHE_TREE] ); }
   U(b)  V *o=p-1;
   //Arity of V?A_1...A_n-1 for X V?A_1...A_n Y; 0 for X Y, X A Y
   I k=0; K w;

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m[x]}                               // with brackets
  r 2

sd( *(K*)p[0] ):     0x7f05f13d7180 0x7f05f13d7198            3-7 7 3   {m[x]}
sd( *(K*)p[0] )[CACHE_TREE]):     

sd( *(K*)p[0] ):     0x7f05f13d6700 0x7f05f13d6718            2-6 -1 5   0 1 2 3 4
2
  \\
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}                               // without brackets
  r 2

sd( *(K*)p[0] ):     0x7f48b7713180 0x7f48b7713198            3-7 7 3   {m x}
sd( *(K*)p[0] )[CACHE_TREE]):     

sd( *(K*)p[0] ):     0x7f48b7713080 0x7f48b7713098            3-7 7 3   {a:0 1 2 3 4 x}
sd( *(K*)p[0] )[CACHE_TREE]):     0x7f48b77120c0 0x7f48b77120d8            1-6 5 2   
.((`x;;)
  (`x;2;))

sd( *(K*)p[0] ):     0x7f48b7712640 0x7f48b7712658            2-6 -1 5   0 1 2 3 4
2

We see that the aberrant CACHE_TREE is passed to dv_ex as the first element of the parameter p. dv_ex is called by ex2, whose signature is K ex2(V*v, K k). We will examine ex2 next.

tavmem commented 4 years ago

Adding these 3 lines to the codebase

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..0315073 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -802,6 +802,9 @@ Z K ex2(V*v, K k)  //execute words --- all returns must be Ks. v: word list, k:
   //TODO: is this messed up ......we can't index like this for (|-+) ?? what about 0-NULL []
   //ci(k) was R 0; ...  put this here for f/[x;y;z]
   if(!v || !*v) R k?(1==k->n)?ci(kK(k)[0]):ci(k):(K)(L)DT_END_OFFSET;
+  if(7==( *(K*)v[0] )->t)
+  { O("sd( *(K*)v[0] ):");sd( *(K*)v[0] );
+    O("sd( *(K*)v[0] )[CACHE_TREE]):"); sd( kV( *(K*)v[0] )[CACHE_TREE] ); O("\n"); }
     //? '1 + _n' -> domain err, '1 +' -> 1+ . but '4: . ""' -> 6
   if(bk(*v)) R *v;  // ; case
   if(!v[1] && !k)

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
sd( *(K*)v[0] ):     0x7f3aee1a3080 0x7f3aee1a3098            1-7 7 3   {a:0 1 2 3 4 x}
sd( *(K*)v[0] )[CACHE_TREE]):     

  r:{m[x]}
sd( *(K*)v[0] ):     0x7f3aee1a3180 0x7f3aee1a3198            1-7 7 3   {m[x]}
sd( *(K*)v[0] )[CACHE_TREE]):     

  r 2
sd( *(K*)v[0] ):     0x7f3aee1a3180 0x7f3aee1a3198            1-7 7 3   {m[x]}
sd( *(K*)v[0] )[CACHE_TREE]):     

sd( *(K*)v[0] ):     0x7f3aee1a3280 0x7f3aee1a3298            1-7 7 0   
sd( *(K*)v[0] )[CACHE_TREE]):     

sd( *(K*)v[0] ):     0x7f3aee1a3080 0x7f3aee1a3098            1-7 7 3   {a:0 1 2 3 4 x}
sd( *(K*)v[0] )[CACHE_TREE]):     

2
  \\
$ 
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
sd( *(K*)v[0] ):     0x7f2a955ef080 0x7f2a955ef098            1-7 7 3   {a:0 1 2 3 4 x}
sd( *(K*)v[0] )[CACHE_TREE]):     

  r:{m x}
sd( *(K*)v[0] ):     0x7f2a955ef180 0x7f2a955ef198            1-7 7 3   {m x}
sd( *(K*)v[0] )[CACHE_TREE]):     

  r 2
sd( *(K*)v[0] ):     0x7f2a955ef180 0x7f2a955ef198            1-7 7 3   {m x}
sd( *(K*)v[0] )[CACHE_TREE]):     

sd( *(K*)v[0] ):     0x7f2a955ef080 0x7f2a955ef098            1-7 7 3   {a:0 1 2 3 4 x}
sd( *(K*)v[0] )[CACHE_TREE]):     

2

All occurrences of [CACHE_TREE] for {a:0 1 2 3 4 x} are identical and are NULL. This indicates that the problem arises somewhere in Z K ex2(V*v, K k), after ex2 is called, and before it calls dv_ex.

Note that after the command r 2, the "with brackets" case shows 3 calls to ex2 where v[0] is type-7, but the "without brackets" case only shows 2. That could be a clue (but the additional call is probably just handling the brackets).

tavmem commented 4 years ago

Found where [CACHE_TREE] is created. Add the following 6 lines to the codebase

$ git diff
diff --git a/src/kx.c b/src/kx.c
index 024a1b2..1e9d117 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -900,6 +900,9 @@ Z K ex2(V*v, K k)  //execute words --- all returns must be Ks. v: word list, k:
   while(adverbClass(v[1+i])) i++; //ALT'Y: i=adverbClass(b)?i+1:0;
   t2=ex2(v+1+i,k); //oom. these cannot be placed into single function call b/c order of eval is unspecified
   t3=ex_(*v,1); if(t3<(K)DT_SIZE)ft3=1;
+  if(7==t3->t && 13==(kK(t3)[CODE])->n)
+  { O("\nbefore-sd(t3):");sd(t3);
+    O("sd( kV(t3)[CACHE_TREE] ):"); sd( kV(t3)[CACHE_TREE] ); }
   if(t3>(K)DT_SIZE && t3->t==7 && t3->n==3)
   { if(kV(t3)==kV(grnt))
     { if(cls) cd(cls);
@@ -928,6 +931,9 @@ Z K ex2(V*v, K k)  //execute words --- all returns must be Ks. v: word list, k:
       else grnt=prnt; }
     prnt=ci(t3); }
   u=*v; //Fixes a bug, see above. Not thread-safe. Adding to LOCALS probably better
+  if(7==t3->t && 13==(kK(t3)[CODE])->n)
+  { O("\nafter--sd(t3):");sd(t3);
+    O("sd( kV(t3)[CACHE_TREE] ):");sd(kV(t3)[CACHE_TREE]);O("\n"); }
   *v=VA(t3)?t3:(V)&t3;
   if(*(v+i)==(V)offsetEach && !grnt) grnt=ci(prnt);
   e=dv_ex(0,v+i,t2); *v=u;

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m x}
  r 2

before-sd(t3):     0x7f16c3717080 0x7f16c3717098            2-7 7 3   {a:0 1 2 3 4 x}
sd( kV(t3)[CACHE_TREE] ):     

after--sd(t3):     0x7f16c3717080 0x7f16c3717098            3-7 7 3   {a:0 1 2 3 4 x}
sd( kV(t3)[CACHE_TREE] ):     0x7f16c37160c0 0x7f16c37160d8            1-6 5 2   
.((`x;;)
  (`x;2;))

2

t3 is created immediately preceding display of the "before" state. t3 is used to create *v immediately following display of the "after" state, and *v is then used in the call to dv_ex.
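
The before/after probes follow a general pattern: snapshot one field, run the suspect step, compare. Here is a sketch of that pattern in isolation, with hypothetical names (Obj, Step, step_changes_cache, bump_refs, fill_cache) that do not come from kona:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int refs; void *cache; } Obj;   /* hypothetical object */
typedef void (*Step)(Obj *);                      /* a suspect operation */

/* Runs `step` on `o` and reports whether it changed o->cache. */
int step_changes_cache(Obj *o, Step step) {
  void *before;
  assert(o != NULL && step != NULL);
  before = o->cache;
  step(o);
  return before != o->cache;
}

/* Two example steps: one only touches the reference count,
   the other fills the cache slot as a side effect. */
static int dummy_tree;
void bump_refs(Obj *o)  { o->refs++; }
void fill_cache(Obj *o) { o->cache = &dummy_tree; }
```

Wrapping successively smaller regions of code as the step narrows down where the mutation occurs, which is what the two O(...)/sd(...) probes in the diff do by hand.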

Note, that when we try the "with brackets" case, these code sections are not executed at all.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  m:{a:0 1 2 3 4 x}
  r:{m[x]}
  r 2
2