kevinlawler / kona

Open-source implementation of the K programming language
ISC License
1.36k stars 138 forks source link

Recursive call overrides earlier data #521

Closed hoosierEE closed 4 years ago

hoosierEE commented 6 years ago

I'm trying to parse an expression tree like f(g(), h()) using recursive descent, but instead of getting a tree like this:

   f
 /   \
g     h

I get something more like this:

   h
 / 
h

This is the full source code:

tokens: ((`id;"F")
(`oparen;"(")
(`id;"G")
(`oparen;"(")
(`cparen;")")
(`comma;",")
(`id;"H")
(`oparen;"(")
(`cparen;")")
(`cparen;")"))

peek: {x~tokens[;0][0]}
eat: {
  t: *tokens
  tokens :: 1 _ tokens
  if[~x~t[0]; \echo "no match for ",$x; : 0]
  t}

parse_call: {
  name:eat[`id]@1
  ae: parse_arg_exprs()
  .((`name;name);(`arg_exprs;ae))}

parse_arg_exprs: {
  arg_exprs:()
  eat[`oparen]
  if[~peek[`cparen]
    arg_exprs:,parse_call()
    while[peek[`comma]
      eat[`comma]
      arg_exprs,:,parse_call()]]
  eat[`cparen]
  arg_exprs}

parse_call()

And this is the result:

.((`name;"H";)
  (`arg_exprs
   ,.((`name;"H";)
      (`arg_exprs;();))
   ))

To me, it seems that the deeper nested call to parse_arg_exprs overwrites the "parent". Also, it either skips the last node or drops an earlier node, because the resulting tree is missing a leaf. Am I misunderstanding Kona scoping rules or is this a bug?

[edit] For completeness, here is what I expected to see as a correct result:

.((`name;"F";)
  (`arg_exprs
   .((`name;"G";)
      (`arg_exprs;();))
   .((`name;"H";)
      (`arg_exprs;();))
   ))
bakul commented 6 years ago

On Wed, 16 May 2018 11:01:28 -0700 Alex Shroyer notifications@github.com wrote:

I'm trying to parse an expression tree like f(g(), g()) using recursive des cent, but instead of getting a tree like this:

   f
 /   \
g     h

I get something more like this:

   h
 / 
h

This is the full source code:

tokens: ((`id;"F")
(`oparen;"(")
(`id;"G")
(`oparen;"(")
(`cparen;")")
(`comma;",")
(`id;"H")
(`oparen;"(")
(`cparen;")")
(`cparen;")"))

peek: {x~tokens[;0][0]}
eat: {
  t: *tokens
  tokens :: 1 _ tokens
  if[~x~t[0]; \echo "no match for ",$x; : 0]
  t}

parse_call: {
  name:eat[`id]@1
  ae: parse_arg_exprs()
  .((`name;name);(`arg_exprs;ae))}

parse_arg_exprs: {
  arg_exprs:()
  eat[`oparen]
  if[~peek[`cparen]
    arg_exprs:,parse_call()
    while[peek[`comma]
      eat[`comma]
      arg_exprs,:,parse_call()]]
  eat[`cparen]
  arg_exprs}

parse_call()

And this is the result:

Here's the full result:

.((`name;"H";)
  (`arg_exprs
   ,.((`name;"H";)
      (`arg_exprs;();))
   ))

To me, it seems that the deeper nested call to parse_arg_exprs overwrites t he "parent". Am I misunderstanding Kona scoping rules or is this a bug?

This is a bug.

hoosierEE commented 6 years ago

For what it's worth, this reminded me of a quirk (not a bug) of Python's scoping rules: https://stackoverflow.com/a/233835/2037637 . I wondered if a similar thing is going on in Kona.

bakul commented 6 years ago

On Wed, 16 May 2018 19:27:42 -0700 Alex Shroyer notifications@github.com wrote:

For what it's worth, this reminded me of a quirk (not a bug) of Python's scop ing rules link. I wondered if a similar thing is going on in Kona.

k3 (and kona by extension) variables have only global scope or local (function level) scope. Your code works as expected in k3.

tavmem commented 6 years ago

2 simple cases: This one fails:

$ cat tst1.k
sub:{a:2}
foo:{a:1; sub(); a}
foo()

$ rlwrap -n ./k tst1
kona      \ for help. \\ to exit.

This one works:

$ cat tst2.k
foo:{a:1; {a:2}; a}
foo()

$ rlwrap -n ./k tst2
kona      \ for help. \\ to exit.

1
hoosierEE commented 6 years ago

Kona console:

  {a:1;{a:3}();a}()
  {a:1;{b:3}();a}()
  {a:1;{b:3;}();a}()
1
  {a:1;{a:3;}();a}()
1

It seems like the inner function doesn't return anything (not even nil) and breaks the outer function.

hoosierEE commented 6 years ago

There are several oddities with assignment inside a void function:

  {a:2; {a:3}(); a}() / no output
  {a:2; {a:3;}(); a}() / returns 2
2
  f:{a:2; {a:3}(); a}() / no output, however f gets 2
  f
2
  g:{a:2; {a:3;}(); a}() / added semicolon
2
  g
2

@tavmem do you think this is the root cause of the bug I saw with the recursive functions?

tavmem commented 6 years ago

It could be. Hard to say for sure. It's probably best to try & fix the simple oddity cases and see if your recursive case becomes resolved.

tavmem commented 6 years ago

In a k2.8 console

$ rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
Evaluation. Not for commercial use. 
\ for help. \\ to exit.

  {a:2; {a:3}(); a}()
2
  {a:2; {a:3;}(); a}()
2
  f:{a:2; {a:3}(); a}()
  f
2
  g:{a:2; {a:3;}(); a}()
  g
2
tavmem commented 6 years ago

The first "oddity"

{a:2; {a:3}(); a}()     / no output

is a regression.

It first shows up in commit 0e6351523ba28def719fe44f1c994764ea57ac02 made on Aug 10, 2017 titled "suppress display on amend".

tavmem commented 6 years ago

First "oddity" is fixed.

The following cases are still problematic:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  f:{a:2; {a:3}(); a}()
2
  f
2
  g:{a:2; {a:3;}(); a}()
2
  g
2
tavmem commented 6 years ago

All 3 "oddities" are fixed. Your recursive case is not resolved.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  {a:2; {a:3}(); a}()
2
  f:{a:2; {a:3}(); a}()
  f
2
  g:{a:2; {a:3;}(); a}()
  g
2
tavmem commented 6 years ago

Here's the next attempt at a relevant "simplistic" case: In k2.8

  foo:{a:x; sub(a); a}
  sub:{:[x=0; a:x; foo(0)]}
  foo(2)
2

In kona:

  foo:{a:x; sub(a); a}
  sub:{:[x=0; a:x; foo(0)]}
  foo(2)
0
tavmem commented 6 years ago

This issue is not a regression. I went back to the code as of commit 611fdca7d77fd43fe1f30089cf18d8edef6b415b titled "Fix backslash commands with a leading space" made on June 23, 2011 and got similar results:

  foo:{a:x; sub(a); a}
{a:x; sub(a); a}
  sub:{:[x=0; a:x; foo(0)]}
{:[x=0; a:x; foo(0)]}
  foo(2)
0
bakul commented 6 years ago

I suspect the issue may be that the symbol table associated with the function definition is being modified, rather than each function instantiation getting its own copy of the symbol table. For each instantiation (when a function is called), a fresh dictionary needs to be created and pushed on the stack. When the function is exited, it is popped from the stack and destroyed.

tavmem commented 6 years ago

If the subroutine is redefined as a void function, the problem persists In k2.8:

  foo:{a:x; {:[x=0; a:x; foo(0)]}a; a}
  foo 2
2

In kona:

  foo:{a:x; {:[x=0; a:x; foo(0)]}a; a}
  foo 2
0
tavmem commented 6 years ago

@bakul: I agree with your diagnosis.

tavmem commented 6 years ago

However, if you make everything a void function, the problem disappears: in k2.8:

  {a:x; {:[x=0; a:x; _f(0)]}a; a}2
2

in kona:

  {a:x; {:[x=0; a:x; _f(0)]}a; a}2
2
tavmem commented 6 years ago

Just to document all variations: Using a hybrid approach (define foo, but use a void function at the top level), also works fine. in k2.8:

  foo:{a:x; {:[x=0; a:x; foo(0)]}a; a}
  {a:x; {:[x=0; a:x; foo(0)]}a; a}2
2

in kona:

  foo:{a:x; {:[x=0; a:x; foo(0)]}a; a}
  {a:x; {:[x=0; a:x; foo(0)]}a; a}2
2
tavmem commented 6 years ago

That begs the following question: Does the hybrid approach work on the original recursive case?

$ cat hybrid.k
tokens: ((`id;"F")
(`oparen;"(")
(`id;"G")
(`oparen;"(")
(`cparen;")")
(`comma;",")
(`id;"H")
(`oparen;"(")
(`cparen;")")
(`cparen;")"))

peek: {x~tokens[;0][0]}

eat: {
  t: *tokens
  tokens :: 1 _ tokens
  if[~x~t[0]; \echo "no match for ",$x; : 0]
  t}

parse_call: {
  name:eat[`id]@1
  ae: parse_arg_exprs()
  .((`name;name);(`arg_exprs;ae))}

parse_arg_exprs: {
  arg_exprs:()
  eat[`oparen]
  if[~peek[`cparen]
    arg_exprs:,parse_call()
    while[peek[`comma]
      eat[`comma]
      arg_exprs,:,parse_call()]]
  eat[`cparen]
  arg_exprs}

{name:eat[`id]@1; ae:parse_arg_exprs(); .((`name;name);(`arg_exprs;ae))}()

Yes ... in k2.8:

$ rlwrap -n ./k ~/k2.8/hybrid
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
Evaluation. Not for commercial use. 
\ for help. \\ to exit.

.((`name;"F";)
  (`arg_exprs
   (.((`name;"G";)
      (`arg_exprs;();))
    .((`name;"H";)
      (`arg_exprs;();)))
   ))

No ... in kona:

$ rlwrap -n ./k ~/k2.8/hybrid
kona      \ for help. \\ to exit.

type error

> 
tavmem commented 6 years ago

A better listing of simple cases that "Don't work" and that "Do work". (All the cases do work in k2.8)

Don't work:

  foo:{a:x; {:[x=0; a:x; foo(0)]}a; a}; foo(2)
0
  foo:{a:x; sub(a); a};  sub:{:[x=0; a:x; foo(0)]};  foo(2)
0

Do work:

  {a:x; {:[x=0; a:x; _f(0)]}a; a}2
2
  foo:{a:x; sub(a); a};  sub:{:[x=0; a:x; _f(0)]}; foo(2)
2
  foo:{a:x; {:[x=0; a:x; _f(0)]}a; a}; foo(2)
2
tavmem commented 5 years ago

It was interesting that this works in Kona: foo:{a:x; {:[x=0; a:x; _f(0)]}a; a}; foo(2) while this fails: foo:{a:x; {:[x=0; a:x; foo(0)]}a; a}; foo(2)

The reason is not what I expected. Kona seems to be interpreting _f(0) as a command to execute {:[x=0; a:x; _f(0)]}0 which is quite different than to execute {a:x; {:[x=0; a:x; _f(0)]}a; a}0

bakul commented 5 years ago

I think the real issue is early binding. The following snippet will behave differently in k3 and kona:

  f:{:[x>0;2*f[x-1];1]}
  g:f
  f:{0}
  g 4

In processing lambda expression on RHS, kona replaces the inner f with self reference. This changes the definition of g to a recursive one so redefining f later has no effect on g and g 4 yields 16. Not so in k3, where g 4 will yield 0. I alluded to this early binding issue in an old bug report (don’t recall which one now). To see the difference, compare the contents of .k for both interpreters after each definition. Do the same for mutually recursive functions.

tavmem commented 5 years ago

I agree that the "early binding" you describe is a problem that needs to be fixed (and probably should be added as a separate issue -- so we don't lose it yet again).

However, I think that the incorrect result in the simplified example that we have been discussing is somewhat different (but could also be described as a "binding" problem):

  f:{a:x; {:[x=0; a:x; f(0)]}a; a}
  .k
.,(`f;{a:x; {:[x=0; a:x; f(0)]}a; a};)

  f 2
0

Here, kona did not replace the inner f with self reference.

Consider the following (that gives the correct result):

  f:{a:x; {:[x=0; a:x; a:5]}a; a}
  f 2
2

The assignments to "a" in the lambda of the second example have no effect on the correct value of "a" reported by the outer "f".

In the first example, the two executions of "f" (inner & outer) should be completely independent. However, the value assigned to "a" by the inner f (in the lambda) gets "bound" to the value of "a" incorrectly reported by the outer "f".

tavmem commented 5 years ago

A diagnosis of the problem:

The function "f" is stored as a type 7. enum TYPE_SEVEN_MEMBERS {CONTEXT,DEPTH,CODE,LOCALS,PARAMS,CONJ,CACHE_WD,CACHE_TREE,TYPE_SEVEN_SIZE}; The initial occurrence looks like this, with the CACHE_TREE (a7) empty:

4 7 3   {a:x; {:[x=0; a:x; f(0)]}a; a}
        a0:        .k
        a1:        (nil)
        a2:        0x7fdb7b1ec898            1 -3 28   "a:x; {:[x=0; a:x; f(0)]}a; a"
        a3:        0x7fdb7b1ec818            1 5 1
   .,(`a;;)
        0x7fdb7b1ecc58            1 0 3
   (`a;;)
        0x7fdb7b1ec058            11 6 0
        0x7fdb7b1ec058            11 6 0
        0x7fdb7b1ecb58 0xc5d4c0  1 4 1   `a
        a4:        0x7fdb7b1ec858            1 5 1
   .,(`x;;)
        0x7fdb7b1ec998            1 0 3
   (`x;;)
        0x7fdb7b1ec058            11 6 0
        0x7fdb7b1ec058            11 6 0
        0x7fdb7b1ecdd8 0xc5a2b0  1 4 1   `x
        a5:
        a6:
        a7:

Without getting overly verbose, the CACHE_TREE goes through the following transitions: First, while processing the outer "f"

        a7:        0x7fdb7b1ec658            1 5 2
   .((`x;2;)
     (`a;;))
        a7:        0x7fdb7b1ec658            1 5 2
   .((`x;2;)
     (`a;2;))

Then, when processing the inner "f"

        a7:        0x7fdb7b1ec658            1 5 2
   .((`x;0;)
     (`a;2;))
        a7:        0x7fdb7b1ec658            1 5 2
   .((`x;0;)
     (`a;0;))

When getting back to the outer "f", the CACHE_TREE still contains the values from the inner "f".

bakul commented 5 years ago

Re: binding. Sorry for the confusion. Looks like you are on the right track in spite of that :-)

I modified the program to print before and after values and see interesting results!

Kona:

  foo:{a:x; `0:("x=",$x),"\n"; {:[x=0; a:x; foo(x-1)]}a; `0:"x=",($x),", a=",($a),"\n"; a}; foo(3)
x=3
x=2
x=1
x=0
x=0, a=0
type error
foo:{a:x; `0:("x=",$x),"\n"; {:[x=0; a:x; foo(x-1)]}a; `0:"x=",($x),", a=",($a),"\n"; a}; foo(3)
at execution instance 12 of "0:"

k3:

  foo:{a:x; `0:("x=",$x),"\n"; {:[x=0; a:x; foo(x-1)]}a; `0:"x=",($x),", a=",($a),"\n"; a}; foo(3)
x=3
x=2
x=1
x=0
x=0, a=0
x=1, a=1
x=2, a=2
x=3, a=3
3
tavmem commented 5 years ago

Yes ... thanks ... quite interesting. I wondered whether your example blew up on the "x=1" or on the "a=1". It's the "x=1".

k2.8

  f:{a:x; `0:("x=",$x),"\n"; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; `0:("a=",$a),"\n"; a}
  f 3
x=3
x=2
x=1
x=0
x=0
a=0
x=1
a=1
x=2
a=2
x=3
a=3
3

kona

  f:{a:x; `0:("x=",$x),"\n"; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; `0:("a=",$a),"\n"; a}
  f 3
x=3
x=2
x=1
x=0
x=0
a=0
type error
tavmem commented 5 years ago

Also interesting ... kona:

  f:{a:x; `0:("x=",$x),"\n"; {:[x=0; a:x; f(x-1)]}a; `0:("a=",$a),"\n"; a}
  f 3
x=3
x=2
x=1
x=0
a=0
a=0
a=0
a=0
0
tavmem commented 5 years ago

Simply in the interest of documentation, in kona:

  f:{a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}
  f 1
x=0
type error

The type error occurs in the function vf_ex contained in file ~/kona/src/kx.c

  if(2==k && a && b){ fnc=DT[(L)q].text;
    if(fnci<127){fncp[fnci]=q; fnci++;}
    if(cls && a->t==6) z=((K(*)(K,K))DT[(L)q].func)(cls,b);
    else z=((K(*)(K,K))DT[(L)q].func)(a,b);                 // WHEN EXECUTING THIS LINE
    GC; }
  //? (+).1 -> err ; {[a;b]a+b} 1 -> err
  if(2==k && !a){VE; GC;} //Reachable? Projection?
tavmem commented 5 years ago

more FYI ... the "func" called when executing the problematic line above is _0d() in ~/kona/src/0.c:

K _0d(K a,K b) {      //lfop
  I t=a->t;
  if(4==t || 3==ABS(t))R _0d_write(a,b);
  if(!t)R _0d_read(a,b);
  R TE; }                                 // THE TYPE ERROR IS PROBABLY THROWN HERE
tavmem commented 5 years ago

No ... "a" is type 4, which is OK. _0d_write()" is called.

Z K _0d_write(K a,K b) {     //assumes a->t in {3,-3,4}
  I t=b->t, n=b->n; K k;
  P(!ok_0dw(b),TE)                // THE TYPE ERROR IS THROWN HERE
  ...
tavmem commented 5 years ago

In summary:

  f:{a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}
  f 1
x=0
type error

fails with "type error" in ok_0dw() because, ultimately, x has no value at all (becomes nil).

  f:{a:x; {:[x=0; a:x; f(x-1)]}a; `0:("a=",$a),"\n"; a}
  f 1
a=0
a=0
0

does not fail with "type error", because "a" retains the value of 0. But that is the wrong result.

Seems that the values of both "x" and "a" need to be "pushed" onto the stack (at the appropriate place) and then restored by "popping" off the stack (at the appropriate place) to properly handle the recursion. The number of occurrences of the underlying values also may need to be adjusted.

It seems that the problem is now to find the appropriate places in the code to make these changes.

tavmem commented 5 years ago

To be more specific ... It's not the values that need to be pushed on / popped off the stack ... which is good since a value could be any K-structure, which could be quite large and complex. We need to push a pointer to the current values (K-structures) of "x" and "a", and increment the reference counts of each K-structure. Then at popping time, decrement the reference counts of the values currently associated with "x" and "a", pop the pointers from the stack, associate them with "x" and "a", and possibly adjust reference counts further.

Oh ... and of course ... do this in a way that is general enough to handle any sort of recursion.

bakul commented 5 years ago

One way to do this: each function instance will have its own symbol table (dictionary). Just prior to calling a function you create a new dict, and add x, y etc. with their argument values, push this dict on the dict stack and then call the function. On return from the function you pop & delete this dictionary. variable names are always looked up in the dict on top of the dict stack. Failing that you look in the global .k dict.

bakul commented 5 years ago

Note that there is one easy optimization available but only if you "compile" the function body. IIRC this is not being done today. The idea is to replace symbol table lookup with indexing. Thus {x+y} would compile to (using a symbolic notation) something like push $1; push $2; add, if your virtual machine is stack oriented. If you have code like {.k.a:12}, you have to add symbol a, initialize to a null, to the .k dictionary so that the compile code can replace looking up a in the .k dict with an index into .k's value vector.

This would speed up a lot of things and reduce allocation. But this is a somewhat bigger undertaking!

tavmem commented 5 years ago

Thanks for both ideas. I like the "compile" optimization, in that kona seems to go through the process of converting a function into byte code more often than appears necessary. However, kona also has the "early binding" problem you pointed out (documented in #537), and the "compile" optimization might exacerbate "early binding".

The approach that seems easiest to implement would be to augment the existing kona process. Each function already has its own symbol table (dictionary). It is the CACHE_TREE, which is item 7 of a type-7 structure used for a function. However, the CACHE_TREE is not processed properly for recursion.

Based on the examples we have examined so far, currently, any local variables (like "a") and parameters (like "x") are simply updated with new values at each new level of the recursion. When an instance of the function completes, it appears that local variables (like "a") are left with the last update, while parameters (like "x") are cleaned up and eliminated from the CACHE_TREE. This would explain the "type error" for "x" and the incorrect result value from "a".

It seems that adding a "push and pop" process to the CACHE_TREE for each execution instance of the function might solve the recursion problem.

tavmem commented 5 years ago

Some observations on the structure of kona and one possibility for fixing recursion:

kona repeats a conversion/execution cycle. It converts a command string into word-code, and then executes the word-code. The conversion part begins with a dictionary, which obviously, affects the execution part. Modifying the dictionary that begins each cycle with a "push" process (to retain the old dictionary) may be a key part of fixing recursion. (It's not yet clear where an associated "pop" might be.)

The command f:{a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}; f 1 goes through 5 conversion/execurion cycles.

The first conversion/execution

wd_(S s, int n, K*dict, K func)   
s:   f:{a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}; f 1
n:   58
dict:  6 6 0           // a null (with ref count = 6)
func:  0               // empty

The second conversion/execution

wd_(S s, int n, K*dict, K func)   
s:   a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a   
n:   49
dict:  1 5 2  (a dictionary with 2 members and ref count=1)
   .((`x;1;)
     (`a;;))
func:   1 7 3   (type 7-3, a brace function with ref count=1)   
   {a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}

The third conversion/execution

wd_(S s, int n, K*dict, K func)   
s:   :[x=0; a:x; f(x-1)]   
n:   19
dict:   1 5 2   (a dictionary with 2 members and ref count = 1)
   .((`x;1;)
     (`a;;))
func:   1 7 3   (type 7-3, a brace function with ref count=1)   
   {:[x=0; a:x; f(x-1)]}

The fourth conversion/execution

wd_(S s, int n, K*dict, K func)   
s:   a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a   
n:   49
dict:   1 5 2   (a dictionary with 2 members and ref count=1)
   .((`x;0;)
     (`a;1;))
func:   1 7 3  (type 7-3, a brace function with ref count=1   
   {a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}

The fifth conversion/execution

wd_(S s, int n, K*dict, K func)
s:   :[x=0; a:x; f(x-1)]   
n:   19
dict:   1 5 2   (a dictionary with 2 members and ref count=1)
   .((`x;0;)
     (`a;;))
func:   1 7 3   (type 7-3, a brace function with ref count=1)
   {:[x=0; a:x; f(x-1)]}
tavmem commented 5 years ago

A possible place to put the push/pop process may be in ~/kona/src/kx.c

fw=kV(f)[CACHE_WD]; I t=0;
if(!fw || (t=(UI)kS(kK(fw)[CODE])[0]>DT_SIZE || (UI)kS(kK(fw)[CODE])[1]>DT_SIZE) ) {
  if(t) cd(kV(f)[CACHE_WD]);
  K fc = kclone(f); //clone the function to pass for _f
  cd(kV(fc)[CONJ]); kV(fc)[CONJ]=0;
  kV(fc)[DEPTH]++;
  fw=wd_(kC(o),o->n,&tree,fc);             // push before this statement, pop after it
  kV(f)[CACHE_WD]=fw; cd(fc); }
tavmem commented 5 years ago

Probably better ... (again this code is in ~/kona/src/kx.c) Push a copy of CACHE_WD before the CONVERSION (or at latest, before the UPDATE) Pop a copy of CACHE_WD sometime after the EXECUTION (But note that the CONVERSION & UPDATE are only done conditionally.)

fw=kV(f)[CACHE_WD]; I t=0;
if(!fw || (t=(UI)kS(kK(fw)[CODE])[0]>DT_SIZE || (UI)kS(kK(fw)[CODE])[1]>DT_SIZE) ) {
  if(t) cd(kV(f)[CACHE_WD]);
  K fc = kclone(f); //clone the function to pass for _f
  cd(kV(fc)[CONJ]); kV(fc)[CONJ]=0;
  kV(fc)[DEPTH]++; 
  fw=wd_(kC(o),o->n,&tree,fc);        // the CONVERSION occurs here
  kV(f)[CACHE_WD]=fw; cd(fc); }       // the UPDATE of CACHE_WD occurs here

#ifdef DEBUG
if(stk1>5) {cd(g); kerr("stack"); R _n();}
#else
if(stk1>1e3) {cd(g); kerr("stack"); R _n();}
#endif
ci(fw); stk1++; 
z=ex(fw);                             // the EXECUTION occurs here
stk1--;
DO(p->n,e=EVP(DI(tree,i)); cd(*e); *e=0; )
stk--;
tavmem commented 5 years ago

The push/pop mechanism: CACHE_WD is a dictionary. Taking advantage of this may allow a simple implementation of push/pop. In k:

  dict: . ((`a;1);(`x;2))
  dict.a
1
  dict.x
2

  dict: . ((`a;3);(`x;4)), . dict     // PUSH
  dict.a
3
  dict.x
4

  dict: . ( 2 _ . dict)               // POP
  dict.a
1
  dict.x
2

Of course, this needs to be translated into C.

tavmem commented 5 years ago

Well ... that didn't work ... back to the drawing board.

First of all, the dictionary passed to wd_ is not "CACHE_WD", It is "tree".

Making the following 3 code changes in ~/kona/src/kx.c

@@ -608,11 +648,12 @@ K vf_ex(V q, K g)
       DO(p->n,e=EVP(DI(tree,i)); cd(*e); *e=0; if(r && i<r->n) *e=ci(kK(r)[i]); if(!*e && j<g->n) *e=ci(kK(g)[j++])) //merge in: CONJ with function args

       fw=kV(f)[CACHE_WD]; I t=0;
+      K tc;
       if(!fw || (t=(UI)kS(kK(fw)[CODE])[0]>DT_SIZE || (UI)kS(kK(fw)[CODE])[1]>DT_SIZE) ) {
         if(t) cd(kV(f)[CACHE_WD]);
         K fc = kclone(f); //clone the function to pass for _f
         cd(kV(fc)[CONJ]); kV(fc)[CONJ]=0;
-        kV(fc)[DEPTH]++; fw=wd_(kC(o),o->n,&tree,fc); kV(f)[CACHE_WD]=fw; cd(fc); }
+        kV(fc)[DEPTH]++; fw=wd_(kC(o),o->n,&tree,fc); O("Atree:\n");show(tree); tc=kclone(tree); kV(f)[CACHE_WD]=fw; cd(fc); }

       #ifdef DEBUG
       if(stk1>5) {cd(g); kerr("stack"); R _n();}
@@ -620,6 +661,7 @@ K vf_ex(V q, K g)
       if(stk1>1e3) {cd(g); kerr("stack"); R _n();}
       #endif
       ci(fw); stk1++; z=ex(fw); stk1--;
+      O("Btree:\n");show(tree); cd(tree); tree=tc; O("Ctree:\n");show(tree);
       DO(p->n,e=EVP(DI(tree,i)); cd(*e); *e=0; )
       stk--;
     )

kona gives the following result:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  f:{a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}
  f 1
Atree:
.((`x;1;)
  (`a;;))
Atree:
.((`x;1;)
  (`a;;))
Atree:
.((`x;0;)
  (`a;1;))
Atree:
.((`x;0;)
  (`a;;))
Btree:
.((`x;0;)
  (`a;0;))
Ctree:
.((`x;0;)
  (`a;;))
x=0
Btree:
.((`x;0;)
  (`a;0;))
Ctree:
.((`x;0;)
  (`a;1;))
Btree:
.((`x;1;)
  (`a;;))
Ctree:
.((`x;1;)
  (`a;;))
Btree:
()
Ctree:
.((`x;1;)
  (`a;;))
type error
>

The dictionary (i.e., tree) passed to the conversion/execution cycle gets restored after each cycle (as intended), but the "type error" persists.

tavmem commented 5 years ago

A further thought on the prior attempt:
Even if it had succeeded it would have been an inadequate solution.

The current (simplified) example:

 f:{a:x; {:[x=0; a:x; f(x-1)]}a; `0:("x=",$x),"\n"; a}; f 1

inbeds the recursion within a { } character function. The code changes to ~/kona/src/kx.c were made in a section that specifically handles this case:

   CS(3, //Executing a {} character function such as {1+1}, {x+y+z-1}, or {[a;b] a+b}
   ...

However, recursion may occur without a { } function, and in fact, the original problem described in this issue does not inbed the recursion in a { } function.

A more general solution is required.

My current guess is that the changes need to be made within wd(S s, int n, K*dict, K func) in src/p.c. wd( ) is called in the section handling a { } function, and would also be called anywhere else that sets up a conversion/execution cycle. The current wd( ) should set up an execution that is independent of any prior execution that recursively called the current wd( ).

tavmem commented 5 years ago

wd_( ) is recursive ... so, putting the changes there doesn't work. For example, just creating the function f by executing

f:{a:x; {:[x=0; a:x; f(x-1)]}a; a}

results in 15 separate executions of wd_(S s, int n, K*dict, K func)

  s: f:{a:x; {:[x=0; a:x; f(x-1)]}a; a}   n: 34
  dict: 6 6 0   
  func: 0

  s: a:x; {:[x=0; a:x; f(x-1)]}a; a}      n: 30
  dict: 1 5 0   .()
  func: 0

  s: :[x=0; a:x; f(x-1)]}a;a       n: 19
  dict: 1 5 0   .()
  func: 0

  s: x=0; a:x; f(x-1)]       n: 16
  dict: 1 5 0   .()
  func: 0

  s: x-1)                           n: 3
  dict: 1 5 3   
.((`x;;)
  (`a;;)
  (`f;;))
  func: 0

  s: :[x=0; a:x; f(x-1)]}a;a        n: 19
  dict: 1 5 0   .()
  func: 1 7 3   {:[x=0; a:x; f(x-1)]}

  s: x=0; a:x; f(x-1)]              n: 16
  dict: 1 5 0   .()
  func: 1 7 3   {:[x=0; a:x; f(x-1)]}

  s: x-1)                           n: 3
  dict: 1 5 1   
.,(`a;;)
  func: 1 7 3   {:[x=0; a:x; f(x-1)]}

  s: a:x; {:[x=0; a:x; f(x-1)]}a; a}   n: 30
  dict: 1 5 0   .()
  func: 1 7 3   {a:x; {:[x=0; a:x; f(x-1)]}a; a}

  s: :[x=0; a:x; f(x-1)]}a;a        n: 19
  dict: 1 5 0   .()
  func: 0

  s: x=0; a:x; f(x-1)]              n: 16
  dict: 1 5 0   .()
  func: 0

  s: x-1)                           n: 3
  dict: 1 5 3   
.((`x;;)
  (`a;;)
  (`f;;))
  func: 0

  s: :[x=0; a:x; f(x-1)]}a;a        n: 19
  dict: 1 5 0   .()
  func: 1 7 3   {:[x=0; a:x; f(x-1)]}

  s: x=0; a:x; f(x-1)]              n: 16
  dict: 1 5 0   .()
  func: 1 7 3   {:[x=0; a:x; f(x-1)]}

  s: x-1)                           n: 3
  dict: 1 5 1   
.,(`a;;)
  func: 1 7 3   {:[x=0; a:x; f(x-1)]}

which, at first look, seems excessive.

Anyway, I'm returning to a focus on where wd_( ) is initially called, everywhere else in the code base.

tavmem commented 5 years ago

Eliminating the recursive calls to wd( ) through capture( ) that are in src/p.c, there appear to be 2 places where wd( ) is called directly, and 5 places where it is called through wd( ).

src/kx.c:        kV(fc)[DEPTH]++; fw=wd_(kC(o),o->n,&tree,fc); kV(f)[CACHE_WD]=fw; cd(fc); }
src/v.c:    z=ex(wd_(kC(b),bn,&a,0)); }
src/c.c:    show(k=ex(wd(t,n)));
src/c.c:      show(k=ex(wd(w,l)));
src/k.c:  if(newS) {K r=ex(wd(newS,strlen(newS))); free(newS); R r;}
src/k.c:  else  R ex(wd(s,n));
src/kc.c:  RTIME(d,k=ex(wd(*a,*n)))
tavmem commented 5 years ago

This is also interesting: Suppose you use the "if" form of the conditional. Then the code in the section

CS(3, //Executing a {} character function such as {1+1}, {x+y+z-1}, or {[a;b] a+b}
... )

of src/kx.c is still executed. Using the same 3 modifications to src/kx.c that are listed 4 comments earlier, you get:

  f:{a:x; if[x>0; f(a-1)]; `0:("x=",$x),"\n"; a}
  f 1
Atree:
.((`x;1;)
  (`a;;))
Atree:
.((`x;0;)
  (`a;1;))
x=0
Btree:
.((`x;0;)
  (`a;0;))
Ctree:
.((`x;0;)
  (`a;1;))
Btree:
()
Ctree:
.((`x;1;)
  (`a;;))
type error

What this may imply is that fixing the code in this section of src/kx.c may be sufficient to fix the recursion problem, and we may not need to touch the other 6 instances listed in the previous comment.

bakul commented 5 years ago

My gut feeling is that a separate parsing step would simplify things. Such a parsing step would not require a dictionary. For example:

// input expression
// => parsed representation
a+2
=>      (+;`a;2)
+[a;2]
=>      (+;`a;2)
a:b+c
=>      (:;`a;(+;`b;`c))
a:x; if[x>0; f(a-1)]; a
=>      (";";  (:;`a;`x);  (`if; (>;`x;0); (`f;(-;`a;1)));  `a)

The parsed rep. can be directly interpreted and that is where you need a global and local dict. No idea how much work this change would be.

tavmem commented 5 years ago

I agree with your gut feeling. I think that such a change would be a huge amount of work, but ultimately should be done. Before attempting it, I plan to continue exploring the possibility of a workaround.

tavmem commented 5 years ago

What I've found so far ... and what looks like it might be a possible workaround: Using the following test case:

f:{a:x; if[x>100; f(a-12)]; a}
f 111

The first statement is parsed and executed. The parse creates a a type 7-0 object for execution:

0x7fb0d6e43018   1 7 0
a0:   .k
a1:
a2:   0x7fb0d6e6e6d8   1 -4 4   `"@��ְ^?" 0x3c `"�0�ְ^?" (nil)
.2a[0]: 0x7fb0d6e6e760 0x7fb0d6e6e040   0x7fb0d6e6e058   12 6 0
.2a[1]: 0x3c
.2a[2]: 0x7fb0d6e6e920 0x7fb0d6e43080   0x7fb0d6e43098   1 7 3   {a:x; if[x>100; f(a-12)]; a}
a3:   0x7fb0d6e6e618   1 5 1   .,(`;{a:x; if[x>100; f(a-12)]; a};)
0x7fb0d6e6e918   1 0 3   [`;{a:x; if[x>100; f(a-12)]; a};]
0x7fb0d6e6e058   12 6 0
0x7fb0d6e43098   1 7 3   {a:x; if[x>100; f(a-12)]; a}
0x7fb0d6e6e958 0x23df2a0  1 4 1   `
a4:   0x7fb0d6e6e658   1 5 0   .[]
a5:
a6:
a7:

The execution creates global "f' which is a type 7-3 object (a brace function):

0x7fb0d6e43098   3 7 3   {a:x; if[x>100; f(a-12)]; a}
a0:   .k
a1:
a2:   0x7fb0d6e6e898   1 -3 26   "a:x; if[x>100; f(a-12)]; a"
a3:   0x7fb0d6e6e818   1 5 1   .,(`a;;)
0x7fb0d6e6eb18   1 0 3   [`a;;]
0x7fb0d6e6e058   10 6 0
0x7fb0d6e6e058   10 6 0
0x7fb0d6e6ec18 0x23e14c0  1 4 1   `a
a4:   0x7fb0d6e6e858   1 5 1   .,(`x;;)
0x7fb0d6e6e998   1 0 3   [`x;;]
0x7fb0d6e6e058   10 6 0
0x7fb0d6e6e058   10 6 0
0x7fb0d6e6eb98 0x23df2c0  1 4 1   `x
a5:
a6:
a7:

The second statement causes "f" to be parsed and executed twice. The first parse creates a type 7-0 object which is to be executed with an argument of 111:

0x7fb0d6e43318   1 7 0
a0:   .k
a1:
a2:   0x7fb0d6e43218   1 -4 9   `"@��ְ^?" 0x3c `"���ְ^?" 0x1 `"�2�ְ^?" 0x1 `"@��ְ^?" (nil)  (nil)
.2a[0]: 0x7fb0d6e6eca0 0x7fb0d6e6e040   0x7fb0d6e6e058   18 6 0
.2a[1]: 0x3c
.2a[2]: 0x7fb0d6e6eea0 0x7fb0d6e6e6c0   0x7fb0d6e6e6d8   4 1 1   111
.2a[3]: 0x1
.2a[4]: 0x7fb0d6e6eda0 0x7fb0d6e43280   0x7fb0d6e43298   1 7 5
.2a[5]: 0x1
.2a[6]: 0x7fb0d6e6eca0 0x7fb0d6e6e040   0x7fb0d6e6e058   18 6 0
a3:   0x7fb0d6e6eb58   1 5 1   .,(`;;)
0x7fb0d6e6ed98   1 0 3   [`;;]
0x7fb0d6e6e058   18 6 0
0x7fb0d6e43298   1 7 5
0x7fb0d6e6ed58 0x23df2a0  1 4 1   `
a4:   0x7fb0d6e6ed18   1 5 0   .[]
a5:
a6:
a7:

That type 7-0 object contains a type 7-5 object (an if[] statement) containing "f". "f" is again parsed creating a type 7-0 object which is to be executed with an argument of 99:

 0x7fb0d6e43198            1 7 0
a0:   .k
a1:
a2:   0x7fb0d6e43498   1 -4 9   `"���ְ^?" 0x3c `"���ְ^?" 0x1 `"�5�ְ^?" 0x1 `"���ְ^?" (nil)  (nil)
.2a[0]: 0x7fb0d6e6eca0 0x7fb0d6e6e6c0   0x7fb0d6e6e6d8   4 1 1   111
.2a[1]: 0x3c
.2a[2]: 0x7fb0d6e6eea0 0x7fb0d6e6e8c0   0x7fb0d6e6e8d8   3 1 1   99
.2a[3]: 0x1
.2a[4]: 0x7fb0d6e41420 0x7fb0d6e43580   0x7fb0d6e43598   1 7 5
.2a[5]: 0x1
.2a[6]: 0x7fb0d6e6eca0 0x7fb0d6e6e6c0   0x7fb0d6e6e6d8   4 1 1   111
a3:   0x7fb0d6e41058   1 5 1   .,(`;;)
0x7fb0d6e41418   1 0 3   [`;;]
0x7fb0d6e6e058   21 6 0
0x7fb0d6e43598   1 7 5
0x7fb0d6e413d8 0x23df2a0  1 4 1   `
a4:   0x7fb0d6e41358   1 5 0   .[]
a5:
a6:
a7:

Note that there are 7 items in the [CODE] section of each type 7-0 object (labelled .2a[0] through .2a[6]. The problem is that items .2a[0] .2a[2] and .2a[6] have the same primary addresses in the result of each parse step. The 2nd parse result completes execution first, and that execution overwrites the corresponding items in the first parse result (which has not yet completed execution).

The possible workaround is to attempt to adjust the parse process so that each parse iteration of a brace function gets unique primary addresses for its [CODE] section items that refer to other objects.

tavmem commented 5 years ago

Based on the analysis above, it seems the code that needs to be changed in src/p.c is

else if((q=DE(*dict,u))) z=EVP(q); //If func has its local, use it

On the face of it, it seems we need to differentiate between items .2a[0] and .2a[6] which refer to a func local, and .2a[2] which refers to a func parameter (but is also a dict entry).

But on second thought, no. When execution returns to the first parse cycle, it must revert to the parameters of the first parse cycle.

tavmem commented 5 years ago
K* EVP(K e){R EIA(e,1);}    //dictionary entry's value-pointer address (K*)

The "problem" is not with EVP( ). The "problem" is with dict. In src/kx.c:

fw=wd_(kC(o),o->n,&tree,fc);

the address &tree is passed in the call to

K wd_(S s, int n, K*dict, K func)

so, the contents of the same tree is used in each recursive call. It now seems that each iteration of the recursion needs its own separate copy of tree as dict, which it can modify without affecting prior copies.

bakul commented 5 years ago

This was my guess. See my https://github.com/kevinlawler/kona/issues/521#issuecomment-391137893 comment. Note that all you have to do is increase ref count. If invoked function changes any argument or adds a local variable, that’s when a copy-on-write will be done. Analogous to the following example. Since most functions won’t modify any args or add a new local, this is a win.

 d:.+(`a`b;1 2)
  d
.((`a;1;)
  (`b;2;))
  e:d
  e
.((`a;1;)
  (`b;2;))
  e.a:4
  e
.((`a;4;)
  (`b;2;))
  d
.((`a;1;)
  (`b;2;))