kevinlawler / kona

Open-source implementation of the K programming language
ISC License

Unnecessary workaround #624

Closed tavmem closed 1 year ago

tavmem commented 2 years ago

This bug was identified by Douglas Menella.

Consider the following files:

$cat quote.csv
she,asked,"""why not now, steve?"""
"I",said,"""well judy, because """"I"""" am busy.
har har"""%

$cat dg4
qm:{s|((+\s:(x=y))!\:2)}
sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l)}
snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}
ncr:{x[&~("\r"=x)&1!"\n"=x]}
csv:{{uq["\""]x}''snq["\"";","]'snq["\"";"\n"]x}

$cat dg5
qm:{s|((+\s:(x=y))!\:2)}
sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l)}
snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}
ncr:{x[&~("\r"=x)&1!"\n"=x]}
csv:{uq["\""]''snq["\"";","]'snq["\"";"\n"]x}

Note: the only difference between dg4 and dg5 is in the definition of csv, where {uq["\""]x} is replaced by uq["\""].

In Kona, dg4 works, but dg5 fails (in that it produces no results). In fact dg4 was devised by Douglas as a "workaround" for the failure in dg5.

$rlwrap -n ./k
kona      \ for help. \\ to exit.

  txt: 6:`quote.csv
  .:' 0: `dg4
  csv[ncr txt]
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")

  .:' 0: `dg5
  csv[ncr txt]

In k2.8 and k3.2, both dg4 and dg5 produce the same results. The "workaround" in dg4 is not necessary there.

$rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems 
\ for help. \\ to exit.

  txt: 6:`quote.csv
  .:' 0: `dg4
(;;;;;)
  csv[ncr txt]
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")

  .:' 0: `dg5
(;;;;;)
  csv[ncr txt]
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")
tavmem commented 1 year ago

As a first step toward simplifying and (hopefully) identifying the problem, note that in both k2.8 and kona:

  ^/txt = ncr txt
1.0

So we can eliminate ncr. We get the same result in kona:

$ cat dg4a
qm:{s|((+\s:(x=y))!\:2)}
sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l)}
snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}
csv:{{uq["\""]x}''snq["\"";","]'snq["\"";"\n"]x}

$ cat dg5a
qm:{s|((+\s:(x=y))!\:2)}
sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l)}
snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}
csv:{uq["\""]''snq["\"";","]'snq["\"";"\n"]x}

  txt: 6:`quote.csv

  .:' 0: `dg4a
  csv txt
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")

  .:' 0: `dg5a
  csv txt
tavmem commented 1 year ago

To simplify further

  :t: snq["\"";","]'snq["\"";"\n"] txt
(("she"
  "asked"
  "\"\"\"why not now, steve?\"\"\"")
 ("\"I\""
  "said"
  "\"\"\"well judy, because \"\"\"\"I\"\"\"\" am busy.\nhar har\"\"\"%")
 ,"")

then in k2.8

  uq["\""]'' t
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")

but in kona

  uq["\""]'' t
tavmem commented 1 year ago

To simplify further, in k2.8

  uq:{ l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y; c:|/bs; (y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l) }

  v: (("aa"; "\"bb\""); ("cc"; "\"dd\""))

  uq["\""]'' v
(("aa"
  "bb")
 ("cc"
  "dd"))

in kona

  uq:{ l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y; c:|/bs; (y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l) }

  v: (("aa"; "\"bb\""); ("cc"; "\"dd\""))

  uq["\""]'' v
tavmem commented 1 year ago

This demonstrates that there is something particularly problematic in kona with the uq function. The problem is not generic to the form {func x}''arg. In both kona and k2.8:

  a:((1 2;3 4);(5 6;7 8))
  a
((1 2
  3 4)
 (5 6
  7 8))

  {+/x}''a
(3 7
 11 15)
tavmem commented 1 year ago

Demonstration of where uq fails

  uq:{ l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y; c:|/bs; (y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l) }

  v: (("aa"; "\"bb\""); ("cc"; "\"dd\""))

  uq["\""] v[0][0]
"aa"
  uq["\""] v[0][1]
"bb"
  uq["\""] v[1][0]
"cc"
  uq["\""] v[1][1]
"dd"

  uq["\""]' v[0]
("aa"
 "bb")
  uq["\""]' v[1]
("cc"
 "dd")

  uq["\""]'' v
tavmem commented 1 year ago

Adding a single statement to the each2 and eachpair2 functions in src/kx.c,

$ git diff
diff --git a/src/kx.c b/src/kx.c

 Z K each2(K a, V *p, K b)
-{ I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;
+{ O("bgn each2\n"); I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;

 Z K eachpair2(K a, V *p, K b)  //2==k necessary?
-{ V *o=p-1; K (*f)(K,K), k=0;
+{ O("bgn eachpair2\n"); V *o=p-1; K (*f)(K,K), k=0;

we find

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  uq:{ l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y; c:|/bs; (y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l) }
  v: (("aa"; "\"bb\""); ("cc"; "\"dd\""))

  uq["\""] v[0][0]
bgn eachpair2
"aa"

  uq["\""]' v[0]
bgn each2
bgn eachpair2
bgn eachpair2
("aa"
 "bb")

  uq["\""]'' v
bgn each2
bgn each2

and it looks like

tavmem commented 1 year ago

On the other hand ... in the simple case that does work ...

  a:((1 2;3 4);(5 6;7 8))
  {+/x}''a
bgn each2
bgn each2
bgn each2
(3 7
 11 15)

eachpair2 is never called.

But, since it's not needed ... that may be why it works.

tavmem commented 1 year ago

More clues ... adding 3 statements

$ git diff
diff --git a/src/kx.c b/src/kx.c

 Z K each2(K a, V *p, K b)
-{ I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;
+{ O("\neach2  "); I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;

 Z K eachpair2(K a, V *p, K b)  //2==k necessary?
-{ V *o=p-1; K (*f)(K,K), k=0;
+{ O("\neachpair2  "); V *o=p-1; K (*f)(K,K), k=0;

 K dv_ex(K a, V *p, K b)
-{ if(!p || !*p) R 0; //TODO: ???
+{ O("dv_ex  "); if(!p || !*p) R 0; //TODO: ???

we get

  uq:{ l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y; c:|/bs; (y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l) }
  v: (("aa"; "\"bb\""); ("cc"; "\"dd\""))

  uq["\""]' v[0]
dv_ex  
each2  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  
eachpair2  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  
eachpair2  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  dv_ex  ("aa"
 "bb")

  uq["\""]'' v
dv_ex  
each2  dv_ex  
each2  dv_ex

in the case of v[0]

in the case of v

tavmem commented 1 year ago

Identifying a simple function that fails: In k2.8:

  uq: { x=y }
  v: (("aa"; "\"bb\""); ("cc"; "\"dd\""))
  uq["\""]'' v
((0 0
  1 0 0 1)
 (0 0
  1 0 0 1))

In kona ... no result:

  uq: { x=y }
  v: (("aa"; "\"bb\""); ("cc"; "\"dd\""))
  uq["\""]'' v
tavmem commented 1 year ago

A variation using numbers: in k2.8:

  uq: { x=y }
  v: ((1 2; 9 3 4 9); (5 6; 9 7 8 9))
  uq[9]'' v
((0 0
  1 0 0 1)
 (0 0
  1 0 0 1))

In kona (no result):

  uq: { x=y }
  v: ((1 2; 9 3 4 9); (5 6; 9 7 8 9))
  uq[9]'' v
tavmem commented 1 year ago

Simplest yet: in k2.8:

  uq: { x=y }
  v: 9 3
  uq[9]'' v
1 0

in kona (no result):

  uq: { x=y }
  v: 9 3
  uq[9]'' v

However in both k2.8 and kona:

  uq: { x=9 }
  uq'' v
1 0
tavmem commented 1 year ago

Sorry ... there is yet an even simpler case: in kona:

  uq: { x=9 }
  v: ,3
  uq'' v
,0

  uq: { x=y }
  v: ,3
  uq[9]'' v

Then make the following code changes to display the inputs and result of dv_ex:

 Z K each2(K a, V *p, K b)
-{ I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;
+{ O("each2\n"); I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;

 K dv_ex(K a, V *p, K b)
-{ if(!p || !*p) R 0; //TODO: ???
+{ O("  dv_ex   a:"); sd(a); O("          b:"); sd(b); O("          *p:    %p\n",*p); if(!p || !*p) R 0; //TODO: ???
+  if(*p>(V)DT_SIZE) {O("            "); sd(*(K*)*p);}
   ...
+  O("          r:"); sd(tmp);
   R tmp; }

Then in the case that works:

tavmem commented 1 year ago

However, note that in kona, this works:

  uq: { x=y }
  v: ,3 
  uq[9]' v
  dv_ex   a:     
          b:     0x7fefdb141c00 0x7fefdb141c18            2-6 -1 1   ,3
          *p:    0x9
each2
  dv_ex   a:     
          b:     0x7fefdb109080 0x7fefdb109098            1-6 1 1   3
          *p:    0x7ffc1be79c08
                 0x7fefdb10a000 0x7fefdb10a018            2-7 7 3   { x=y }[9;]
  dv_ex   a:     0x7fefdb141cc0 0x7fefdb141cd8            4-6 1 1   9
          b:     0x7fefdb109080 0x7fefdb109098            4-6 1 1   3
          *p:    0x2a
          r:     0x7fefdb109280 0x7fefdb109298            1-6 1 1   0
          r:     0x7fefdb109280 0x7fefdb109298            1-6 1 1   0
,0

So, we know what that "undecipherable" function should have been. This means that we know what went wrong. Now, we need to figure out why it went wrong, and how to fix it.

tavmem commented 1 year ago

We never did check the inputs to each2. In other words, did the problem first appear in a call to dv_ex or in a call to each2? Changing the first line of code in each2 displays the inputs:

 Z K each2(K a, V *p, K b)
-{ I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;
+{ O("  each2   a:"); sd(a); O("          b:"); sd(b); O("          *p:    %p\n",*p); I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;

-{ if(!p || !*p) R 0; //TODO: ???
+{ O("  dv_ex   a:"); sd(a); O("          b:"); sd(b); O("          *p:    %p\n",*p); if(!p || !*p) R 0; //TODO: ???
+  if(*p>(V)DT_SIZE) {O("            "); sd(*(K*)*p);}
   ...
+  O("          r:"); sd(tmp);
   R tmp; }

and we get

  uq: { x=y }
  v: ,3

  uq[9]'' v
  dv_ex   a:     0x7f679dce9000 0x7f679dce9018            2-7 7 3   { x=y }[9;]
          b:     0x7f679dd20c00 0x7f679dd20c18            2-6 -1 1   ,3
          *p:    0x9
  each2   a:     0x7f679dce9000 0x7f679dce9018            2-7 7 3   { x=y }[9;]
          b:     0x7f679dd20c00 0x7f679dd20c18            2-6 -1 1   ,3
          *p:    0x9
  dv_ex   a:     
          b:     0x7f679dce8080 0x7f679dce8098            1-6 1 1   3
          *p:    0x9
  each2   a:     
          b:     0x7f679dce8080 0x7f679dce8098            1-6 1 1   3
          *p:    0x9
  dv_ex   a:     
          b:     0x7f679dce8080 0x7f679dce8098            1-6 1 1   3
          *p:    0x7f679dd20c60
                 0x7f679dce9180 0x7f679dce9198            1-7 7 0   
          r:

The problem first appears in a call to dv_ex. However, that problematic call may have occurred in each2.

tavmem commented 1 year ago

The problematic call to dv_ex does occur in each2:

Z K each2(K a, V *p, K b)
{ I bt=b->t, bn=b->n; K prnt0=0, grnt0=0, d=0;
  if(bt > 0)
  { if(a && a->n>0)
    { K z = newK(0,a->n); U(z)
      DO(a->n, d = dv_ex(kK(a)[i],p-1,b); M(d,z) kK(z)[i]=d)
      z=demote(z);
      if(z->t==1) z->t=-1;
      R z; }
    else { d=dv_ex(a,p-1,b); R d; } }                     // It occurs in this line

Remember, that the 2nd call to each2 gets 3 inputs

Apparently, for a simple function with 1 argument (the { x=9 } case), p-1 provides the address (*(K*)*p-1) of the function, and dv_ex gets { x=9 } as an argument. For a more complex function with multiple arguments (the projected { x=y }[9;] case), p-1 does not provide the correct address of the function, and the call to dv_ex receives a bogus argument. It is not immediately clear how best to fix this, but that's the next step.

tavmem commented 1 year ago

After commit of Sep 23, 2022 (Fix part of issue 624), the following now works (in Kona and in k2.8):

  uq: { x=y }
  v: ,3
  uq[9]'' v
,0

However, this still gives no response in Kona:

  txt: 6:`quote.csv
  .:' 0: `dg5
  csv[ncr txt]
tavmem commented 1 year ago

As a next step, let's try a slightly different approach. First, simplify the input by using a cut-down file:

$ cat q.csv
s,"?"

Then in k2.8

  txt: 6:`q.csv
  .:' 0: `dg4
(;;;;;)

  :v:ncr txt
"s,\"?\"\n"

  :z:{snq["\"";","]'snq["\"";"\n"]x} v
((,"s"
  "\"?\"")
 ,"")

  uq["\""]'' z
((,"s"
  ,"?")
 ,"")

  {x=y}["\""]'' z
((,0
  1 0 1)
 ,!0)

In kona

  txt: 6:`q.csv
  .:' 0: `dg4

  :v:ncr txt
"s,\"?\"\n"

  :z:{snq["\"";","]'snq["\"";"\n"]x} v
((,"s"
  "\"?\"")
 ,"")

  uq["\""]'' z

  {x=y}["\""]'' z

So we have our next simple case that works in k2.8 but not in kona.

BTW, in kona (and in k2.8) this works:

  {x="\""}'' z
((,0
  1 0 1)
 ,!0)