Alternate read problem in Linux

tavmem commented 1 year ago

Under Ubuntu and Fedora:

$ ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5);1: "file"
("a"
 4 5)

   "file" 1: ("a"; 4 5);1: "file"
()

   "file" 1: ("a"; 4 5);1: "file"
("a"
 4 5)

   "file" 1: ("a"; 4 5);1: "file"
()

Under FreeBSD:

$ ./k
kona      \ for help. \\ to exit.

   "file" 1: ("a"; 4 5);1: "file"
("a"
 4 5)

   "file" 1: ("a"; 4 5);1: "file"
("a"
 4 5)

   "file" 1: ("a"; 4 5);1: "file"
("a"
 4 5)

   "file" 1: ("a"; 4 5);1: "file"
("a"
 4 5)

tavmem commented 1 year ago

This is interesting (maybe not all that surprising). I wanted to see whether this has always been a problem. If I go back to the version of Kona that existed on Jan 15, 2013:

$ cd k130115
$ rlwrap -n ./k
K Console - Enter \ for help

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
Segmentation fault (core dumped)

bakul commented 1 year ago

This works correctly on MacOS.

On Ubuntu I noticed that if you rm file.K after each invocation of "file" 1: ("a"; 4 5); 1: "file", it works correctly. cmp -l of the correct and incorrect file.K shows many zeroes in the incorrect one. But if you restart k after just one invocation of the test, it works correctly, so this is a kona problem. I'd suggest single-stepping through _1d_write() and, in particular, checking whether arg y holds the correct K object.

tavmem commented 1 year ago

Wow ... this is a strange issue. After making this single line change (under Fedora):

$ git diff
diff --git a/src/0.c b/src/0.c
index 24a2753..f67db29 100644
--- a/src/0.c
+++ b/src/0.c
@@ -568,6 +568,7 @@ K _1d(K x,K y) {

 //TODO: for testing this, use 1:write and 2:read (or 1:read) to confim items are the same before write & after read
 Z K _1d_write(K x,K y,I dosync) {
+  O("_1d_write    dosync: %lld    x:",dosync);sd(x); O("                          y:");sd(y);
   //Note: all file objects must be at least 4*sizeof(I) bytes...fixes bugs in K3.2, too
   //K3.2 Bug - "a"1:`a;2:"a" or 1:"a" - wsfull, tries to read sym but didn't write enough bytes?
   I n=disk(y);
$

I get correct results (again) every time:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
_1d_write    dosync: 0    x:     0x7f994063b6c0 0x7f994063b6d8            3-6 -3 4   "file"
                          y:     0x7f994063b680 0x7f994063b698            2-6 0 2   
("a"
 4 5)
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
_1d_write    dosync: 0    x:     0x7f994063b640 0x7f994063b658            3-6 -3 4   "file"
                          y:     0x7f994063b600 0x7f994063b618            2-6 0 2   
("a"
 4 5)
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
_1d_write    dosync: 0    x:     0x7f994063bb40 0x7f994063bb58            3-6 -3 4   "file"
                          y:     0x7f994063b600 0x7f994063b618            2-6 0 2   
("a"
 4 5)
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
_1d_write    dosync: 0    x:     0x7f994063ba40 0x7f994063ba58            3-6 -3 4   "file"
                          y:     0x7f994063b600 0x7f994063b618            2-6 0 2   
("a"
 4 5)
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
_1d_write    dosync: 0    x:     0x7f994063bf40 0x7f994063bf58            3-6 -3 4   "file"
                          y:     0x7f994063b600 0x7f994063b618            2-6 0 2   
("a"
 4 5)
("a"
 4 5)

I'm going to restart my machine (again) and see if the errors recur.

tavmem commented 1 year ago

Restarted my machine. Still got consistently correct results if I leave the new line intact. Reverts to the alternate read problem if I comment out the new line.

tavmem commented 1 year ago

After calling mmap() and closing the file, _1d_write() calls wrep(). Making this single-line addition

$ git diff
diff --git a/src/0.c b/src/0.c
index 24a2753..60817b2 100644
--- a/src/0.c
+++ b/src/0.c
@@ -599,6 +599,7 @@ Z K _1d_write(K x,K y,I dosync) {
 }

 I wrep(K x,V v,I y) {   //write representation. see rep(). y in {0,1}->{net, disk}
+  O("wrep    y: %lld    v: %p    x:",y,v);sd(x);
   I t=xt, n=xn;
   I* w=(I*)v;
$

also yields consistently correct results

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fce07a55000    x:     0x7fce07a8e680 0x7fce07a8e698            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7fce07a55020    x:     0x7fce07a8e880 0x7fce07a8e898            3-6 3 1   "a"
wrep    y: 1    v: 0x7fce07a55040    x:     0x7fce07a8e940 0x7fce07a8e958            3-6 -1 2   4 5
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fce07a55000    x:     0x7fce07a8e600 0x7fce07a8e618            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7fce07a55020    x:     0x7fce07a8e7c0 0x7fce07a8e7d8            3-6 3 1   "a"
wrep    y: 1    v: 0x7fce07a55040    x:     0x7fce07a8e900 0x7fce07a8e918            3-6 -1 2   4 5
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fce07a55000    x:     0x7fce07a8e600 0x7fce07a8e618            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7fce07a55020    x:     0x7fce07a8ec80 0x7fce07a8ec98            3-6 3 1   "a"
wrep    y: 1    v: 0x7fce07a55040    x:     0x7fce07a8ed40 0x7fce07a8ed58            3-6 -1 2   4 5
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fce07a51000    x:     0x7fce07a8e600 0x7fce07a8e618            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7fce07a51020    x:     0x7fce07a550c0 0x7fce07a550d8            3-6 3 1   "a"
wrep    y: 1    v: 0x7fce07a51040    x:     0x7fce07a55180 0x7fce07a55198            3-6 -1 2   4 5
("a"
 4 5)

tavmem commented 1 year ago

I tried it a couple more times. Now I'm getting errors:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fe04b689000    x:     0x7fe04b6c2680 0x7fe04b6c2698            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7fe04b689020    x:     0x7fe04b6c2880 0x7fe04b6c2898            3-6 3 1   "a"
wrep    y: 1    v: 0x7fe04b689040    x:     0x7fe04b6c2940 0x7fe04b6c2958            3-6 -1 2   4 5
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fe04b689000    x:     0x7fe04b688048 0x7fe04b688060            0-0 0 0   
()
()
  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fe04b689000    x:     0x7fe04b6c2b00 0x7fe04b6c2b18            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7fe04b689020    x:     0x7fe04b6c2780 0x7fe04b6c2798            3-6 3 1   "a"
wrep    y: 1    v: 0x7fe04b689040    x:     0x7fe04b6c2980 0x7fe04b6c2998            3-6 -1 2   4 5
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
wrep    y: 1    v: 0x7fe04b689000    x:     0x7fe04b687048 0x7fe04b687060            0-0 0 0   
()
()

tavmem commented 1 year ago

More interesting behavior. If you separate the inputs with a blank input line, the problem disappears:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
()
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
()

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

BTW: That's probably why the problem seemed to disappear for me earlier (in prior comments). The problem also disappears if you do some other command in between (instead of a blank line).

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  123
123
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  123
123
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  123
123
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

tavmem commented 1 year ago

So, using NO blank lines, and just "trackers" on the inputs to _1d_write() and wrep(), we get

$ git diff
diff --git a/src/0.c b/src/0.c
index 24a2753..e204090 100644
--- a/src/0.c
+++ b/src/0.c
@@ -568,6 +568,7 @@ K _1d(K x,K y) {

 //TODO: for testing this, use 1:write and 2:read (or 1:read) to confim items are the same before write & after read
 Z K _1d_write(K x,K y,I dosync) {
+  O("_1d_write    dosync: %lld   x:",dosync);sd(x); O("                         y:");sd(y);
   //Note: all file objects must be at least 4*sizeof(I) bytes...fixes bugs in K3.2, too
   //K3.2 Bug - "a"1:`a;2:"a" or 1:"a" - wsfull, tries to read sym but didn't write enough bytes?
   I n=disk(y);
@@ -599,6 +600,7 @@ Z K _1d_write(K x,K y,I dosync) {
 }

 I wrep(K x,V v,I y) {   //write representation. see rep(). y in {0,1}->{net, disk}
+  O("wrep    y: %lld    v: %p    x:",y,v);sd(x);
   I t=xt, n=xn;
   I* w=(I*)v;
$

these results:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
_1d_write    dosync: 0   x:     0x7f7941aa56c0 0x7f7941aa56d8            3-6 -3 4   "file"
                         y:     0x7f7941aa5680 0x7f7941aa5698            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7f7941a6c000    x:     0x7f7941aa5680 0x7f7941aa5698            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7f7941a6c020    x:     0x7f7941aa5880 0x7f7941aa5898            3-6 3 1   "a"
wrep    y: 1    v: 0x7f7941a6c040    x:     0x7f7941aa5940 0x7f7941aa5958            3-6 -1 2   4 5
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
_1d_write    dosync: 0   x:     0x7f7941aa5640 0x7f7941aa5658            3-6 -3 4   "file"
                         y:     0x7f7941a6b048 0x7f7941a6b060            2-6 0 2   
("a"
 4 5)
wrep    y: 1    v: 0x7f7941a6c000    x:     0x7f7941a6b048 0x7f7941a6b060            0-0 0 0   
()
()

On the 2nd execution:

the inputs to _1d_write() are OK.
the inputs to wrep() are bad.

bakul commented 1 year ago

With gdb I see that *y becomes inaccessible right after the open() at src/0.c:583 in the error case! See the last line of the gdb trace included below.

$ gdb ./k
...
(gdb) b src/0.c:583
Breakpoint 1 at 0x7606: file src/0.c, line 583.
(gdb) run
Starting program: /home/bakul/lang/kona/k
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
kona      \ for help. \\ to exit.

   "file" 1: ("a"; 4 5); 1: "file"

Breakpoint 1, _1d_write (x=<optimized out>, y=0x7ffff7ffa680, dosync=0) at src/0.c:583
583       I f=open(e,O_RDWR|O_CREAT|O_TRUNC,0777);
(gdb) p *y
$1 = {_c = 518, t = 0, n = 2, k = {0x7ffff7ffa880}}
(gdb) n
584       free(e);
(gdb) p *y
$2 = {_c = 518, t = 0, n = 2, k = {0x7ffff7ffa880}}
(gdb) c
Continuing.
("a"
 4 5)
   "file" 1: ("a"; 4 5); 1: "file"

Breakpoint 1, _1d_write (x=<optimized out>, y=0x7ffff7fb7048, dosync=0) at src/0.c:583
583       I f=open(e,O_RDWR|O_CREAT|O_TRUNC,0777);
(gdb) p *y
$3 = {_c = 518, t = 0, n = 2, k = {0x7ffff7ffa780}}
(gdb) n
584       free(e);
(gdb) p *y
Cannot access memory at address 0x7ffff7fb7048

tavmem commented 1 year ago

Thanks !!! I had come to the same conclusion while on a plane from Aruba today

With these changes

$ git diff
diff --git a/src/0.c b/src/0.c
index 24a2753..f7cea3c 100644
--- a/src/0.c
+++ b/src/0.c
@@ -580,7 +580,9 @@ Z K _1d_write(K x,K y,I dosync) {
   U(e)

   //Largely copy-pasted from 6:dyadic
+  O("AAA    dosync: %lld   x:",dosync);sd(x); O("                          y:");sd(y);
   I f=open(e,O_RDWR|O_CREAT|O_TRUNC,07777);
+  O("BBB    dosync: %lld   x:",dosync);sd(x); O("                          y:");sd(y);
   free(e);
   P(f<0,SE)

the result is

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
AAA    dosync: 0   x:     0x7f3c4652c6c0 0x7f3c4652c6d8            3-6 -3 4   "file"
                   y:     0x7f3c4652c680 0x7f3c4652c698            2-6 0 2   
("a"
 4 5)
BBB    dosync: 0   x:     0x7f3c4652c6c0 0x7f3c4652c6d8            3-6 -3 4   "file"
                   y:     0x7f3c4652c680 0x7f3c4652c698            2-6 0 2   
("a"
 4 5)
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
AAA    dosync: 0   x:     0x7f3c4652c640 0x7f3c4652c658            3-6 -3 4   "file"
                   y:     0x7f3c464f2048 0x7f3c464f2060            2-6 0 2   
("a"
 4 5)
BBB    dosync: 0   x:     0x7f3c4652c640 0x7f3c4652c658            3-6 -3 4   "file"
Bus error (core dumped)
$

I haven't yet checked out what Bus error (core dumped) means.

tavmem commented 1 year ago

Here are more interesting behaviors. Although we have demonstrated that the problem manifests in _1d_write() at src/0.c:583

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5)
  \\
$ xxd file.K
00000000: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
$

if you put a semicolon at the end of the second input line, the problem does not occur

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5);
  \\
$ xxd file.K
00000000: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0200 0000 0000 0000  ................
00000020: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000030: 0300 0000 0000 0000 6100 0000 0000 0000  ........a.......
00000040: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000050: ffff ffff ffff ffff 0200 0000 0000 0000  ................
00000060: 0400 0000 0000 0000 0500 0000 0000 0000  ................
$
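The two dumps are easier to compare when read as little-endian 64-bit words. The sketch below is only an illustration: le64_word is a hypothetical helper, not part of kona, and reading word 2 as the type and word 3 as the count is an inference from the sd() output earlier in this thread (type 0, count 2 for the list), not something the thread states outright.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the index-th 8-byte little-endian word of buf
 * as a signed 64-bit integer. The caller must supply at least
 * (index + 1) * 8 readable bytes. */
int64_t le64_word(const unsigned char *buf, size_t index) {
    assert(buf != NULL);
    uint64_t v = 0;
    /* Assemble from the most significant byte (offset 7) down to offset 0. */
    for (int b = 7; b >= 0; b--) {
        v = (v << 8) | buf[index * 8 + (size_t)b];
    }
    return (int64_t)v;
}
```

Under that reading, the good file has word 3 equal to 2 (a two-item list), while the bad file has word 3 equal to 0, which would be consistent with 1: "file" displaying ().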

and ... also, you get no problem in this case (with a semicolon at the end of the first input line)

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

but then, you do get the problem on the third input line

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file"
()

Seems a bit strange. The problem seems to be dependent upon whether you display a result.

tavmem commented 1 year ago

This is from Stack Overflow:

Bus errors are rare nowadays on x86 and occur when your processor cannot even attempt 
the memory access requested, typically:

    using a processor instruction with an address that does not satisfy its alignment requirements.

Segmentation faults occur when accessing memory which does not belong to your process. 
They are very common and are typically the result of:

    using a pointer to something that was deallocated.
    using an uninitialized hence bogus pointer.
    using a null pointer.
    overflowing a buffer.

PS: To be more precise, it is not manipulating the pointer itself that will cause issues. 
It's accessing the memory it points to (dereferencing).

It might be a memory alignment issue. But why would it be dependent on a semicolon?

bakul commented 1 year ago

*y can be accessed before the open but not after so it is not an alignment issue. It is very strange.... Is there another thread running that may be interfering?
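One mechanism that would produce exactly this symptom, offered as a hypothesis and not confirmed anywhere in this thread: if *y happened to live inside a region that is still mmap()ed from file.K, then open() with O_TRUNC shrinks the file to zero bytes and the mapped page is no longer backed by file data. On Linux, touching such a page is expected to raise SIGBUS; other systems may behave differently. The sketch below tests that idea in isolation. signal_after_truncate and the file path passed to it are hypothetical and have nothing to do with kona's code.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a small file privately, truncate the file with O_TRUNC, then touch the
 * mapping in a child process. Returns the signal that killed the child
 * (0 if it exited normally), or -1 if the setup itself failed. */
int signal_after_truncate(const char *path) {
    assert(path != NULL);
    const char payload[] = "0123456789abcdef";

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    if (write(fd, payload, sizeof payload) != (ssize_t)sizeof payload) {
        close(fd); unlink(path); return -1;
    }
    volatile char *p = mmap(NULL, sizeof payload, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED) { unlink(path); return -1; }

    /* Re-open with O_TRUNC: the file now has length 0 under the mapping. */
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { munmap((void *)p, sizeof payload); unlink(path); return -1; }
    close(fd);

    pid_t pid = fork();
    if (pid < 0) { munmap((void *)p, sizeof payload); unlink(path); return -1; }
    if (pid == 0) {
        char c = p[0];          /* the access under test */
        (void)c;
        _exit(0);
    }
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap((void *)p, sizeof payload);
    unlink(path);
    if (waited < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Comparing the return value against SIGBUS on Ubuntu or Fedora, and against 0 on FreeBSD or macOS, would show whether this mechanism matches the platform differences reported at the top of the thread.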

tavmem commented 1 year ago

There is no other thread that I am aware of.

*y is "mmaped" to a file in 1st input line. It appears as if the process of displaying the value (in the 1st input line) creates (or leaves) some vestige of that "mmap". When the 2nd input line attempts to open the file anew, *y is no longer accessible. However, the "display" and the "open" are in 2 separate input lines.

Note that if we use 2 different files, there is no problem.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)
  "file1" 1: ("a"; 4 5); 1: "file1"
("a"
 4 5)

Also the "display" of some other value (even a blank input line) eliminates the problem

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

And, when we suppress the display, there is no problem

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file"
("a"
 4 5)

Two possible solutions:

eliminate mmap altogether in this situation,
discover and fix the vestige left by the display process.

tavmem commented 1 year ago

A new wrinkle: It appeared that there was no problem if there was no display ..... NOT TRUE

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file";
Segmentation fault (core dumped)
$

Furthermore, the segfault is NOT happening in the same spot, i.e., not at the open. The value is still accessible immediately after the open:

kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
AAA(before open)   dosync: 0   x:     0x7f64411996c0 0x7f64411996d8            3-6 -3 4   "file"
                               y:     0x7f6441199680 0x7f6441199698            2-6 0 2   
("a"
 4 5)
BBB(after open)    dosync: 0   x:     0x7f64411996c0 0x7f64411996d8            3-6 -3 4   "file"
                               y:     0x7f6441199680 0x7f6441199698            2-6 0 2   
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file";
AAA(before open)   dosync: 0   x:     0x7f6441199740 0x7f6441199758            3-6 -3 4   "file"
                               y:     0x7f6441199700 0x7f6441199718            2-6 0 2   
("a"
 4 5)
BBB(after open)    dosync: 0   x:     0x7f6441199740 0x7f6441199758            3-6 -3 4   "file"
                               y:     0x7f6441199700 0x7f6441199718            2-6 0 2   
("a"
 4 5)
  "file" 1: ("a"; 4 5); 1: "file";
AAA(before open)   dosync: 0   x:     0x7f6441199800 0x7f6441199818            3-6 -3 4   "file"
                               y:     0x7f64411996c0 0x7f64411996d8            2-6 0 2   
("a"
 4 5)
BBB(after open)    dosync: 0   x:     0x7f6441199800 0x7f6441199818            3-6 -3 4   "file"
                               y:     0x7f64411996c0 0x7f64411996d8            2-6 0 2   
("a"
 4 5)
Segmentation fault (core dumped)
$

The above was generated with these 2 code changes

$ git diff
diff --git a/src/0.c b/src/0.c
index 24a2753..5d63c5b 100644
--- a/src/0.c
+++ b/src/0.c
@@ -580,7 +580,9 @@ Z K _1d_write(K x,K y,I dosync) {
   U(e)

   //Largely copy-pasted from 6:dyadic
+  O("AAA(before open)   dosync: %lld   x:",dosync);sd(x); O("                               y:");sd(y);
   I f=open(e,O_RDWR|O_CREAT|O_TRUNC,07777);
+  O("BBB(after open)    dosync: %lld   x:",dosync);sd(x); O("                               y:");sd(y);
   free(e);
   P(f<0,SE)
$

tavmem commented 1 year ago

The new segfault occurs in unpool() which is in src/km.c

$ git diff
diff --git a/src/km.c b/src/km.c
index 8496150..2299970 100644
--- a/src/km.c
+++ b/src/km.c
@@ -183,7 +183,9 @@ Z V unpool(I r)
     }//Low lanes subdivide pages. no divide op
     *L=z;
   }
-  z=*L;*L=*z;*z=0;
+  z=*L;
+  if(z<10)O("unpool() in src/km.c   z: %p\n",z);
+  *L=*z;*z=0;
   mUsed+=k; if(mUsed>mMax)mMax=mUsed;
   R z;
 }
$

z=*L should be an address (a pointer). We get the segfault on trying to access *z when z becomes 0x1:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file";
unpool() in src/km.c   z: 0x1
Segmentation fault (core dumped)

tavmem commented 1 year ago

By making this change

$ git diff
diff --git a/src/km.c b/src/km.c
index 8496150..37dcb83 100644
--- a/src/km.c
+++ b/src/km.c
@@ -173,7 +173,7 @@ Z V unpool(I r)
   V*z;
   V*L=((V*)KP)+r;
   I k= ((I)1)<<r;
-  if(!*L || (V)0x106==*L)
+  if((V)0x200>*L)
   {
     U(z=amem(k,r))
     if(k<PG)
$

We can fix that particular segfault, and get a bit further

$ rlwrap -n  ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file";
  "file" 1: ("a"; 4 5); 1: "file";
Segmentation fault (core dumped)
$

but then, we get a segfault somewhere else.

In any case, I will commit this change, which generalizes the test for adding memory.
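The line z=*L;*L=*z;*z=0; in unpool() pops the head of a free list, so it is only safe when *L holds a real address. Here is a minimal sketch of that pop with the generalized guard, under stated assumptions: freelist_pop is a hypothetical name, the 0x200 threshold is copied from the patch above, and returning NULL stands in for kona's call to amem() to refill the lane.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_PLAUSIBLE_ADDR 0x200u  /* threshold taken from the patch above */

/* Pop the first node of an intrusive singly linked free list.
 * Each free node stores the address of the next free node in its first word.
 * Returns NULL when the head is empty or holds a small non-pointer value
 * such as 0x1 (simplification: kona would allocate more memory instead). */
void *freelist_pop(void **head) {
    assert(head != NULL);
    if ((uintptr_t)*head < MIN_PLAUSIBLE_ADDR) return NULL;
    void **z = (void **)*head;
    *head = *z;   /* advance the list head to the next free node */
    *z = NULL;    /* clear the link in the node being handed out */
    return (void *)z;
}
```

The guard only keeps the pop from dereferencing an obviously bad head. It does not explain how 0x1 got into the list in the first place, which is why a segfault can still show up elsewhere.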

tavmem commented 1 year ago

BTW: by making this change

$ git diff
diff --git a/src/0.c b/src/0.c
index 24a2753..5be0011 100644
--- a/src/0.c
+++ b/src/0.c
@@ -509,6 +509,7 @@ K _1m(K x) {    //Keeps binary files mapped
   K z = _1m_r(f,v,v,v+s,&b);
   r=close(f); if(r)R FE;
   r=munmap(v,s); if(r)R UE;
+  O("mUsed: %f\n",mUsed);
   R z;
 }
$

we see what appears to be a memory leak (if the tracking is correct).

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
mUsed: 3472.000000
  "file" 1: ("a"; 4 5); 1: "file";
mUsed: 3552.000000
  "file" 1: ("a"; 4 5); 1: "file";
mUsed: 3757.000000
  "file" 1: ("a"; 4 5); 1: "file";
Segmentation fault (core dumped)
$

That should probably be treated as a separate issue.

bakul commented 1 year ago

*y is "mmaped" to a file in 1st input line

That would be strange. *y is checked during "file" 1: ("a"; 4 5) so *y is ("a"; 4 5), which is evaluated, not read from any file.

tavmem commented 1 year ago

Yes, I agree that ("a"; 4 5) is evaluated, not read FROM any file.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5)
  \\
$ xxd file.K
00000000: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0200 0000 0000 0000  ................
00000020: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000030: 0300 0000 0000 0000 6100 0000 0000 0000  ........a.......
00000040: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000050: ffff ffff ffff ffff 0200 0000 0000 0000  ................
00000060: 0400 0000 0000 0000 0500 0000 0000 0000  ................
$

The first input line causes kona to

evaluate ("a"; 4 5)
use mmap in _1d_write()to write it to the file file.K

That's what I meant when I said

*y is "mmaped" to a file in 1st input line

not from a file
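A minimal sketch of that write path, assuming the usual POSIX sequence (open with O_TRUNC, size the file, map it shared, copy, unmap). write_via_mmap is a hypothetical helper and does not reproduce _1d_write(); in particular, the real code serializes the K object with wrep() rather than copying a flat buffer.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes from data to path through a shared mapping.
 * Returns 0 on success, -1 on failure. */
int write_via_mmap(const char *path, const void *data, size_t len) {
    assert(path != NULL && data != NULL);
    if (len == 0) return -1;              /* mmap() rejects zero-length maps */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    /* The file must be as long as the mapping before it is written to. */
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); return -1; }

    void *v = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                            /* the mapping outlives the fd */
    if (v == MAP_FAILED) return -1;

    memcpy(v, data, len);                 /* stores go to the file's pages */
    return munmap(v, len);
}
```

Note that the open() here truncates the file before anything is written, so any other mapping of the same file that is still alive at that moment loses its backing data.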

tavmem commented 1 year ago

This is very weird ... I was about to do some analysis to find out where the 0x5 came from ... but the segfault has disappeared ... don't know why.


$ uname -mrs
Linux 5.19.16-200.fc36.x86_64 x86_64
$ 
$ pwd
/home/tom/kona
$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
$ rlwrap  -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  "file" 1: ("a"; 4 5);
  \\
$ rlwrap  -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  "file" 1: ("a"; 4 5)
  \\
$ 
$ ./k_test
t:0
t:50
t:100
t:150
t:200
t:250
t:300
t:350
t:400
t:450
t:500
t:550
t:600
t:650
t:700
t:750
t:800
t:850
t:900
t:950
t:1000
t:1050
t:1100
Test pass rate: 1.0000, Total: 1134, Passed: 1101, Skipped: 33, Failed: 0, Time: 0.356122s
OK
kona      \ for help. \\ to exit.

I restarted my machine ... same result.
I'll try again tomorrow.

Does the problem exist on Ubuntu?

bakul commented 1 year ago

Run "file" 1: ("a"; 4 5); 1: "file"; not just "file" 1: ("a"; 4 5);.

tavmem commented 1 year ago

Thanks !! I did figure that out, and tried to delete my comment saying the problem disappeared. I ended up deleting the wrong comment (the one showing why we get the segfault). I will try to restore it.

tavmem commented 1 year ago

Where the segfault occurs ... Make this change

$ git diff
diff --git a/src/kx.c b/src/kx.c
index ffa44da..a730e8a 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -599,6 +599,7 @@ K vf_ex(V q, K g)
 Z V ex_(V a, I r)   //Expand wd()->7-0 types, expand and evaluate brackets.   Could probably fold ex0 into ex_
 { K x,y=0,z,tmp;
   if(VA(a)) R a;
+  O("*(V*)a: %p\n",*(V*)a);
   if(!(x=*(K*)a) || 7!=xt || (0<xn && xn<4)) R ci(x); //assert xn>=4 -> conditionals or similar
   r=xn<4?r:xn;   //suggests maybe r should be stored on 7type itself
   if(kV(x)[CONJ])
$

We get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7f51f91d4000
*(V*)a: 0x7f51f91d4180
*(V*)a: 0x7f51f920c940
*(V*)a: 0x7f51f920c880
*(V*)a: 0x7f51f920c6c0
*(V*)a: 0x7f51f920ca40
  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7f51f91d4000
*(V*)a: 0x7f51f91d4100
*(V*)a: 0x7f51f920c9c0
*(V*)a: 0x7f51f920c900
*(V*)a: 0x7f51f920c740
*(V*)a: 0x7f51f920cac0
  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7f51f91d4000
*(V*)a: 0x7f51f91d4080
*(V*)a: 0x7f51f920c840
*(V*)a: 0x7f51f920c980
*(V*)a: 0x7f51f920c800
*(V*)a: 0x7f51f920cb00
  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7f51f91d4000
*(V*)a: 0x7f51f91d4180
*(V*)a: 0x7f51f920cac0
*(V*)a: 0x7f51f920c9c0
*(V*)a: 0x7f51f920c780
*(V*)a: 0x5
Segmentation fault (core dumped)
$

The segfault occurs in the next line

if(!(x=*(K*)a) || 7!=xt || (0<xn && xn<4)) R ci(x); //assert xn>=4 -> conditionals or similar

when kona attempts to treat 0x5 as a K-structure.

tavmem commented 1 year ago

Here is the K-structure related to each of the addresses listed, from which we can infer what the last K-structure should have been.

Make this change:

$ git diff
diff --git a/src/kx.c b/src/kx.c
index ffa44da..486c10a 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -599,6 +599,7 @@ K vf_ex(V q, K g)
 Z V ex_(V a, I r)   //Expand wd()->7-0 types, expand and evaluate brackets.   Could probably fold ex0 into ex_
 { K x,y=0,z,tmp;
   if(VA(a)) R a;
+  O("*(V*)a: %p\n",*(V*)a); sd_(*(K*)a,1);
   if(!(x=*(K*)a) || 7!=xt || (0<xn && xn<4)) R ci(x); //assert xn>=4 -> conditionals or similar
   r=xn<4?r:xn;   //suggests maybe r should be stored on 7type itself
   if(kV(x)[CONJ])
$

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7ff676b40000
     0x7ff676b40000 0x7ff676b40018            1-7 7 0   
*(V*)a: 0x7ff676b40180
     0x7ff676b40180 0x7ff676b40198            1-7 7 0   
*(V*)a: 0x7ff676b78940
     0x7ff676b78940 0x7ff676b78958            1-6 -1 2   4 5
*(V*)a: 0x7ff676b78880
     0x7ff676b78880 0x7ff676b78898            1-6 3 1   "a"
*(V*)a: 0x7ff676b786c0
     0x7ff676b786c0 0x7ff676b786d8            1-6 -3 4   "file"
*(V*)a: 0x7ff676b78a40
     0x7ff676b78a40 0x7ff676b78a58            1-6 -3 4   "file"
  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7ff676b40000
     0x7ff676b40000 0x7ff676b40018            1-7 7 0   
*(V*)a: 0x7ff676b40100
     0x7ff676b40100 0x7ff676b40118            1-7 7 0   
*(V*)a: 0x7ff676b789c0
     0x7ff676b789c0 0x7ff676b789d8            1-6 -1 2   4 5
*(V*)a: 0x7ff676b78900
     0x7ff676b78900 0x7ff676b78918            1-6 3 1   "a"
*(V*)a: 0x7ff676b78740
     0x7ff676b78740 0x7ff676b78758            1-6 -3 4   "file"
*(V*)a: 0x7ff676b78ac0
     0x7ff676b78ac0 0x7ff676b78ad8            1-6 -3 4   "file"
  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7ff676b40000
     0x7ff676b40000 0x7ff676b40018            1-7 7 0   
*(V*)a: 0x7ff676b40080
     0x7ff676b40080 0x7ff676b40098            1-7 7 0   
*(V*)a: 0x7ff676b78840
     0x7ff676b78840 0x7ff676b78858            1-6 -1 2   4 5
*(V*)a: 0x7ff676b78980
     0x7ff676b78980 0x7ff676b78998            1-6 3 1   "a"
*(V*)a: 0x7ff676b78800
     0x7ff676b78800 0x7ff676b78818            1-6 -3 4   "file"
*(V*)a: 0x7ff676b78b00
     0x7ff676b78b00 0x7ff676b78b18            1-6 -3 4   "file"
  "file" 1: ("a"; 4 5); 1: "file";
*(V*)a: 0x7ff676b40000
     0x7ff676b40000 0x7ff676b40018            1-7 7 0   
*(V*)a: 0x7ff676b40180
     0x7ff676b40180 0x7ff676b40198            1-7 7 0   
*(V*)a: 0x7ff676b78ac0
     0x7ff676b78ac0 0x7ff676b78ad8            1-6 -1 2   4 5
*(V*)a: 0x7ff676b789c0
     0x7ff676b789c0 0x7ff676b789d8            1-6 3 1   "a"
*(V*)a: 0x7ff676b78780
     0x7ff676b78780 0x7ff676b78798            1-6 -3 4   "file"
*(V*)a: 0x5
Segmentation fault (core dumped)

tavmem commented 1 year ago

To get a better understanding of what happened, add this line.

$ git diff
diff --git a/src/kx.c b/src/kx.c
index ffa44da..95eb2e8 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -914,6 +914,7 @@ Z K ex2(V*v, K k)  //execute words --- all returns must be Ks. v: word list, k:
     R e; }
   //vn. case
   i=0;
+  I ii; for(ii=0;v[ii];ii++) { O("     ex2 v[%lld]: %p",ii,v[ii]); if(v[ii]>(V)DT_SIZE)sd(*(K*)v[ii]); else O("\n"); }
   while(adverbClass(v[1+i])) i++; //ALT'Y: i=adverbClass(b)?i+1:0;
   t2=ex2(v+1+i,k); //oom. these cannot be placed into single function call b/c order of eval is unspecified
   t3=ex_(*v,1); if(t3<(K)DT_SIZE)ft3=1;
$

Note that the test is simply whether v[ii] exists.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
     ex2 v[0]: 0x3f
     ex2 v[1]: 0x7fecf7f74aa0     0x7fecf7f74a40 0x7fecf7f74a58            1-6 -3 4   "file"
     ex2 v[2]: 0x1
  "file" 1: ("a"; 4 5); 1: "file";
     ex2 v[0]: 0x3f
     ex2 v[1]: 0x7fecf7f74a60     0x7fecf7f74ac0 0x7fecf7f74ad8            1-6 -3 4   "file"
     ex2 v[2]: 0x1
  "file" 1: ("a"; 4 5); 1: "file";
     ex2 v[0]: 0x3f
     ex2 v[1]: 0x7fecf7f74ae0     0x7fecf7f74b00 0x7fecf7f74b18            1-6 -3 4   "file"
     ex2 v[2]: 0x1
  "file" 1: ("a"; 4 5); 1: "file";
     ex2 v[0]: 0x3f
Segmentation fault (core dumped)
$

tavmem commented 1 year ago

My last comment was erroneous:

v[1] does exist in the 4th execution
it simply contains the number 0x5, instead of a K-structure

If we make these code changes

$ git diff
diff --git a/src/kx.c b/src/kx.c
index ffa44da..4e1ca73 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -914,7 +914,11 @@ Z K ex2(V*v, K k)  //execute words --- all returns must be Ks. v: word list, k:
     R e; }
   //vn. case
   i=0;
+  I ii; for(ii=0;ii<3;ii++)O("     test1 &v[%lld]: %p   v[%lld]: %p\n",ii,&v[ii],ii,v[ii]);
+  O("                              *(V*)v[1]: %p\n",*(V*)v[1]); O("\n");
   while(adverbClass(v[1+i])) i++; //ALT'Y: i=adverbClass(b)?i+1:0;
+  for(ii=0;v[ii];ii++) { O("     test2 &v[%lld]: %p   v[%lld]: %p",ii,&v[ii],ii,v[ii]);
+                         if(v[ii]>(V)DT_SIZE)sd(*(K*)v[ii]); else O("\n"); }; O("\n");
   t2=ex2(v+1+i,k); //oom. these cannot be placed into single function call b/c order of eval is unspecified
   t3=ex_(*v,1); if(t3<(K)DT_SIZE)ft3=1;
   if(t3>(K)DT_SIZE && t3->t==7 && t3->n==3)
$

we create 2 tests that demonstrate this

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
     test1 &v[0]: 0x7f3416da70b8   v[0]: 0x3f
     test1 &v[1]: 0x7f3416da70c0   v[1]: 0x7f3416ddfaa0
     test1 &v[2]: 0x7f3416da70c8   v[2]: 0x1
                              *(V*)v[1]: 0x7f3416ddfa40

     test2 &v[0]: 0x7f3416da70b8   v[0]: 0x3f
     test2 &v[1]: 0x7f3416da70c0   v[1]: 0x7f3416ddfaa0     0x7f3416ddfa40 0x7f3416ddfa58   1-6 -3 4   "file"
     test2 &v[2]: 0x7f3416da70c8   v[2]: 0x1

  "file" 1: ("a"; 4 5); 1: "file";
     test1 &v[0]: 0x7f3416da71b8   v[0]: 0x3f
     test1 &v[1]: 0x7f3416da71c0   v[1]: 0x7f3416ddfa60
     test1 &v[2]: 0x7f3416da71c8   v[2]: 0x1
                              *(V*)v[1]: 0x7f3416ddfac0

     test2 &v[0]: 0x7f3416da71b8   v[0]: 0x3f
     test2 &v[1]: 0x7f3416da71c0   v[1]: 0x7f3416ddfa60     0x7f3416ddfac0 0x7f3416ddfad8   1-6 -3 4   "file"
     test2 &v[2]: 0x7f3416da71c8   v[2]: 0x1

  "file" 1: ("a"; 4 5); 1: "file";
     test1 &v[0]: 0x7f3416da7138   v[0]: 0x3f
     test1 &v[1]: 0x7f3416da7140   v[1]: 0x7f3416ddfae0
     test1 &v[2]: 0x7f3416da7148   v[2]: 0x1
                              *(V*)v[1]: 0x7f3416ddfb00

     test2 &v[0]: 0x7f3416da7138   v[0]: 0x3f
     test2 &v[1]: 0x7f3416da7140   v[1]: 0x7f3416ddfae0     0x7f3416ddfb00 0x7f3416ddfb18   1-6 -3 4   "file"
     test2 &v[2]: 0x7f3416da7148   v[2]: 0x1

  "file" 1: ("a"; 4 5); 1: "file";
     test1 &v[0]: 0x7f3416da70b8   v[0]: 0x3f
     test1 &v[1]: 0x7f3416da70c0   v[1]: 0x7f3416da2068
     test1 &v[2]: 0x7f3416da70c8   v[2]: 0x1
                              *(V*)v[1]: 0x5

     test2 &v[0]: 0x7f3416da70b8   v[0]: 0x3f
Segmentation fault (core dumped)
$

The segfault occurs when attempting to display a K-structure at an address that only contains the number 0x5. It is still not clear why the K-structure is not at the address contained in v[1].

bakul commented 1 year ago

If you change "file" 1: ("a"; 4 5);1: "file" to "file" 1: ("a"; 4 5);2: "file" there is no segfault.

So the second statement 1:"file" makes the next "file" 1: ("a"; 4 5) segfault. Could it be that some cleanup is not done in the second statement?

tavmem commented 1 year ago

I think that you are right. The incomplete cleanup could also be the cause of the memory leak described in #634.

bakul commented 1 year ago

I ran strace while feeding k the single line "file" 1: ("a"; 4 5);1: "file";. The part of the trace specific to that line is shown below. There are 3 mmaps but only 2 munmaps, which seems strange. Since the result of 1:"file" is not stored in a variable, there should be an munmap() corresponding to it.

openat(AT_FDCWD, "file.K", O_RDWR|O_CREAT|O_TRUNC, 0777) = 3
ftruncate(3, 112)                       = 0
mmap(NULL, 112, PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f8216783000
close(3)                                = 0
munmap(0x7f8216783000, 112)             = 0
openat(AT_FDCWD, "file.K", O_RDWR)      = 3
newfstatat(AT_FDCWD, "file.K", {st_mode=S_IFREG|0755, st_size=112, ...}, 0) = 0
mmap(NULL, 112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x7f8216783000
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=112, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x7f8216782000
close(3)                                = 0
munmap(0x7f8216783000, 112)             = 0

But even if we store the result, the segfault remains (after four lines of "file" 1: ("a"; 4 5);x:1: "file";).

bakul commented 1 year ago

Even if there is a missing munmap, there should be no segfault. If you forget to munmap over and over, eventually you will run out of virtual memory, but that alone should not cause a segfault. The segfault seems Linux-specific. Maybe there is some interaction between malloc/free and mmap.

Even if there is no segfault, the bug seems to be that when the refcount of an mmapped object goes to zero, the mapped space is never "freed" with munmap(). If you construct an object in the program, you manage its space with malloc/free, but you can't malloc/free mmapped objects. Maybe there needs to be a flag in the object header indicating how it should be freed. [All this without reading the code, so I could be wrong!]
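The idea of a flag in the object header can be sketched as follows. This is only an illustration under stated assumptions: the `obj` struct, the `origin` field, and the functions `obj_malloc`, `obj_mmap`, and `obj_release` are hypothetical names and do not reflect kona's actual K layout or its cd() implementation. An anonymous mapping stands in for the file-backed one so the sketch needs no file.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical header, NOT kona's real K struct. */
enum origin { FROM_MALLOC = 0, FROM_MMAP = 1 };

typedef struct obj {
  int64_t refs;    /* reference count */
  int64_t origin;  /* how the memory was obtained */
  size_t  bytes;   /* total size, needed later by munmap() */
  int64_t data[];  /* payload */
} obj;

int unmapped = 0, freed = 0;  /* counters for the demonstration only */

/* Allocate an object with n payload words on the heap. */
obj *obj_malloc(size_t n) {
  size_t bytes = sizeof(obj) + n * sizeof(int64_t);
  obj *o = malloc(bytes);
  if (!o) return NULL;
  o->refs = 1; o->origin = FROM_MALLOC; o->bytes = bytes;
  return o;
}

/* Allocate an object with n payload words in its own private mapping. */
obj *obj_mmap(size_t n) {
  size_t bytes = sizeof(obj) + n * sizeof(int64_t);
  void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return NULL;
  obj *o = p;
  o->refs = 1; o->origin = FROM_MMAP; o->bytes = bytes;
  return o;
}

/* Drop one reference; release the memory the way it was obtained. */
void obj_release(obj *o) {
  if (!o) return;
  assert(o->refs > 0);
  if (--o->refs > 0) return;
  if (o->origin == FROM_MMAP) { munmap(o, o->bytes); unmapped++; }
  else                        { free(o);             freed++;   }
}
```

With this arrangement the caller never needs to know where an object came from: releasing the last reference to a mapped object ends in munmap(), and releasing a heap object ends in free().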

tavmem commented 1 year ago

The one missing munmap can get a lot worse.

_1m() calls mmap(), then calls _1m_r(), then calls munmap(). _1m_r() calls mmap() but does not call munmap(). However, _1m_r() has the line DO(n,x=_1m_r(f,fixed,v+r,aft,&r); so _1m_r() can call itself multiple times. Hence mmap() may be called multiple times with no corresponding call to munmap().
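One way to check this imbalance from inside the process, rather than with strace, is to route the calls through counting wrappers. The sketch below is a debugging aid only: `counted_mmap`, `counted_munmap`, and `outstanding_maps` are hypothetical names, not existing kona functions, and the sketch makes no claim about how many mappings the read path actually creates.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

long maps_made = 0, maps_released = 0;

/* Same signature as mmap(); counts only successful mappings. */
void *counted_mmap(void *addr, size_t len, int prot, int flags,
                   int fd, off_t off) {
  void *p = mmap(addr, len, prot, flags, fd, off);
  if (p != MAP_FAILED) maps_made++;
  return p;
}

/* Same signature as munmap(); counts only successful releases. */
int counted_munmap(void *addr, size_t len) {
  int rc = munmap(addr, len);
  if (rc == 0) maps_released++;
  return rc;
}

/* Number of mappings created through the wrapper and not yet released. */
long outstanding_maps(void) {
  assert(maps_made >= maps_released);
  return maps_made - maps_released;
}
```

If the read path were switched to these wrappers, printing outstanding_maps() after each read would show whether the count returns to its previous value or grows by one (or more) per call.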

tavmem commented 1 year ago

With the 2 tests, we demonstrated that the problem shows up (during the 4th execution) in the function ex2(). But, does it start earlier?

ex2() is called by ex1(). Putting the tests right at the beginning of ex1()

$ git diff
diff --git a/src/kx.c b/src/kx.c
index ffa44da..5855e87 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -772,6 +772,11 @@ Z K bv_ex(V*p,K k)

 K ex1(V*w,K k,I*i,I n,I f)//convert verb pieces (eg 1+/) to seven-types,
 { //default to ex2 (full pieces in between semicolons/newlines)
+  if(w[0]==(V)0x3f){
+    I ii; for(ii=0;ii<3;ii++)O("     test1 w[%lld]: %p\n",ii,w[ii]);
+    O("      *(V*)w[1]: %p\n",*(V*)w[1]); O("\n");
+    for(ii=0;w[ii];ii++) { O("     test2 w[%lld]: %p",ii,w[ii]);
+                           if( (w[ii]>(V)DT_SIZE) && *(V*)w[ii]>(V)DT_SIZE )sd(*(K*)w[ii]); else O("\n"); }; O("\n"); }
   if(offsetColon==w[0] && (UI)w[1]>DT_SIZE && (UI)w[2]>DT_SIZE && fwh==0)
   { fer=1;
     if(f)*i=n;
$

gives this result

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
     test1 w[0]: 0x3f
     test1 w[1]: 0x7fe4be73eaa0
     test1 w[2]: 0x1
      *(V*)w[1]: 0x7fe4be73ea40

     test2 w[0]: 0x3f
     test2 w[1]: 0x7fe4be73eaa0     0x7fe4be73ea40 0x7fe4be73ea58            1-6 -3 4   "file"
     test2 w[2]: 0x1

  "file" 1: ("a"; 4 5); 1: "file";
     test1 w[0]: 0x3f
     test1 w[1]: 0x7fe4be73ea60
     test1 w[2]: 0x1
      *(V*)w[1]: 0x7fe4be73eac0

     test2 w[0]: 0x3f
     test2 w[1]: 0x7fe4be73ea60     0x7fe4be73eac0 0x7fe4be73ead8            1-6 -3 4   "file"
     test2 w[2]: 0x1

  "file" 1: ("a"; 4 5); 1: "file";
     test1 w[0]: 0x3f
     test1 w[1]: 0x7fe4be73eae0
     test1 w[2]: 0x1
      *(V*)w[1]: 0x7fe4be73eb00

     test2 w[0]: 0x3f
     test2 w[1]: 0x7fe4be73eae0     0x7fe4be73eb00 0x7fe4be73eb18            1-6 -3 4   "file"
     test2 w[2]: 0x1

  "file" 1: ("a"; 4 5); 1: "file";
     test1 w[0]: 0x3f
     test1 w[1]: 0x7fe4be703068
     test1 w[2]: 0x1
      *(V*)w[1]: 0x5

     test2 w[0]: 0x3f
     test2 w[1]: 0x7fe4be703068
     test2 w[2]: 0x1

Segmentation fault (core dumped)
$

We see that, here also, in the 4th execution, the address in w[1] holds the number 0x5 instead of a pointer to a K-structure. So ... the problem begins earlier than the call to ex1().

tavmem commented 1 year ago

About 3 days ago Bakul postulated: "If you change "file" 1: ("a"; 4 5);1: "file" to "file" 1: ("a"; 4 5);2: "file" there is no segfault. So the second statement 1:"file" makes the next "file" 1: ("a"; 4 5) segfault. Could it be that some cleanup is not done in the second statement?"

We know the process blows up in the 4th iteration of the 2 statements. So ... if the supposition is true, the problem should begin before the 4th iteration. Indeed, we find that the process does diverge in the 3rd iteration.

If we add one line:

$ git diff
diff --git a/src/kx.c b/src/kx.c
index ffa44da..9e025ea 100644
--- a/src/kx.c
+++ b/src/kx.c
@@ -381,6 +381,7 @@ K dv_ex(K a, V *p, K b, V h)
     if(h && (*p>(V)DT_SIZE) && 0==(*(K*)*p)->n ) tmp=vf_ex(h,g); else tmp=vf_ex(*p,g);
     stk--;
     if(grnt && !prnt)prnt=ci(grnt); }
+  O("call memset   g:");sd(g);
   memset(kK(g),0,g->n*sizeof(K)); cd(g); //Special privileges here...don't ci() members beforehand
   R tmp; }

we get

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5); 1: "file";
call memset   g:     0x7fdbbc1fbb00 0x7fdbbc1fbb18            1-6 0 2   
("file"
 ("a"
  4 5))
call memset   g:     0x7fdbbc1fb680 0x7fdbbc1fb698            1-6 0 1   
,"file"

  "file" 1: ("a"; 4 5); 1: "file";
call memset   g:     0x7fdbbc1fbb40 0x7fdbbc1fbb58            1-6 0 2   
("file"
 ("a"
  4 5))
call memset   g:     0x7fdbbc1fb6c0 0x7fdbbc1fb6d8            1-6 0 1   
,"file"

  "file" 1: ("a"; 4 5); 1: "file";
call memset   g:     0x7fdbbc1c3048 0x7fdbbc1c3060            0-0 -1 2   4 5
call memset   g:     0x7fdbbc1c2048 0x7fdbbc1c2060            1-6 0 1   
,"file"

  "file" 1: ("a"; 4 5); 1: "file";
call memset   g:     0x7fdbbc1c1080 0x7fdbbc1c1098            1-6 0 2   
("file"
 ("a"
  4 5))
Segmentation fault (core dumped)
$

tavmem commented 1 year ago

We don't need to execute the double-command line 4 times to spot a problem. A problem shows up after executing 3 single-command lines:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5)
   1: "file"
("a"
 4 5)
  "file" 1: ("a"; 4 5)
  \\
$ xxd file.K
00000000: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
$

If we execute a blank line command after the read command, the problem disappears:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  "file" 1: ("a"; 4 5)
   1: "file"
("a"
 4 5)

  "file" 1: ("a"; 4 5)
  \\
$ xxd file.K
00000000: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0200 0000 0000 0000  ................
00000020: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000030: 0300 0000 0000 0000 6100 0000 0000 0000  ........a.......
00000040: fdff ffff ffff ffff 0100 0000 0000 0000  ................
00000050: ffff ffff ffff ffff 0200 0000 0000 0000  ................
00000060: 0400 0000 0000 0000 0500 0000 0000 0000  ................
$

This is further evidence that the read command is missing some cleanup that the blank line command performs.
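The two dumps are easier to compare when read as 8-byte little-endian words. Read that way, the good file is -3 1 0 2 -3 1 3 97 -3 1 -1 2 4 5 (97 is 0x61, the character "a"), while the bad file has only the first two words, -3 and 1, followed by zeroes. A minimal sketch of a checker follows; `le_word` and `nonzero_words` are hypothetical helper names, and nothing here assumes more about the file.K format than what the hex dumps show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the 8-byte little-endian word at word index i of buf. */
int64_t le_word(const unsigned char *buf, size_t i) {
  uint64_t w = 0;
  for (int b = 7; b >= 0; b--) w = (w << 8) | buf[i * 8 + (size_t)b];
  return (int64_t)w;
}

/* Count the nonzero words in buf, ignoring the first `skip` words. */
size_t nonzero_words(const unsigned char *buf, size_t len, size_t skip) {
  assert(len % 8 == 0);
  size_t n = 0;
  for (size_t i = skip; i < len / 8; i++)
    if (le_word(buf, i) != 0) n++;
  return n;
}
```

Calling nonzero_words() on the contents of file.K with skip set to 2 returns 0 for the bad dump above, which gives a quick automated test for the zero-filled file after each write.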

kevinlawler / kona

Alternate read problem in Linux #633