kevinlawler / kona

Open-source implementation of the K programming language
ISC License

Crash when trying to serialize data. #615

Closed: gitonthescene closed this issue 2 years ago

gitonthescene commented 2 years ago

If I rename scrabble-puz.txt to scrabble-puz.k, I can load it as a script. Serializing the data with 1: seems to work just fine, but kona crashes when I attempt to deserialize it.

kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  #puz
8473
  `"scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"
[1]    92797 segmentation fault  ./k

But the behavior seems inconsistent. During another run it seemed to make it most of the way through but the data appeared corrupted (floats where there should be ints) and it eventually crashed. Here's a script of the session: crash.script.txt

Here's just one of the lines seemingly corrupted:

 ("abcelpy"
  8.497929e-322 4.150151e-322 3.705492e-322 6.47226e-322 7.756831e-322 5.03947e-322 3.903119e-322)

Where this is what that line looks like in scrabble-puz.k:

   ("abcelpy"
  172 84 75 131 157 102 79)
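
A quick way to check what kind of corruption this is: the sketch below (Python, not part of the original report) reinterprets the raw bytes of each expected 64-bit integer as an IEEE-754 double and prints it with seven significant digits. The helper name `int64_bits_as_double` is made up for illustration. If the output matches the corrupted line, the stored bytes are probably intact and only their type is being misread.

```python
import struct

def int64_bits_as_double(n: int) -> float:
    """Reinterpret the 8 bytes of a signed 64-bit integer as an IEEE-754 double."""
    return struct.unpack("<d", struct.pack("<q", n))[0]

# The integers that scrabble-puz.k holds for "abcelpy".
expected_ints = [172, 84, 75, 131, 157, 102, 79]

as_doubles = [int64_bits_as_double(n) for n in expected_ints]

# Seven significant digits, the precision shown in the corrupted session output.
shown = " ".join("%.7g" % d for d in as_doubles)
print(shown)
```

The printed values are the same subnormal floats as in the corrupted line (8.497929e-322 for 172, and so on), which suggests an integer vector being treated as a float vector on load. It says nothing about why that happens.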

gitonthescene commented 2 years ago

Since this may be operating-system dependent, it's worth mentioning that this is on macOS 12.0.1.

tavmem commented 2 years ago

It works for the first 25 elements, and fails for the first 26:

kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  `"pxs" 1: puz[!25]
  pxd: 1: `"pxs"
  #pxd
25

  `"pxs" 1: puz[!26]
  pxd: 1: `"pxs"
Segmentation fault (core dumped)

It works for the 25 items (from 21 through 45), but fails for the 26 items (from 21 through 46):

kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  `"pxs" 1: puz[20+!25]
  pxd: 1: `"pxs"
  #pxd
25

  `"pxs" 1: puz[20+!26]
  pxd: 1: `"pxs"
Segmentation fault (core dumped)

gitonthescene commented 2 years ago

Now that you mention it, the corrupted lines in that crash log seem pretty regular. It seems they're each 256 lines apart. I know it's less useful because it wasn't reproducible, but I thought it worth pointing out.

➜  kona git:(master) ✗ grep -n e- crash.script.txt | tail -1            
16799:  9.683687e-322 6.422853e-322 6.422853e-322 7.410985e-322 9.930719e-322 6.077007e-322 9.881313e-322)
➜  kona git:(master) ✗ grep -n e- crash.script.txt | cut -d: -f1 | xargs
159 415 671 927 1183 1439 1695 1951 2207 2463 2719 2975 3231 3487 3743 3999 4255 4511 4767 5023 5279 5535 5791 6047 6303 6559 6815 7071 7327 7583 7839 8095 8351 8607 8863 9119 9375 9631 9887 10143 10399 10655 10911 11167 11423 11679 11935 12191 12447 12703 12959 13215 13471 13727 13983 14239 14495 14751 15007 15263 15519 15775 16031 16287 16543 16799
➜  kona git:(master) ✗ ./k                                              
kona      \ for help. \\ to exit.

  -': 159 415 671 927 1183 1439 1695 1951 2207 2463 2719 2975 3231 3487 3743 3999 4255 4511 4767 5023 5279 5535 5791 6047 6303 6559 6815 7071 7327 7583 7839 8095 8351 8607 8863 9119 9375 9631 9887 10143 10399 10655 10911 11167 11423 11679 11935 12191 12447 12703 12959 13215 13471 13727 13983 14239 14495 14751 15007 15263 15519 15775 16031 16287 16543 16799
256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256 256
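
For readers who don't read K, the same check can be sketched in Python. It also converts the spacing from printed lines to list elements. The two-lines-per-element figure is an assumption based on how the "abcelpy" sample prints, not something measured across the whole log.

```python
# Line numbers of the corrupted rows, copied from the grep output above.
grep_output = """
159 415 671 927 1183 1439 1695 1951 2207 2463 2719 2975 3231 3487 3743 3999
4255 4511 4767 5023 5279 5535 5791 6047 6303 6559 6815 7071 7327 7583 7839
8095 8351 8607 8863 9119 9375 9631 9887 10143 10399 10655 10911 11167 11423
11679 11935 12191 12447 12703 12959 13215 13471 13727 13983 14239 14495 14751
15007 15263 15519 15775 16031 16287 16543 16799
"""
line_numbers = [int(tok) for tok in grep_output.split()]

# Difference between each line number and the one before it (what -': does in K).
deltas = [b - a for a, b in zip(line_numbers, line_numbers[1:])]

# Assumption: every element prints as two lines (the string, then the integers).
LINES_PER_ELEMENT = 2
elements_apart = deltas[0] // LINES_PER_ELEMENT

print(len(line_numbers), sorted(set(deltas)), elements_apart)
```

Under that assumption there are 66 corrupted rows, one every 128 elements.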

gitonthescene commented 2 years ago

Speaking to the randomness here, this is what I get when I try lists of lengths 25, 26, 50 and 4200. Getting different results on different platforms feels a bit like a memory issue to me.

kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  `"pxs" 1: puz[!25]
  pxd: 1: `"pxs"
  #pxd
25
  `"pxs" 1: puz[!26]
  pxd: 1: `"pxs"
  #pxd
26
  `"pxs" 1: puz[!50]
  pxd: 1: `"pxs"
  #pxd
50
  #puz
8473
  `"pxs" 1: puz[!4200]
  pxd: 1: `"pxs"

rlwrap: warning: k crashed, killed by SIGSEGV.

gitonthescene commented 2 years ago

Here's what I get when I try to narrow down where the crash happens. I was able to reproduce this several times in a row.

kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  `"pxs" 1: puz[!75]
  pxd: 1: `"pxs"
  `"pxs" 1: puz[!76]
  pxd: 1: `"pxs"
  -1#pxd
,("abcelpu"
  156 88 80 128 163 106 81)

  `"pxs" 1: puz[!77]
  pxd: 1: `"pxs"

rlwrap: warning: k crashed, killed by SIGSEGV.

Possibly the size of the file to that point is a factor?

kona git:(master) ✗ grep -n abcelpu scrabble-puz.k      
151: ("abcelpu"
kona git:(master) ✗ head -n 152 scrabble-puz.k| wc      
     152     608    3144
kona git:(master) ✗ head -n 154 scrabble-puz.k| wc    
     154     616    3184
kona git:(master) ✗ head -n 154 scrabble-puz.k| tail -n4
 ("abcelpu"
  156 88 80 128 163 106 81)
 ("abcelpy"
  172 84 75 131 157 102 79)
kona git:(master) ✗ 

Probably more relevant is the size of the produced file.

kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  `"pxs76" 1: puz[!76]
  pxd: 1: `"pxs76"
  `"pxs77" 1: puz[!77]
  pxd: 1: `"pxs77"

rlwrap: warning: k crashed, killed by SIGSEGV.
rlwrap itself has not crashed, but for transparency,
it will now kill itself (without dumping core) with the same signal

warnings can be silenced by the --no-warnings (-n) option
[1]    2430 segmentation fault  rlwrap ./k
kona git:(master) ✗ wc pxs7*
       0      12   12192 pxs76.l
       0      12   12352 pxs77.l
       0      24   24544 total
kona git:(master) ✗ 
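
One way to look at these sizes is against a page boundary. The sketch below is Python and assumes 4096-byte pages. The thread never states the page size and some platforms use larger pages, so treat the result as a hypothesis to test, not a finding.

```python
import math

PAGE = 4096  # assumption: 4 KiB pages; not stated anywhere in the thread

# Byte counts reported by `wc` above.
sizes = {"pxs76.l": 12192, "pxs77.l": 12352}

# Number of whole pages needed to hold each file.
pages = {name: math.ceil(size / PAGE) for name, size in sizes.items()}

# Bytes added by the 77th element.
bytes_per_element = sizes["pxs77.l"] - sizes["pxs76.l"]

for name, size in sizes.items():
    print(name, size, "bytes,", pages[name], "pages")
print("bytes added by the 77th element:", bytes_per_element)
```

With that page size, the 76-element file fits in three pages (12288 bytes), while the 77-element file is the first to spill into a fourth. The extra element accounts for 160 bytes.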

gitonthescene commented 2 years ago

There's not much surprising in the difference between the output files so I'd guess the issue is not on the serialization side.

kona git:(master) ✗ hexdump pxs76.l > 76.hex
kona git:(master) ✗ hexdump pxs77.l > 77.hex
kona git:(master) ✗ diff -u 7{6,7}.hex     
--- 76.hex  2022-01-09 16:25:08.000000000 +0900
+++ 77.hex  2022-01-09 16:25:11.000000000 +0900
@@ -1,5 +1,5 @@
 0000000 fd ff ff ff ff ff ff ff 01 00 00 00 00 00 00 00
-0000010 00 00 00 00 00 00 00 00 4c 00 00 00 00 00 00 00
+0000010 00 00 00 00 00 00 00 00 4d 00 00 00 00 00 00 00
 0000020 fd ff ff ff ff ff ff ff 01 00 00 00 00 00 00 00
 0000030 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
 0000040 fd ff ff ff ff ff ff ff 01 00 00 00 00 00 00 00
@@ -760,4 +760,14 @@
 0002f70 58 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00
 0002f80 80 00 00 00 00 00 00 00 a3 00 00 00 00 00 00 00
 0002f90 6a 00 00 00 00 00 00 00 51 00 00 00 00 00 00 00
-0002fa0
+0002fa0 fd ff ff ff ff ff ff ff 01 00 00 00 00 00 00 00
+0002fb0 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
+0002fc0 fd ff ff ff ff ff ff ff 01 00 00 00 00 00 00 00
+0002fd0 fd ff ff ff ff ff ff ff 07 00 00 00 00 00 00 00
+0002fe0 61 62 63 65 6c 70 79 00 fd ff ff ff ff ff ff ff
+0002ff0 01 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff
+0003000 07 00 00 00 00 00 00 00 ac 00 00 00 00 00 00 00
+0003010 54 00 00 00 00 00 00 00 4b 00 00 00 00 00 00 00
+0003020 83 00 00 00 00 00 00 00 9d 00 00 00 00 00 00 00
+0003030 66 00 00 00 00 00 00 00 4f 00 00 00 00 00 00 00
+0003040
kona git:(master) ✗ 
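
The added bytes can be decoded as little-endian 64-bit words to see where the "abcelpy" integers sit in the file. The Python sketch below does that, then asks which elements would have the word just before their integers land exactly on a 4096-byte boundary. It rests on three assumptions that the thread does not confirm: 4 KiB pages, every element serializing to the same 160 bytes as this one, and the word before the integers (value 7) being the vector length.

```python
import struct

# Bytes that pxs77.l adds over pxs76.l, copied from the "+" lines of the diff.
ADDED_AT = 0x2FA0
added_hex = """
fd ff ff ff ff ff ff ff 01 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
fd ff ff ff ff ff ff ff 01 00 00 00 00 00 00 00
fd ff ff ff ff ff ff ff 07 00 00 00 00 00 00 00
61 62 63 65 6c 70 79 00 fd ff ff ff ff ff ff ff
01 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff
07 00 00 00 00 00 00 00 ac 00 00 00 00 00 00 00
54 00 00 00 00 00 00 00 4b 00 00 00 00 00 00 00
83 00 00 00 00 00 00 00 9d 00 00 00 00 00 00 00
66 00 00 00 00 00 00 00 4f 00 00 00 00 00 00 00
"""
data = bytes.fromhex("".join(added_hex.split()))
words = struct.unpack("<%dq" % (len(data) // 8), data)  # little-endian int64 words

ints = list(words[-7:])            # the seven integers stored for "abcelpy"
count_index = len(words) - 8       # the word just before them (value 7)
count_offset = ADDED_AT + 8 * count_index

# Assumptions (see the text above): page size, uniform element size.
PAGE = 4096
ELEM = len(data)                   # 160 bytes per element
HEADER = ADDED_AT - 76 * ELEM      # bytes before element 0
N = 8473                           # #puz from the sessions above

# Elements whose pre-integer word starts exactly on a page boundary.
on_boundary = [i for i in range(N)
               if (HEADER + ELEM * i + 8 * count_index) % PAGE == 0]

print(ints, hex(count_offset))
print(len(on_boundary), on_boundary[:3])
```

Under those assumptions the integers decode correctly (172 84 75 131 157 102 79), so the file contents look fine. The word before them starts at offset 0x3000, with the preceding -1 word in the last eight bytes of the previous page. The same alignment recurs for 66 elements spaced 128 apart (76, 204, 332, ...), which lines up with the 66 corrupted rows, 256 printed lines apart, found by grep earlier in the thread. That is consistent with a problem on the reading side near page boundaries, but it does not identify the faulty code.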

tavmem commented 2 years ago

Not a problem ... this issue is reopened. Thanks for pointing it out. You can "reopen" an issue in the new comment area on a closed issue. I agree with you ... it does feel like a memory issue.

I'll be tied up most of the day with family activities. Thanks for continuing to look into this!

tavmem commented 2 years ago

To replicate the error in https://github.com/tavmem/ks, I

Then, in ks:

\l sp.k
`"pxs" 1: puz
pxd: 1: `"pxs"

The final command is parsed and then executed, and the segfault occurs during execution. The output from the beginning of the execution step is:

    ~BR ex(zz)      K ex(K a) <- I line(FILE*f, S*a, I*n, PDA*p)      BEG ex 
    sd_(a,2):     0x7f2044ea5000 0x7f2044ea5018            1-7 7 0   
         a0:    0x7f2044ea5018     .k
         a1:    0x7f2044ea5020     (nil)
         a2:    0x7f2044ea5028     0x7f2044eaab00 0x7f2044eaab18            1-6 -4 5   `"@��D " 0x3c 
    0x3f `"���D " (nil)  
         .2a[0]: 0x7f2044b5a9a0     0x7f2044eaa040 0x7f2044eaa058            9-6 6 0   
         .2a[1]: 0x3c
         .2a[2]: 0x3f
         .2a[3]: 0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
         a3:    0x7f2044ea5030     0x7f2044eaa680 0x7f2044eaa698            1-6 5 1   
    .,(`;`pxs;)
     0x7f2044eaa698     0x7f2044eaa900 0x7f2044eaa918            1-6 0 3   
    (`;`pxs;)
     0x7f2044eaa928     0x7f2044eaaa00 0x7f2044eaaa18 0x5296e0  1-6 4 1   `
     0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
     0x7f2044eaa918     0x7f2044eaa040 0x7f2044eaa058            9-6 6 0   
         a4:    0x7f2044ea5038     0x7f2044eaa640 0x7f2044eaa658            1-6 5 0   
    .()
         a5:    0x7f2044ea5040     
         a6:    0x7f2044ea5048     
         a7:    0x7f2044ea5050     

    exA 1 *****************************************************************************

      ~AA ex_(&a,0)      V ex_(V a, I r) <- K ex(K a)    BEG ex_
         r:0
        ~AB ex0(kW(x),y,r)      K ex0(V*v,K k,I r) <- V ex_(V a, I r)      BEG ex0
             r: 0     sd(k):     
             ex0 v[0]: 0x7f2044b5a9a0     0x7f2044eaa040 0x7f2044eaa058            9-6 6 0   
             ex0 v[1]: 0x3c
             ex0 v[2]: 0x3f
             ex0 v[3]: 0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
             z==0 (no Asd(z))
              cd -- !x
          ~AC ex1(v+1+i,0,&i,n,1)      K ex1(V*w, K k, I *i, I n, I f) <- K ex0(V*v, K k, I r)      i: -1     BEG ex1
               i: 0x7ffe7c032c98       n: 4      f: 1     sd(k):     
               ex1 w[0]: 0x7f2044b5a9a0     0x7f2044eaa040 0x7f2044eaa058            9-6 6 0   
               ex1 w[1]: 0x3c
               ex1 w[2]: 0x3f
               ex1 w[3]: 0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
            ~AD ex2(w,k)      K ex2(V*v, K k) <- K ex1(V*w, K k, I *i, I n, I f)      BEG ex2
                 sd(k):     
                 ex2 v[0]: 0x7f2044b5a9a0     0x7f2044eaa040 0x7f2044eaa058            9-6 6 0   
                 ex2 v[1]: 0x3c
                 ex2 v[2]: 0x3f
                 ex2 v[3]: 0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
            BEG newK
              ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 24   r: 0x7ffe7c032b78
                ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 24   r: 6
                  ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                      *L: 0x7f2044eaa940   mUsed: 9536.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaaa40   done
                  #FI kallocI :: unpool(r)
                #FH kalloc :: kallocI(k,*r)
              #FG newK :: kalloc(k,&r)
            BEG Kv
              ~DN newK(7,TYPE_SEVEN_SIZE)   K newK(I t, I n) <- K Kv()      BEG newK
                ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 88   r: 0x7ffe7c032b68
                  ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 88   r: 7
                    ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 7
                        *L: 0x7f2044ea5080   mUsed: 9600.000000   k: 128   *z=0;   r: 7   *z: 0x7f2044ea5100   done
                    #FI kallocI :: unpool(r)
                  #FH kalloc :: kallocI(k,*r)
                #FG newK :: kalloc(k,&r)
              #DN Kv :: newK(7,TYPE_SEVEN_SIZE)
              ~DO Kd()   K Kd() <- K Kv()      BEG newK
                ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 24   r: 0x7ffe7c032b68
                  ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 24   r: 6
                    ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                        *L: 0x7f2044eaaa40   mUsed: 9728.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaa9c0   done
                    #FI kallocI :: unpool(r)
                  #FH kalloc :: kallocI(k,*r)
                #FG newK :: kalloc(k,&r)
              #DO Kv :: Kd()  --  PARAMS
              ~DP Kd()   K Kd() <- K Kv()      BEG newK
                ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 24   r: 0x7ffe7c032b68
                  ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 24   r: 6
                    ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                        *L: 0x7f2044eaa9c0   mUsed: 9792.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaaa80   done
                    #FI kallocI :: unpool(r)
                  #FH kalloc :: kallocI(k,*r)
                #FG newK :: kalloc(k,&r)
              #DP Kv :: Kd()  --  LOCALS
            BEG newK
              ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032b78
                ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                  ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                      *L: 0x7f2044eaaa80   mUsed: 9856.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaaac0   done
                  #FI kallocI :: unpool(r)
                #FH kalloc :: kallocI(k,*r)
              #FG newK :: kalloc(k,&r)
              ~AE ex1(v+(offsetColon==v[1]?2:3),k,0,0,1)      K ex1(V*w,K k,I*i,I n,I f) <- K ex2(V*v, K k)      BEG ex1
                   i: (nil)       n: 0      f: 1     sd(k):     
                   ex1 w[0]: 0x3f
                   ex1 w[1]: 0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
                ~AD ex2(w,k)      K ex2(V*v, K k) <- K ex1(V*w, K k, I *i, I n, I f)      BEG ex2
                     sd(k):     
                     ex2 v[0]: 0x3f
                     ex2 v[1]: 0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
                vn case before RESET, if prnt:
                  ~AI ex2(v+1+i,k)      K ex2(V*v, K k) <- K ex2(V*v, K k)      k: is 0      BEG ex2
                       sd(k):     
                       ex2 v[0]: 0x7f2044eaa920     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
                    ~AJ ex_(*v,1)      V ex_(V a, I r) <- K ex2(V*v, K k)      BEG ex_
                       r:1
                       sd(x=*(K*)a):       0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  1-6 4 1   `pxs
                        R ci(x)
                      BEG ci     END ci  0x7ffe7c032a48     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  2-6 4 1   `pxs
                    #AJ ex2 :: ex_(*v,1)
                     AJ:     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  2-6 4 1   `pxs
                  #AI ex2 :: ex2(v+1+i,k)
                   AI:     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  2-6 4 1   `pxs
                  ~AK ex_(*v,1)      V ex_(V a, I r) <- K ex2(V*v, K k)      BEG ex_
                     r:1
                      R a: 0x3f
                  #AK ex2 :: ex_(*v,1)
                   AK: 0x3f
                  ~AL dv_ex(0,v+i,t2)      K dv_ex(K a, V *p, K b) <- K ex2(V*v, K k)      BEG dv_ex
                      sd(a):     
                      sd(b):     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  2-6 4 1   `pxs
                      dvx p[0]: 0x3f
                  output limited to 1
                  BEG newK
                    ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 32   r: 0x7ffe7c0329f8
                      ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 32   r: 6
                        ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                            *L: 0x7f2044eaaac0   mUsed: 9920.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaa980   done
                        #FI kallocI :: unpool(r)
                      #FH kalloc :: kallocI(k,*r)
                    #FG newK :: kalloc(k,&r)

                  sd_(prnt,2):     

                    ~AM vf_ex(*p,g)      K vf_ex(V q, K g) <- K dv_ex(K a, V *p, K b)      BEG vf_ex
                       q:         0x3f
                       sd(g):     0x7f2044eaaac0 0x7f2044eaaad8            1-6 0 1   
                    ,`pxs
                      BEG ci   
                      BEG ci     END ci  0x7ffe7c0328f8     0x7f2044eaa8c0 0x7f2044eaa8d8 0x533420  3-6 4 1   `pxs
                                 END ci  0x7ffe7c032928     0x7f2044eaaac0 0x7f2044eaaad8            2-6 0 1   
                    ,`pxs
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 232   r: 0x7ffe7c0327e8
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 232   r: 8
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 8
                              *L: 0x7f2044e9c000   mUsed: 9984.000000   k: 256   *z=0;   r: 8   *z: 0x7f2044e9c200   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaa980   mUsed: 10240.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaabc0   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaabc0   mUsed: 10664.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaad80   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaad80   mUsed: 11408.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaab40   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaab40   mUsed: 12472.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaab80   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaab80   mUsed: 13856.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaac80   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaac80   mUsed: 15560.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaacc0   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaacc0   mUsed: 17584.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaac40   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaac40   mUsed: 19928.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaad00   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaad00   mUsed: 22592.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaad40   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaad40   mUsed: 25576.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaac00   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaac00   mUsed: 28880.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaae40   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaae40   mUsed: 32504.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044e9e000   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044e9e000   mUsed: 36448.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaadc0   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaadc0   mUsed: 40712.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaae00   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaae00   mUsed: 45296.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaaf00   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaaf00   mUsed: 50200.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaaf40   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaaf40   mUsed: 55424.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaaec0   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaaec0   mUsed: 60968.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaaf80   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaaf80   mUsed: 66832.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaafc0   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaafc0   mUsed: 73016.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044eaae80   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044eaae80   mUsed: 79520.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044e9e0c0   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044e9e0c0   mUsed: 86344.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044e9e280   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044e9e280   mUsed: 93488.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044e9e040   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044e9e040   mUsed: 100952.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044e9e080   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044e9e080   mUsed: 108736.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044e9e180   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    BEG newK
                      ~FG kalloc(k,&r)      V kalloc(I k,I*r) <- K newK(I t, I n)      BEG kalloc   k: 40   r: 0x7ffe7c032788
                        ~FH kallocI(k,*r)      V kallocI(I k,I r) <- V kalloc(I k,I*r)      BEG kallocI   k: 40   r: 6
                          ~FI unpool(r)      V unpool(I r) <- V kallocI(I k,I r)      BEG unpool   r: 6
                              *L: 0x7f2044e9e180   mUsed: 116840.000000   k: 64   *z=0;   r: 6   *z: 0x7f2044e9e1c0   done
                          #FI kallocI :: unpool(r)
                        #FH kalloc :: kallocI(k,*r)
                      #FG newK :: kalloc(k,&r)
                    Segmentation fault (core dumped)
                    [tom@localhost ks]$ 

Note that before the segfault, the trace shows ~AM vf_ex(*p,g) K vf_ex(V q, K g) <- K dv_ex(K a, V *p, K b) BEG vf_ex. After that, the function newK is called 27 times (for some unknown reason), and the segfault comes after kalloc returns to newK on the 27th iteration.

gitonthescene commented 2 years ago

I'll upload the whole hex file in case it's useful. This is the version that segfaulted on my computer. I don't expect the serialization to be the issue, but you might want to compare your output with this. Mine is using 76 items from sudoku-puzzles.k and so I expect it to differ at the second line where the length of the list is included and with any extra or fewer lines depending on what you're comparing it to, much like the example I posted above.

pxs77.hex.txt

tavmem commented 2 years ago

Thanks ...

The segfault occurs when executing line 465 of src/kx.c

if(1==k && a) { z=((K(*)(K))DT[(L)q].func)(a); GC;}
tavmem commented 2 years ago

Just as verification ... if you replace that line with

  if(1==k && a)
  { O("k: %lld      ",k); sd_(a,2);
    z=((K(*)(K))DT[(L)q].func)(a);
    O("going to cleanup\n");
    GC;}

then you get:

kona      \ for help. \\ to exit.

  \l sp.k
  `"pxs" 1: puz
  pxd: 1: `"pxs"
k: 1           0x7f0eddbcd900 0x7f0eddbcd918 0x6c8420  3-6 4 1   `pxs
Segmentation fault (core dumped)
tavmem commented 2 years ago

and ... if you try the 25 element version (that does not crash):

kona      \ for help. \\ to exit.

  \l sp25.k
  `"pxs25b" 1: puz
  pxd25b: 1: `"pxs25b"
k: 1           0x7f4d0730d900 0x7f4d0730d918 0x1244400  3-6 4 1   `pxs25b
going to cleanup
  \\
tavmem commented 2 years ago

If you add a beginning and an ending debug print to the function newK in src/km.c:

K newK(I t, I n)
{ O("BEG newK      ");
  K z;
  if(n>0 && n>MAX_OBJECT_LENGTH)R ME;//coarse (ignores bytes per type). but sz can overflow
  I k=sz(t,n),r;
  U(z=kalloc(k,&r))
  //^^ relies on MAP_ANON being zero-filled for 0==t || 5==t (cd() the half-complete), 3==ABS(t) kC(z)[n]=0 (+-3 types emulate c-string)
  ic(slsz(z,r)); z->t=t; z->n=n;
  if(t==6)z->n=0;
  if(z->_c==0)z->_c=256;
  #ifdef DEBUG
  krec[kreci++]=z;
  #endif
  O("RTRN from newK\n");
  R z;
}

then the final command pxd26b: 1: `"pxs26b" displays the 27 executions of newK just before the crash:

k: 1           0x7f2f62dbb900 0x7f2f62dbb918 0x1233420  3-6 4 1   `pxs26b
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
BEG newK      RTRN from newK
Segmentation fault (core dumped)
tavmem commented 2 years ago

The next step is to check whether the current problem is caused by a regression. It does not appear that the 8473-element file ever worked correctly ... but it has gotten worse over time. (All of the below was run using Fedora. You may get different results on another OS.)

Using commit 30547b8c22573ed4156f9aeb8ae7bc5f71d650a5 made on Sep 25, 2013

tavmem commented 2 years ago

I tried the commit 611fdca7d77fd43fe1f30089cf18d8edef6b415b made on Jun 23, 2011 in both Fedora and MacOS. On Fedora, I got the same result as on the commit of Sep 25, 2013, i.e., intermittent bad records:

[tom@localhost k110623]$ rlwrap -n ./k
K Console - Enter \ for help

  \l scrabble-puz.k
  `"scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"

  #puz
8473
  #puz2
8473

  puz[25]
("abcdnos"
 219 140 114 103 151 197 212)
  puz2[25]
(
 219 140 114 103 151 197 212)

  puz[76]
("abcelpy"
 172 84 75 131 157 102 79)
  puz2[76]
("abcelpy"
 8.497929e-322 4.150151e-322 3.705492e-322 6.47226e-322 7.756831e-322 5.03947e-322 3.903119e-322)

On MacOS, I got a segfault:

k110623 % ./k
kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  #puz
8473
  `"scrabblepuz" 1: puz                
  puz2: 1: `"scrabblepuz"
zsh: segmentation fault  ./k
tavmem commented 2 years ago

On Fedora, I tried commit ce4dd972f184311561d01455fc4f55d1a6f1d9fd made on Apr 2, 2011, and also got

[tom@localhost k110402]$ rlwrap -n ./k
K Console - Enter \ for help

  \l scrabble-puz.k
  `"scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"

  #puz
8473
  #puz2
8473

  puz[25]
("abcdnos"
 219 140 114 103 151 197 212)
  puz2[25]
(
 219 140 114 103 151 197 212)

  puz[76]
("abcelpy"
 172 84 75 131 157 102 79)
  puz2[76]
("abcelpy"
 8.497929e-322 4.150151e-322 3.705492e-322 6.47226e-322 7.756831e-322 5.03947e-322 3.903119e-322)

On Fedora, I tried commit c4040e5df1c656ef659c13213ca137c8a8b8b625 made on Jan 28, 2011 and got the same results.

On Fedora, I tried commit cb10602b75102330010f783055c9d2dcff42e06a made on Dec 31, 2010 and got the same results

On Fedora, I tried the first commit ever made, 3bd9d8d0711cdb71b416c6e271dd3768cbb161ba from Aug 25, 2010:

[tom@localhost k100825a]$ gcc k.c

[tom@localhost k100825a]$ rlwrap -n ./k
K Console - Enter \ for help

  \l scrabble-puz.k
  `"scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"

  #puz
8473
  #puz2
8473

  puz[25]
("abcdnos"
 219 140 114 103 151 197 212)
  puz2[25]
(
 219 140 114 103 151 197 212)

  puz[76]
("abcelpy"
 172 84 75 131 157 102 79)
  puz2[76]
("abcelpy"
 8.497929e-322 4.150151e-322 3.705492e-322 6.47226e-322 7.756831e-322 5.03947e-322 3.903119e-322)
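
The corrupted values in puz2[76] look like the original integers read back as doubles: 172 x 2^-1074 is about 8.4979e-322, and 84 x 2^-1074 is about 4.1502e-322. That would be consistent with the payload bytes being intact while the vector is interpreted as a float vector instead of an int vector, although the thread does not confirm this. A minimal C sketch of the reinterpretation (the helper names are hypothetical and are not part of kona):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret the bit pattern of a 64-bit integer as an IEEE-754 double.
   memcpy is used instead of a pointer cast to stay within aliasing rules. */
double int_bits_as_double(int64_t i)
{
    double d;
    memcpy(&d, &i, sizeof d);
    return d;
}

/* Print each integer next to the double that shares its bit pattern. */
void show_reinterpreted(const int64_t *v, size_t n)
{
    for (size_t k = 0; k < n; k++)
        printf("%lld -> %e\n", (long long)v[k], int_bits_as_double(v[k]));
}
```

Calling show_reinterpreted on the seven integers of puz[76] (172 84 75 131 157 102 79) is one way to compare against the seven tiny floats shown for puz2[76].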

Next step: Examine the first commit further, and attempt to find the initial problem.

tavmem commented 2 years ago

This is what I found: It is a memory problem. Add these 2 lines to src/0.c in the latest kona:

[tom@localhost kona]$ git diff
diff --git a/src/0.c b/src/0.c
index 63d5280..da54906 100644
--- a/src/0.c
+++ b/src/0.c
@@ -502,6 +502,7 @@ K _1m(K x) {    //Keeps binary files mapped

   S v;
   //These mmap arguments are present in Arthur's code. WRITE+PRIVATE lets reference count be modified without affecting file
+  O("_1m    mmap(address:0,   length:%lld,   PROTECT   FLAGS  filedes:%lld   offset:0\n",s,f);
   if(MAP_FAILED==(v=mmap(0,s,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,f,0)))R SE;

   //TODO: verify that the file is valid K data. For -1,-2,-3 types (at least) you can avoid scanning the whole thing and check size
@@ -538,6 +539,7 @@ Z K _1m_r(I f,V fixed, V v,V aft,I*b) {   //File descriptor, moving * into mmap,
     length+=mod;
     offset-=mod;

+    O("_1m_r  mmap(addfress:0,   length:%lld,   PROTECT   FLAGS  filedes:%lld   offset:%lld\n",length,f,offset);
     if(MAP_FAILED==(u=mmap(0,length,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,f,offset))){R SE;}
     mMap+=length;
     mUsed+=length;if(mUsed>mMax)mMax=mUsed;

Then, in Fedora, you get

[tom@localhost kona]$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  `"scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"
_1m    mmap(address:0,   length:1355712,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:136,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:224,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:296,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:384,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:456,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:544,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:616,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:704,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:776,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:864,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:936,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1024,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1096,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1184,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1256,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1344,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1416,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1504,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1576,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1664,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1736,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1824,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1896,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:1984,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2056,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2144,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2216,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2304,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2376,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2464,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2536,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2624,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2696,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2784,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2856,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:2944,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3016,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3104,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3176,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3264,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3336,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3424,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3496,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3584,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3656,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3744,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3816,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3904,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:3976,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:4064,   PROTECT,   FLAGS,  filedes:3,   offset:0
_1m_r  mmap(addfress:0,   length:40,   PROTECT,   FLAGS,  filedes:3,   offset:4096
Segmentation fault (core dumped)

In Fedora, the PAGESIZE is 4096. The first execution of mmap in _1m sets a mapping of the complete file into the string v. Both the open filedes and the mapping v are passed to _1m_r for recursive reads. Reads are done to retrieve sub-strings alternately of length 72, 88, 72, 88, 72, 88 ... for the text and for the numbers. The last read attempted before the segfault crosses a page boundary. We get the segfault with the attempt to use the result of that read to create a K-structure.

The sub-string that is being sought has length 72. 32 of the bytes are assumed to be in the first page (4096 - 4064 = 32). The remaining 40 bytes are assumed to be in the 2nd page.

Next step: Try to figure out how to successfully do reads that cross a page boundary. (Maybe the problem could be in the serialization step that stored the file, making the subsequent cross-page reads problematic?)
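
For reference, mmap requires the file offset to be a multiple of the page size, so a record that starts in the middle of a page has to be mapped from the preceding page boundary, with the length grown by the same amount. This is the same adjustment as the length+=mod; offset-=mod; lines visible in the diff. A minimal sketch of that arithmetic, using the numbers from this case (the type and function names are hypothetical, not kona's):

```c
#include <stddef.h>

/* A window suitable for mmap: the file offset rounded down to a page
   boundary, the length grown to compensate, and the position of the
   record inside the mapped region. */
typedef struct {
    size_t map_offset;  /* multiple of page_size, to pass to mmap */
    size_t map_length;  /* bytes to map so the whole record is covered */
    size_t skip;        /* record start relative to the mapped address */
} map_window;

map_window window_for_record(size_t rec_offset, size_t rec_length,
                             size_t page_size)
{
    map_window w;
    w.skip = rec_offset % page_size;
    w.map_offset = rec_offset - w.skip;
    w.map_length = rec_length + w.skip;
    return w;
}

/* Number of bytes of the record that lie before the next page boundary. */
size_t bytes_in_first_page(size_t rec_offset, size_t rec_length,
                           size_t page_size)
{
    size_t room = page_size - rec_offset % page_size;
    return rec_length < room ? rec_length : room;
}
```

With rec_offset 4064, rec_length 72 and page_size 4096, the record has 32 bytes in the first page and 40 in the second, and a single window starting at offset 0 with length 4136 covers both parts.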

gitonthescene commented 2 years ago

Perhaps this is useful?

tavmem commented 2 years ago

Thanks ... it appears very relevant! I found this interesting:

There is a lot of code lying around, which was written twenty years ago and only ever ran on Intel. 
This code may suddenly crash in a similar fashion. 
One practical advice is to disable all possible instruction set extensions while compiling such code – 
however, even that may turn out insufficient.

It may be that the code worked fine when it was originally written (back in 2010, or earlier).

tavmem commented 2 years ago

This also appears relevant.

And here is a curious conclusion:

Conclusion:

The mmap() is a powerful system call. 
This function should not be used when there are portability issues because 
this function is only supported by the Linux environment. 
tavmem commented 2 years ago

Take a file of only 26 scrabble elements (in Fedora). Step 1 loads the script, step 2 serializes the data with 1:, and step 3 deserializes it. We get a segfault in step 3.

$rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l sp26.k
  `"scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"
Segmentation fault (core dumped)
$

However, we don't know whether the result of step 2 is correct (or not). The Jan 14 comment demonstrates that the mmap calls in step 3 are made by the functions _1m and _1m_r in src/0.c. What functions are used in step 2?

Making these changes to src/0.c

diff --git a/src/0.c b/src/0.c
index 63d5280..e5e3009 100644
--- a/src/0.c
+++ b/src/0.c
@@ -480,6 +480,7 @@ K _1m(K x) {    //Keeps binary files mapped
   //See 'scratch.txt' for an Arthur implementation of this

   //Largely Copy/pasted from various I/O functions
+  O("beg _lm\n");
   P(4!=xt && 3!=ABS(xt),TE)

   S m=CSK(x); //looks for .K or .L extensions first
@@ -502,6 +503,7 @@ K _1m(K x) {    //Keeps binary files mapped

   S v;
   //These mmap arguments are present in Arthur's code. WRITE+PRIVATE lets reference count be modified without affecting file
+  O("_1m    mmap(address:0,   length:%lld,   PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,  file:%lld   offset:0\n)",s,f);
   if(MAP_FAILED==(v=mmap(0,s,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,f,0)))R SE;

   //TODO: verify that the file is valid K data. For -1,-2,-3 types (at least) you can avoid scanning the whole thing and check size
@@ -513,6 +515,7 @@ K _1m(K x) {    //Keeps binary files mapped
 }

 Z K _1m_r(I f,V fixed, V v,V aft,I*b) {   //File descriptor, moving * into mmap, fixed * to last mmapped+1, bytes read
+  O("beg _lm_r\n");
   I s=aft-v; //subtle but signed not big enough to hold max difference here
   if(s < 4*sizeof(I)) R NE; // file is malformed

@@ -538,6 +541,7 @@ Z K _1m_r(I f,V fixed, V v,V aft,I*b) {   //File descriptor, moving * into mmap,
     length+=mod;
     offset-=mod;

+    O("_1m_r  mmap(addfress:0,   length:%lld,   PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,  file:%lld   offset:%lld)\n",length,f,offset);
     if(MAP_FAILED==(u=mmap(0,length,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,f,offset))){R SE;}
     mMap+=length;
     mUsed+=length;if(mUsed>mMax)mMax=mUsed;
@@ -554,6 +558,7 @@ Z K _1m_r(I f,V fixed, V v,V aft,I*b) {   //File descriptor, moving * into mmap,
 }

 K _1d(K x,K y) {
+  O("beg _ld\n");
   I t=x->t;
   if(4==t || -3==t)R _1d_write(x,y,0); //char-vector but not char-atom
   if(!t)R _1d_read(x,y);
@@ -564,6 +569,7 @@ K _1d(K x,K y) {
 Z K _1d_write(K x,K y,I dosync) {
   //Note: all file objects must be at least 4*sizeof(I) bytes...fixes bugs in K3.2, too
   //K3.2 Bug - "a"1:`a;2:"a" or 1:"a" - wsfull, tries to read sym but didn't write enough bytes?
+  O("beg _ld_write\n");
   I n=disk(y);

   //Copy-pasted from 2:
@@ -581,6 +587,7 @@ Z K _1d_write(K x,K y,I dosync) {
   P(ftruncate(f,n),SE)
   //lfop: see 0: write for possible way to do ftruncate etc. on Windows
   S v;
+  O("_1d_write  mmap(addfress:0,   length:%lld,   PROT_WRITE,MAP_SHARED,   file:%lld   offset:0)\n",n,f);
   if(MAP_FAILED==(v=mmap(0,n,PROT_WRITE,MAP_SHARED,f,0)))R SE; // should this be MAP_PRIVATE|MAP_NORESERVE ?
   I r=close(f); if(r)R FE;

@@ -593,6 +600,7 @@ Z K _1d_write(K x,K y,I dosync) {
 }

 I wrep(K x,V v,I y) {   //write representation. see rep(). y in {0,1}->{net, disk}
+  O("beg wrep\n");
   I t=xt, n=xn;
   I* w=(I*)v;


and using a scrabble file with only 2 elements:

$cat sp2.k
puz:(("abcdeht"
  176 79 106 111 184 125 143)
 ("abcdehu"
  105 73 77 102 109 57 63))
$

we get

$rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l sp2.k

  `"scrabblepuz" 1: puz
beg _ld
beg _ld_write
_1d_write  mmap(addfress:0,   length:352,   PROT_WRITE,MAP_SHARED,   file:3   offset:0)
beg wrep
beg wrep
beg wrep
beg wrep
beg wrep
beg wrep
beg wrep

  puz2: 1: `"scrabblepuz"
beg _lm
_1m    mmap(address:0,   length:352,   PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,  file:3   offset:0)
beg _lm_r
beg _lm_r
beg _lm_r
_1m_r  mmap(addfress:0,   length:136,   PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,  file:3   offset:0)
beg _lm_r
_1m_r  mmap(addfress:0,   length:224,   PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,  file:3   offset:0)
beg _lm_r
beg _lm_r
_1m_r  mmap(addfress:0,   length:296,   PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,  file:3   offset:0)
beg _lm_r
_1m_r  mmap(addfress:0,   length:384,   PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,  file:3   offset:0)

Step 2 uses the functions _1d, _1d_write, and wrep (printed as _ld and _ld_write in the debug output). Step 2 only uses 1 call to mmap to write the serialized data file with 2 elements. wrep uses memcpy. There are 7 calls to wrep. Step 3 uses 5 mmap calls to retrieve the serialized data file with 2 elements. There are 7 calls to _1m_r.
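
The counts above can be cross-checked with a little arithmetic. The sketch below assumes a layout that is inferred from the observed sizes rather than taken from the kona source: every object starts with a header of four 8-byte words, and its payload is padded to a multiple of 8 bytes. All names are hypothetical.

```c
#include <stddef.h>

enum { WORD = 8, HEADER = 4 * WORD };  /* assumed: four 8-byte header words */

/* Round a payload size up to a whole number of 8-byte words. */
size_t padded(size_t bytes)
{
    return (bytes + WORD - 1) / WORD * WORD;
}

/* Assumed on-disk size of one puzzle: pair header + string + int vector. */
size_t puzzle_bytes(size_t letters, size_t counts)
{
    size_t str  = HEADER + padded(letters + 1);  /* +1 for the trailing NUL */
    size_t ints = HEADER + counts * WORD;
    return HEADER + str + ints;
}

/* Assumed size of a file holding n seven-letter puzzles in one outer list. */
size_t file_bytes(size_t n)
{
    return HEADER + n * puzzle_bytes(7, 7);
}

/* Objects visited: the outer list, plus a pair, a string and a vector
   for each puzzle. */
size_t object_count(size_t n)
{
    return 1 + 3 * n;
}
```

Under these assumptions, 2 elements give 352 bytes and 7 objects, matching the mmap length of 352 and the 7 calls to wrep and _1m_r, and 8473 elements give 1355712 bytes, matching the length reported by the first mmap on the full file.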

tavmem commented 2 years ago

The problem is not in Step 2 ... the problem is in Step 3. I reach this conclusion in the following manner: Run the whole process twice:

Examine the output of Step 2 in each case using ghex. In both cases, the last records, which include the final element and end-of-file, are coded in hex as:

61 62 63 64 6E 6F 73 00
FD FF FF FF FF FF FF FF
01 00 00 00 00 00 00 00
FF FF FF FF FF FF FF FF
07 00 00 00 00 00 00 00
DB 00 00 00 00 00 00 00
8C 00 00 00 00 00 00 00
72 00 00 00 00 00 00 00 
67 00 00 00 00 00 00 00
97 00 00 00 00 00 00 00
C5 00 00 00 00 00 00 00
D4 00 00 00 00 00 00 00
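
Each row of this dump is one 8-byte little-endian word. Read as signed 64-bit integers, the rows after the string are -3, 1, -1, 7, 219, 140, 114, 103, 151, 197, 212, and the last seven agree with puz[25] as shown earlier. A minimal sketch of that decoding (hypothetical helper names, not kona functions):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assemble 8 bytes, least significant byte first, into a signed 64-bit
   value. The result does not depend on the endianness of the host. */
int64_t read_le64(const unsigned char *p)
{
    uint64_t u = 0;
    int64_t v;
    for (int i = 7; i >= 0; i--)
        u = (u << 8) | p[i];
    memcpy(&v, &u, sizeof v);  /* keep the two's-complement bit pattern */
    return v;
}

/* Decode n consecutive 8-byte words from buf into out. */
void decode_words(const unsigned char *buf, size_t n, int64_t *out)
{
    for (size_t k = 0; k < n; k++)
        out[k] = read_le64(buf + 8 * k);
}
```

Running decode_words over the bytes of a serialized file gives a quick way to compare two dumps word by word instead of byte by byte.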
gitonthescene commented 2 years ago

That matches my intuition since the binary dumps of both the crashing and non-crashing serializations seem to have the exact same form.

tavmem commented 2 years ago

If you make this change to show what is captured by mmap:

$git diff
diff --git a/src/0.c b/src/0.c
index 63d5280..7b375e2 100644
--- a/src/0.c
+++ b/src/0.c
@@ -539,6 +539,7 @@ Z K _1m_r(I f,V fixed, V v,V aft,I*b) {   //File descriptor, moving * into mmap,
     offset-=mod;

     if(MAP_FAILED==(u=mmap(0,length,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_NORESERVE,f,offset))){R SE;}
+    O("u:%s\n",u);
     mMap+=length;
     mUsed+=length;if(mUsed>mMax)mMax=mUsed;

you get

$rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  `"scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:��������
u:abcdnos

After the page change, the mmap read is incorrect.
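
Printing the mapped region with %s stops at the first zero byte and shows binary header bytes as unprintable characters, so it is hard to tell from this output what was actually mapped. A hex rendering of the first few bytes is easier to compare against the file. A minimal sketch of such a helper (hypothetical, not part of kona):

```c
#include <stddef.h>
#include <stdio.h>

/* Write the first n bytes of buf into out as space-separated hex pairs.
   out must have room for at least 3*n bytes (1 byte when n is 0).
   Returns out so the call can be used directly as a printf argument. */
char *hex_bytes(const void *buf, size_t n, char *out)
{
    const unsigned char *p = buf;
    char *w = out;
    for (size_t i = 0; i < n; i++) {
        if (i)
            *w++ = ' ';
        w += sprintf(w, "%02X", p[i]);
    }
    *w = '\0';
    return out;
}
```

In the diff above, the debug line could then print hex_bytes(u, 16, tmp) with a local char tmp[48] instead of passing u to %s directly; this assumes that at least 16 bytes are mapped at u.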

tavmem commented 2 years ago

The commit made on Mar 28, 2022 seems to fix this issue on Linux. I haven't tried it yet on Windows or on MacOS.

gitonthescene commented 2 years ago

Awesome. I don't have access to a Windows machine, but it worked for me on my Mac. I serialized the object without crashing and then reloaded it and checked that it was the same as what was saved.

kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  #puz
8473
  `"scrabblepuz" 1: puz
  puz2:1: `"scrabblepuz"
  puz2 ~ puz
1
tavmem commented 2 years ago

Thanks for checking the result on Mac!

I have a Windows machine that I haven't used for quite a while, and I can't get kona to compile:

tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ make
OS="mingw64_nt-10.0-22000"
cc -g -O3   src/main.o -o k
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/10.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: src/main.o: in function `main':
C:\msys64\home\tavme\kona/src/main.c:6: undefined reference to `kinit'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/10.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\home\tavme\kona/src/main.c:7: undefined reference to `args'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/10.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\home\tavme\kona/src/main.c:8: undefined reference to `attend'
collect2.exe: error: ld returned 1 exit status
make: *** [Makefile:120: k] Error 1

Probably something I'm forgetting to do.

tavmem commented 2 years ago

Got kona to compile on Windows.

Since I had not used the Windows machine for a while, I began by updating mingw64. As you can see from above, the updated version is "mingw64_nt-10.0-22000". However, the Makefile specified (and was looking for) "mingw_nt-10.0-18363". Once I updated the Makefile, kona compiled.

In any event ... I got a segfault.

tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  "scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"
Segmentation fault

I will try to figure out why.

tavmem commented 2 years ago

I added a line to display the pagesize on Windows:

diff --git a/src/0.c b/src/0.c
index 6ba3663..8a75d59 100644
--- a/src/0.c
+++ b/src/0.c
@@ -480,6 +480,7 @@ K _1m(K x) {    //Keeps binary files mapped
   //See 'scratch.txt' for an Arthur implementation of this

   //Largely Copy/pasted from various I/O functions
+  O("PG:%lld\n",PG);
   P(4!=xt && 3!=ABS(xt),TE)

   S m=CSK(x); //looks for .K or .L extensions first

Somewhat surprisingly, the pagesize is the same as on Linux (4096), but the failure happens on element 102:

tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  "scrabblepuz" 1: puz[!101]
  puz2: 1: `"scrabblepuz"
PG:4096
tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  "scrabblepuz" 1: puz[!102]
  puz2: 1: `"scrabblepuz"
Segmentation fault

It's also a bit surprising that the pagesize did not print for 102 elements. I had placed the print statement at the beginning of the function _1m, before the mmap statements.
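
Page size and mapping alignment are not always the same thing. On POSIX systems, mmap offsets must be multiples of the page size. The Windows file-mapping API (MapViewOfFile) instead requires offsets to be multiples of the allocation granularity, which is commonly 64 KiB even when the page size is 4096. Whether kona's Windows build is affected by this is not established in this thread. The sketch below only shows how to query the value, and the function names are hypothetical.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Smallest alignment the OS accepts for a file-mapping offset.
   POSIX: the page size. Windows: the allocation granularity.
   Returns 0 if the value cannot be determined. */
size_t mapping_granularity(void)
{
#ifdef _WIN32
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return (size_t)si.dwAllocationGranularity;
#else
    long pg = sysconf(_SC_PAGESIZE);
    return pg > 0 ? (size_t)pg : 0;
#endif
}

/* True when x is a non-zero power of two. */
int is_power_of_two(size_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```

Printing mapping_granularity() next to PG on each platform would show whether the two values differ there.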

tavmem commented 2 years ago

More strange stuff in Windows

tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l sp102.k
  "scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"
Segmentation fault

tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  puz2: 1: `"scrabblepuz"
  puz2[97+!5]
(("abcfikt"
  80 36 37 23 57 42 56)
 ("abcfilo"
  125 85 93 62 82 134 113)
 ("abcfiot"
  86 62 75 33 64 87 84)
 ("abcfirs"
  213 118 111 65 148 161 185)
 ("abcfkly"
  ()))

sp102.k is the scrabble-puz file with only 102 elements. The 3 steps result in a segfault ... as expected. But ... if you immediately restart kona in Windows and try step 3 ... it appears to work. However ... element 102 is corrupted. Did the corruption occur in step 2 or in step 3?

gitonthescene commented 2 years ago

My guess is that the serialized output should match regardless of which platform it was produced on. Comparing the output generated across platforms should give some insight.

tavmem commented 2 years ago

I agree ... I will probably try that next when I get back to this. I tried a couple of other things first, which didn't work out, and were frustrating.

  1. Windows PowerShell is supposed to have the capability of viewing hex files. When I tried it on the serialized data file, it reported that only the first 2 records were uncorrupted, and that the remaining 100 records were null, which doesn't seem correct.
  2. Mingw64 supposedly allows you to install ghex. I tried. There were some "unverified" files that get delivered in the download, and Mingw64 refused to install ghex. I tried twice.
tavmem commented 2 years ago

The files differ:

$ cmp scrabblepuz.K spWin.l 
scrabblepuz.K spWin.l differ: byte 25, line 1

scrabblepuz.K is the serialized file created on Fedora using 102 elements. spWin.l is the serialized file created on Windows using 102 elements.

(Surprising that the diff occurs so early.)
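
Note that cmp numbers bytes starting from 1, so "byte 25" corresponds to offset 24 (octal 30) in a hex editor that counts from 0. A minimal sketch of the same comparison over two in-memory buffers (a hypothetical helper, shown only to make the numbering explicit):

```c
#include <stddef.h>

/* Return the 1-based position of the first differing byte, the way cmp
   reports it, or 0 when the first n bytes of a and b are identical. */
size_t first_diff(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return i + 1;
    return 0;
}
```

Subtracting 1 from the result gives the zero-based offset to look up in a hex dump.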

gitonthescene commented 2 years ago

Rats. That’s not necessarily the problem but it’s one more thing to understand.

tavmem commented 2 years ago

No ... it looks like there is a problem in step 2 on Windows.

I installed the HxD hex editor on Windows. It gave me the same result as Windows PowerShell. The hex dump of step 2 on Windows only displays the first 2 elements. All bytes past byte 410 (octal) get displayed as 00.

The first byte that differs is indeed byte 25 (decimal, counting from 1 as cmp does), i.e. offset 24 (decimal) or 30 (octal).

             WINDOWS                               LINUX
offset(o) 00 01 02 03 04 05 06 07  decoded      00 01 02 03 04 05 06 07  decoded
00000000  FD FF FF FF FF FF FF FF  ^^^^^^^^     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000010  01 00 00 00 00 00 00 00  --------     01 00 00 00 00 00 00 00  --------
00000020  00 00 00 00 00 00 00 00  --------     00 00 00 00 00 00 00 00  --------
00000030  65 00 00 00 00 00 00 00  e-------     66 00 00 00 00 00 00 00  f-------
00000040  FD FF FF FF FF FF FF FF  ^^^^^^^^     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000050  01 00 00 00 00 00 00 00  --------     01 00 00 00 00 00 00 00  --------
00000060  00 00 00 00 00 00 00 00  --------     00 00 00 00 00 00 00 00  --------
00000070  02 00 00 00 00 00 00 00  --------     02 00 00 00 00 00 00 00  --------
00000100  FD FF FF FF FF FF FF FF  ^^^^^^^^     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000110  06 01 00 00 00 00 00 00  --------     01 00 00 00 00 00 00 00  --------
00000120  FD FF FF FF FF FF FF FF  ^^^^^^^^     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000130  07 00 00 00 00 00 00 00  --------     07 00 00 00 00 00 00 00  --------
00000140  61 62 63 64 65 68 74 00  abcdeht-     61 62 63 64 65 68 74 00  abcdeht-
00000150  00 00 00 00 00 00 00 00  --------     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000160  07 01 00 00 00 00 00 00  --------     01 00 00 00 00 00 00 00  --------
00000170  FF FF FF FF FF FF FF FF  ^^^^^^^^     FF FF FF FF FF FF FF FF  ^^^^^^^^
00000200  07 00 00 00 00 00 00 00  --------     07 00 00 00 00 00 00 00  --------
00000210  B0 00 00 00 00 00 00 00  *-------     B0 00 00 00 00 00 00 00  *-------
00000220  4F 00 00 00 00 00 00 00  O-------     4F 00 00 00 00 00 00 00  O-------
00000230  6A 00 00 00 00 00 00 00  j-------     6A 00 00 00 00 00 00 00  j-------
00000240  6F 00 00 00 00 00 00 00  o-------     6F 00 00 00 00 00 00 00  o-------
00000250  B8 00 00 00 00 00 00 00  ,-------     B8 00 00 00 00 00 00 00  ,-------
00000260  7D 00 00 00 00 00 00 00  }-------     7D 00 00 00 00 00 00 00  }-------
00000270  87 00 00 00 00 00 00 00  --------     8F 00 00 00 00 00 00 00  --------
00000300  00 00 00 00 00 00 00 00  --------     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000310  00 00 00 00 00 00 00 00  --------     01 00 00 00 00 00 00 00  --------
00000320  00 00 00 00 00 00 00 00  --------     00 00 00 00 00 00 00 00  --------
00000330  00 00 00 00 00 00 00 00  --------     02 00 00 00 00 00 00 00  --------
00000340  00 00 00 00 00 00 00 00  --------     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000350  06 01 00 00 00 00 00 00  --------     01 00 00 00 00 00 00 00  --------
00000360  FD FF FF FF FF FF FF FF  ^^^^^^^^     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000370  07 00 00 00 00 00 00 00  --------     07 00 00 00 00 00 00 00  --------
00000400  61 62 63 64 65 68 75 00  abcdehu-     61 62 63 64 65 68 75 00  abcdehu-
00000410  00 00 00 00 00 00 00 00  --------     FD FF FF FF FF FF FF FF  ^^^^^^^^
00000420  00 00 00 00 00 00 00 00  --------     01 00 00 00 00 00 00 00  --------
00000430  00 00 00 00 00 00 00 00  --------     FF FF FF FF FF FF FF FF  ^^^^^^^^
00000440  00 00 00 00 00 00 00 00  --------     07 00 00 00 00 00 00 00  --------

It is not clear why step 3 was able to run in Windows when kona was restarted after the segfault in step 3.
tavmem commented 2 years ago

I just wanted to verify that the prior result is repeatable on Windows, so I tried it again today:

tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l sp102.k
  "scrabblepuz" 1: puz
  puz2: 1: `"scrabblepuz"
Segmentation fault

tavme@DESKTOP-FVKENU9 MINGW64 ~/kona
$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  puz2: 1: `"scrabblepuz"
  #puz2
102
  puz2[97+!5]
(("abcfikt"
  80 36 37 23 57 42 56)
 ("abcfilo"
  125 85 93 62 82 134 113)
 ("abcfiot"
  86 62 75 33 64 87 84)
 ("abcfirs"
  213 118 111 65 148 161 185)
 ("abcfkly"
  ()))

My current theory is that

I can't explain why kona can read the Windows output of step 2 after the restart, but both hex display tools (HxD and PowerShell) cannot.

BTW: I'll be away for the weekend. Probably won't be working on this till sometime next week.

tavmem commented 2 years ago

I tried it with 1 element. The result of a compare in Linux:

$ cmp s1lin.K s1win.l
s1lin.K s1win.l differ: byte 41, line 1

First of all, if the serialized data should be the same on all platforms, then why do the files end up with different extensions?

Next, as to the file content differences:

offset(d)    Linux                       Windows
00000000  FD FF FF FF FF FF FF FF     FD FF FF FF FF FF FF FF 
00000008  01 00 00 00 00 00 00 00     01 00 00 00 00 00 00 00
00000016  00 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000024  02 00 00 00 00 00 00 00     02 00 00 00 00 00 00 00 
00000032  FD FF FF FF FF FF FF FF     FD FF FF FF FF FF FF FF 
00000040  01 00 00 00 00 00 00 00     06 01 00 00 00 00 00 00 
00000048  FD FF FF FF FF FF FF FF     FD FF FF FF FF FF FF FF
00000056  07 00 00 00 00 00 00 00     07 00 00 00 00 00 00 00
00000064  61 62 63 64 65 68 74 00     61 62 63 64 65 68 74 00
00000072  FD FF FF FF FF FF FF FF     00 00 00 00 00 00 00 00 
00000080  01 00 00 00 00 00 00 00     07 01 00 00 00 00 00 00
00000088  FF FF FF FF FF FF FF FF     FF FF FF FF FF FF FF FF
00000096  07 00 00 00 00 00 00 00     07 00 00 00 00 00 00 00
00000104  B0 00 00 00 00 00 00 00     B0 00 00 00 00 00 00 00 
00000112  4F 00 00 00 00 00 00 00     4F 00 00 00 00 00 00 00
00000120  6A 00 00 00 00 00 00 00     6A 00 00 00 00 00 00 00
00000128  6F 00 00 00 00 00 00 00     6F 00 00 00 00 00 00 00
00000136  B8 00 00 00 00 00 00 00     B8 00 00 00 00 00 00 00
00000144  7D 00 00 00 00 00 00 00     7D 00 00 00 00 00 00 00
00000152  8F 00 00 00 00 00 00 00     8F 00 00 00 00 00 00 00
00000160                              00 00 00 00 00 00 00 00
00000168                              00 00 00 00 00 00 00 00
00000176                              00 00 00 00 00 00 00 00
00000184                              00 00 00 00 00 00 00 00

First note: I put the offset in decimal this time. Differences, based on this display (using HxD)
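The rows that differ in the table above can also be listed mechanically. This is a minimal sketch that transcribes both columns by hand (8 bytes per row) and compares them row by row; the helper names `rows` and `diff_words` are made up for illustration.

```python
# Hex rows transcribed from the Linux / Windows table above (8 bytes per row).
LINUX = """
FD FF FF FF FF FF FF FF
01 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00
FD FF FF FF FF FF FF FF
01 00 00 00 00 00 00 00
FD FF FF FF FF FF FF FF
07 00 00 00 00 00 00 00
61 62 63 64 65 68 74 00
FD FF FF FF FF FF FF FF
01 00 00 00 00 00 00 00
FF FF FF FF FF FF FF FF
07 00 00 00 00 00 00 00
B0 00 00 00 00 00 00 00
4F 00 00 00 00 00 00 00
6A 00 00 00 00 00 00 00
6F 00 00 00 00 00 00 00
B8 00 00 00 00 00 00 00
7D 00 00 00 00 00 00 00
8F 00 00 00 00 00 00 00
"""

WINDOWS = """
FD FF FF FF FF FF FF FF
01 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00
FD FF FF FF FF FF FF FF
06 01 00 00 00 00 00 00
FD FF FF FF FF FF FF FF
07 00 00 00 00 00 00 00
61 62 63 64 65 68 74 00
00 00 00 00 00 00 00 00
07 01 00 00 00 00 00 00
FF FF FF FF FF FF FF FF
07 00 00 00 00 00 00 00
B0 00 00 00 00 00 00 00
4F 00 00 00 00 00 00 00
6A 00 00 00 00 00 00 00
6F 00 00 00 00 00 00 00
B8 00 00 00 00 00 00 00
7D 00 00 00 00 00 00 00
8F 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
"""


def rows(text):
    """Parse one hex row per line into a list of 8-byte chunks."""
    return [bytes.fromhex(line) for line in text.strip().splitlines()]


def diff_words(a, b):
    """Return (decimal offset, row in a, row in b) for every row that differs.

    A row that exists in only one of the two dumps is reported with None
    on the side where it is missing.
    """
    out = []
    for i in range(max(len(a), len(b))):
        wa = a[i] if i < len(a) else None
        wb = b[i] if i < len(b) else None
        if wa != wb:
            out.append((i * 8, wa, wb))
    return out


if __name__ == "__main__":
    for offset, wa, wb in diff_words(rows(LINUX), rows(WINDOWS)):
        left = wa.hex(" ") if wa is not None else "(absent)"
        right = wb.hex(" ") if wb is not None else "(absent)"
        print(f"{offset:08d}  {left:<23}  {right}")
```

The first differing row is at decimal offset 40, which agrees with the `cmp` report of byte 41 (cmp counts from 1).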

tavmem commented 2 years ago

If I try using the Windows generated file in Linux (leaving the file extension unchanged as s1win.l):

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  puz2: 1: `"s1win"
domain error
>  

If I try using the Linux generated file in Windows (leaving the file extension unchanged as s1lin.K)

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  puz2: 1: `"s1lin"
domain error
>  

If I change the file extension of the Windows file to s1win.K the file loads fine in Linux. If I change the file extension of the Linux file to s1lin.l the file loads fine in Windows.

So, using the current code, the file extensions do matter … which raises the question: What file extension is created in MacOS?

FWIW: The file extension in MacOS is .l (just like Windows). In addition, the file generated in Windows (s1win.l) loads in MacOS:

  puz2: 1: "s1win"
  puz2
("abcdeht"
 176 79 106 111 184 125 143)

The file generated in Linux does not load in MacOS with the extension .K

  puz2: 1: "s1lin"
domain error

The contents of file s1lin.K do load in MacOS if the file name is changed to s1lin.l

tavmem commented 2 years ago

Using k2.8 and Linux, the file extension for the serialized data file is .l
Using k3.2 and Windows, the file extension for the serialized data file is .l

The only aberration is kona with Linux, where the file extension is .K
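Until the extensions agree, the manual rename described above can be scripted. This is only a sketch: `copy_with_extension` is a hypothetical helper, and which extension a given build expects (.K for kona on Linux, .l for the others) is taken from the observations in this thread, not from the source code.

```python
import shutil
from pathlib import Path


def copy_with_extension(src, ext):
    """Copy src to a sibling file that differs only in its extension.

    The original file is left untouched, so both names stay available.
    """
    src = Path(src)
    dst = src.with_suffix(ext)
    shutil.copyfile(src, dst)
    return dst


if __name__ == "__main__":
    # Example: make a Windows-generated file visible to kona on Linux.
    # Assumes s1win.l exists in the current directory.
    print(copy_with_extension("s1win.l", ".K"))
```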

tavmem commented 2 years ago

Some progress on exposing problems in the Windows version with a simpler case:

$ cat sp1.k
puz:(("abcdeht"
  176 79 106 111 184 125 143))

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l sp1.k
  "s1win" 1: puz
   puz2: 1: `"s1win"
  \l sp1.k
  "s1win" 1: puz
Invalid argument error
"s1win" 1: puz
        ^
>

This problem does not occur in Linux ... and seems to home in on the instability.

tavmem commented 2 years ago

This localizes at least one of the problems (in Windows) further:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  :puz: ("abcdeht"; 176 89 106 111 184 125 143)
("abcdeht"
 176 89 106 111 184 125 143)

  "s1win" 1: puz

   1: `"s1win"
("abcdeht"
 176 89 106 111 184 125 143)

   1: `"s1win"
(.()
 ())

In Windows, loading K data twice consecutively gives different results ... and the second result is clearly wrong.

Actually, it's worse than I first thought. If you restart Kona and attempt to load that same K data file, it fails. This seems to indicate that the first successful file load corrupted the file.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

   1: `"s1win"
(();())
tavmem commented 2 years ago

Documenting the nature and extent of the file corruption caused by loading the s1win.l file after saving it.

Offset(d)    after save               after 1st (good) load
00000000  FD FF FF FF FF FF FF FF     FD FF FF FF FF FF FF FF
00000008  01 00 00 00 00 00 00 00     01 00 00 00 00 00 00 00
00000016  00 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000024  02 00 00 00 00 00 00 00     02 00 00 00 00 00 00 00
00000032  FD FF FF FF FF FF FF FF     FD FF FF FF FF FF FF FF
00000040  01 00 00 00 00 00 00 00     40 06 86 00 00 00 00 00
00000048  FF FF FF FF FF FF FF FF     00 00 00 00 00 00 00 00
00000056  07 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000064  61 62 63 64 65 68 74 00     00 00 00 00 00 00 00 00
00000072  FD FF FF FF FF FF FF FF     00 00 00 00 00 00 00 00
00000080  01 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000088  FF FF FF FF FF FF FF FF     00 00 00 00 00 00 00 00
00000096  07 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000104  B0 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000112  59 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000120  6A 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000128  6F 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000136  B8 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000144  7D 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000152  8F 00 00 00 00 00 00 00     00 00 00 00 00 00 00 00
00000160                              00 00 00 00 00 00 00 00
00000168                              00 00 00 00 00 00 00 00
00000176                              00 00 00 00 00 00 00 00
00000184                              00 00 00 00 00 00 00 00
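The damage is easier to summarize if each row is read as a little-endian signed 64-bit integer. That reading is an assumption about the layout, but it is consistent with the 8-byte rows and with 176 89 106 ... appearing as B0 59 6A ... in the "after save" column. The sketch below rebuilds both columns from the table and checks how much of the file survived the first load; the helper names are made up for illustration.

```python
import struct


def pack_words(values):
    """Pack integers as consecutive little-endian signed 64-bit words."""
    return b"".join(struct.pack("<q", v) for v in values)


def unpack_words(data):
    """Inverse of pack_words: split data into signed 64-bit integers."""
    return [v for (v,) in struct.iter_unpack("<q", data)]


def common_prefix_len(a, b):
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# "after save" column: 20 rows, 160 bytes.
AFTER_SAVE = (
    pack_words([-3, 1, 0, 2, -3, 1, -1, 7])
    + b"abcdeht\x00"
    + pack_words([-3, 1, -1, 7, 176, 89, 106, 111, 184, 125, 143])
)

# "after 1st (good) load" column: 24 rows, 192 bytes.
AFTER_LOAD = pack_words([-3, 1, 0, 2, -3, 0x860640] + [0] * 18)

if __name__ == "__main__":
    print("sizes:", len(AFTER_SAVE), len(AFTER_LOAD))
    print("bytes unchanged from the start:", common_prefix_len(AFTER_SAVE, AFTER_LOAD))
    print("word at offset 40 after load:", hex(unpack_words(AFTER_LOAD)[5]))
    print("everything from offset 48 on is zero:", not any(AFTER_LOAD[48:]))
```

In other words, only the first 40 bytes survive the load; the word at offset 40 is replaced, and the string and the integer vector are zeroed.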
tavmem commented 2 years ago

Much of the available documentation on mmap highlights that it is a POSIX call and that it works fine on Linux, and MacOS, and HP-UX, and AIX, and Solaris. Trying to use it in Windows is problematic.

However, in the Windows version of k3.2 it works fine

C:\k3.2>k
K 3.2 2005-06-25 Copyright (C) 1993-2004 Kx Systems
WIN32 8CPU 3904MB desktop-fvkenu9.myfiosgateway.com 0 EVAL

  :puz: ("abcdeht"; 176 89 106 111 184 125 143)
("abcdeht"
 176 89 106 111 184 125 143)

  "s1win" 1: puz

  1: `"s1win"
("abcdeht"
 176 89 106 111 184 125 143)

  1: `"s1win"
("abcdeht"
 176 89 106 111 184 125 143)
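One way a read can end up changing a file on disk is a shared, writable mapping: anything the program writes into the mapped memory is written back to the file. Whether kona maps the file this way on Windows is not confirmed anywhere in this thread, so the following is only an illustration of the general mechanism, using Python's `mmap` module (which works on both POSIX and Windows). It overwrites one byte through a private (copy-on-write) mapping and through a shared mapping, and checks whether the file on disk changed.

```python
import mmap
import os
import tempfile


def load_and_scribble(path, access):
    """Map the file, overwrite its first byte in memory, return the on-disk bytes."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=access) as m:
            m[0:1] = b"\x00"          # in-memory modification of the mapped data
            if access == mmap.ACCESS_WRITE:
                m.flush()             # push the shared mapping back to the file
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Return, per mapping type, whether the file was left intact."""
    original = b"\xfd\xff\xff\xff\xff\xff\xff\xff"
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        intact = {}
        for name, access in (("private", mmap.ACCESS_COPY),
                             ("shared", mmap.ACCESS_WRITE)):
            with open(path, "wb") as f:
                f.write(original)
            intact[name] = load_and_scribble(path, access) == original
        return intact
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

With the private mapping the file is unchanged; with the shared mapping the in-memory write reaches the disk.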
tavmem commented 2 years ago

The problem doesn't occur all the time. It works fine on short simple cases.

 $ rlwrap -n ./k
kona      \ for help. \\ to exit.

  :puz: "abcdeht"
"abcdeht"
  "s1win" 1: puz
  1: `"s1win"
"abcdeht"
  1: `"s1win"
"abcdeht"

   :puz: 176 89 106 111 184 125 143
176 89 106 111 184 125 143
  "s1win" 1: puz
  1: `"s1win"
176 89 106 111 184 125 143
  1: `"s1win"
176 89 106 111 184 125 143
tavmem commented 2 years ago

... and on long simple structures

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  :puz: ,/  " ",' $!100
" 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99"
  "s1win" 1: puz
  1: `"s1win"
" 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99"
  1: `"s1win"
" 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99"

  :puz: !100
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
  "s1win" 1: puz
  1: `"s1win"
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
  1: `"s1win"
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99

You can use 1000 instead of 100. It still works.

tavmem commented 2 years ago

Another instability issue (this time in Linux). It should not matter how many times you serialize and unserialize.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  data: ("abcdeht"; 176 89 106 111 184 125 143)
  "file" 1: data
  data: 1: `"file"
  "file" 1: data
  data: 1: `"file"

Segmentation fault (core dumped)

In Windows you get a different error (and no core dump)

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  data: ("abcdeht"; 176 89 106 111 184 125 143)
  "file" 1: data
  data: 1: `"file"
  "file" 1: data
Invalid argument error
"file" 1: data
       ^
>
tavmem commented 2 years ago

In the case of Windows, the failure is at I f=open(e,O_RDWR|O_CREAT|O_TRUNC,07777); in the function Z K _1d_write(K x,K y,I dosync) where e = file.l. The file file.l does exist, but has been corrupted, and does not open successfully.
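To narrow down why the open fails, it helps to see the exact errno. The sketch below opens a file with the same three flags and reports the symbolic error name instead of raising. The helper name `open_for_rewrite` is hypothetical, and the default mode of 0o666 is an assumption for the sketch (the quoted call passes 07777). One possibility, not confirmed in this thread, is that the file is still mapped from the earlier 1: load, since Windows does not allow truncating a file that has a mapped section open.

```python
import errno
import os


def open_for_rewrite(path, mode=0o666):
    """Open path with O_RDWR|O_CREAT|O_TRUNC.

    Returns (fd, None) on success or (None, errno_name) on failure.
    """
    flags = os.O_RDWR | os.O_CREAT | os.O_TRUNC
    try:
        return os.open(path, flags, mode), None
    except OSError as exc:
        return None, errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    fd, err = open_for_rewrite("file.l")
    if err is None:
        os.close(fd)
        print("opened and truncated file.l")
    else:
        print("open failed with", err)
```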

tavmem commented 2 years ago

In Linux, it appears to be a memory management problem. The function unpool in src/km.c assigns space for a new K construct allocation and adds memory if needed. If we make the following change to a single line, we can observe the issue.

Z V unpool(I r)
{
  V*z;
  V*L=((V*)KP)+r;
-  I k= ((I)1)<<r;
+  I k= ((I)1)<<r; O("(V*)KP:%p   r:%lld   L:%p   k:%lld   *L:%p\n",(V*)KP,r,L,k,*L);
  if(!*L || (V)0x106==*L)
  {
    U(z=amem(k,r))
    if(k<PG)

Memory gets added if(!*L || (V)0x106==*L), but the segfault occurs when 0x1==*L. This condition is not handled.

(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:(nil)
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x7ff8ed569040
....
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x7ff8ed5695c0
(V*)KP:0x450a00   r:9   L:0x450a48   k:512   *L:(nil)
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x7ff8ed569600
(V*)KP:0x450a00   r:7   L:0x450a38   k:128   *L:(nil)
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x7ff8ed569600
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x7ff8ed569640
...
(V*)KP:0x450a00   r:7   L:0x450a38   k:128   *L:0x7ff8ed530050
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x7ff8ed569680
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x7ff8ed531028
(V*)KP:0x450a00   r:6   L:0x450a30   k:64   *L:0x1

Segmentation fault (core dumped)
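A list head of 0x1 is what you would see if a block that is still on the free list gets written to. The toy model below is not kona's allocator. It is a generic intrusive free list (each free block's first word holds the address of the next free block), with `lane_size` mirroring k = 1 << r from the quoted code. The stray write of 1 is a hypothetical stand-in for whatever overwrote the block in the real run.

```python
def lane_size(r):
    """Block size in bytes for lane r, mirroring k = 1 << r."""
    return 1 << r


class Pool:
    """Toy intrusive free list for a single lane."""

    def __init__(self, base, r, count):
        self.size = lane_size(r)
        self.mem = {}  # block address -> first word stored in that block
        addrs = [base + i * self.size for i in range(count)]
        for addr, nxt in zip(addrs, addrs[1:] + [0]):
            self.mem[addr] = nxt      # each free block points at the next one
        self.head = addrs[0]

    def alloc(self):
        block = self.head
        if block not in self.mem:
            # In C this would be a wild pointer dereference, i.e. a segfault.
            raise MemoryError(f"free-list head {block:#x} is not a valid block")
        self.head = self.mem[block]   # pop: next pointer becomes the new head
        return block

    def free(self, block):
        self.mem[block] = self.head   # push: block's first word links to old head
        self.head = block


def demo():
    pool = Pool(base=0x1000, r=6, count=4)
    a = pool.alloc()
    pool.free(a)        # block a is back on the free list
    pool.mem[a] = 1     # stale write through a pointer that is still in use
    pool.alloc()        # hands out a again; the stray 1 becomes the list head
    return pool


if __name__ == "__main__":
    pool = demo()
    print("head after stale write:", hex(pool.head))
```

In this model the allocation that exposes the problem still succeeds; it is the next request on the same lane that fails, because the head is no longer a valid address.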
tavmem commented 2 years ago

Some interesting characteristics of the problem.

$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  data: ("abcdeht"; 176 89 106 111 184 125 143)
  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"
  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"
  "fileC" 1: data
  data: 1: `"fileC"

Segmentation fault (core dumped)

And this also fails

$ rlwrap  -n ./k
kona      \ for help. \\ to exit.

  data: ("abcdeht"; 176 89 106 111 184 125 143)
  "file" 1: data
  data: 1: `"file"
  "file" 1: data
  2+2

Segmentation fault (core dumped)

The segfault is only caused by a 3-step consecutive process:

Then just about anything you do might cause a segfault. If you don't do it consecutively, you can read, write and read from the same file. Furthermore, after the segfault, all the files compare equal. There is no file corruption.

$ cmp fileA.K fileB.K
$ cmp fileB.K fileC.K
$ cmp fileA.K fileC.K
tavmem commented 2 years ago

I repeated the process of saving and then reloading the serialized data to 3 files 10 times. Then, I did a load, save, reload on a single file and got the segfault. What does this tell us? It implies that this is NOT a routine memory management issue. There is something about the consecutive load-save-reload sequence to a single file that causes the 0x1==*L condition ... and the segfault.


$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  data:("abcdeht"; 176 89 106 111 184 125 143)

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  "fileA" 1: data
  data: 1: `"fileA"
  "fileB" 1: data
  data: 1: `"fileB"
  "fileC" 1: data
  data: 1: `"fileC"

  data: 1: `"fileA"
  "fileA" 1: data
  data: 1: `"fileA"

Segmentation fault (core dumped)
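The session above is long enough that it is easier to generate than to retype. This sketch emits the same K lines: the definition of data, a number of save/load rounds over three files, then the load, save, reload sequence on the first file. The function name and its parameters are made up for illustration.

```python
def repro_script(rounds=10, files=("fileA", "fileB", "fileC")):
    """Return the K commands of the session as one newline-terminated string."""
    lines = ['data:("abcdeht"; 176 89 106 111 184 125 143)']
    for _ in range(rounds):
        for name in files:
            lines.append(f'"{name}" 1: data')       # save to the file
            lines.append(f'data: 1: `"{name}"')     # load it back
    first = files[0]
    # The load, save, reload sequence on a single file that ends the session.
    lines += [
        f'data: 1: `"{first}"',
        f'"{first}" 1: data',
        f'data: 1: `"{first}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(repro_script(), end="")
```

Assuming kona reads commands from standard input, the output can be piped into ./k, and the rounds parameter makes it easy to check whether the number of preceding save/load rounds matters.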