openresty / luajit2

OpenResty's Branch of LuaJIT 2
https://luajit.org/luajit.html
Other
1.25k stars 202 forks source link

string.find returns random values #47

Open piotrp opened 5 years ago

piotrp commented 5 years ago

Under some circumstances string.find returns random values from memory. I managed to create a short script to reproduce this issue.

Script:

jit.opt.start("hotloop=1")

local iteration = 0
local function test(value)
  iteration = iteration + 1
  --print("a") -- printing fixes wrong behavior
  local pos_c = string.find(value, "c", 1, true)
  --assert(pos_c == 3, "pos=" .. pos_c)
  --pos_c = 3 -- use of constant fixes wrong behavior
  local value2 = string.sub(value, 1, pos_c - 1)
  local pos_b = string.find(value2, "b", 2, true)
  assert(pos_b == 2, "dot1=" .. pos_b)
end

local function test_loop()
  for _ = 1, 20 do
    test("abc")
  end
end

test("abc")

local _, err = pcall(test_loop)
print(iteration)
print(err or "ok")

Results of running this script via luajit program on various environments:

Results are somewhat random when run in OpenResty, but are 100% reproducible when run from commandline. Looks like compilation breaks something.

fsfod commented 5 years ago

This happens with plain LuaJIT as well, only -O-fold stops it out of all the optimization levels. It looks like the wrong string pointer is being used for pointer arithmetic for the second string.find\lj_str_find result thats why your getting random values.

0026 rax      p32 CALLN  lj_str_find  ([0x164cf478] [0x164e0238] +4   +1  )
0027       >  p32 NE     0026  NULL
0028 [28]     int SUB    0026  [0x164cf478]
0029 rbx      int ADD    0028  +1  
....              SNAP   #2   [ test.lua:15|0001 ---- ---- 0001 test.lua:4|"abc\000" 0029 0029 ]
0030       >  p32 HREFK  0019  "sub" @15
0031       >  fun HLOAD  0030
0033       >  fun EQ     0031  string.sub
0034       >  int ULE    0028  +4  
0035       >  int GE     0028  +0  
0036 [20]     str SNEW   [0x164cf478]  0028
0037       >  int UGE    0028  +1  
0038 rcx      p32 STRREF 0036  +1  
0039 r8       int ADD    0028  -1  
0043 rax      p32 CALLN  lj_str_find  (0038 [0x164ccb80] 0039 +1  )
0044       >  p32 NE     0043  NULL
0045 rax      int SUB    0043  [0x164cf478]
0046 rax      int ADD    0045  +1  
sergos commented 5 years ago

The problem is propagator (apparently - folder) makes a wrong transformation in 0072 substituting the newly created string pointer (0051) with original string [0x40002498] so that the difference between them apparently vary from run to run:

==== 0051 str SNEW [0x40002498] 0032 0052 tab FLOAD test.lua:7 func.env 0053 int FLOAD 0052 tab.hmask 0054 > int EQ 0053 +63 0055 p32 FLOAD 0052 tab.node 0056 > p32 HREFK 0055 "string" @59 0057 > tab HLOAD 0056 0058 int FLOAD 0057 tab.hmask 0059 > int EQ 0058 +15 0060 p32 FLOAD 0057 tab.node 0061 > p32 HREFK 0060 "find" @1 0062 > fun HLOAD 0061 0063 > fun EQ 0062 string.find 0064 int FLOAD 0051 str.len 0065 > int UGE 0064 +1 ==== 0066 p32 STRREF 0051 +1 0067 int ADD 0064 -1 0068 int FLOAD "b" str.len ==== 0072 p32 CALLN lj_str_find (0066 [0x400085b8] 0067 0068) 0073 > p32 NE 0072 NULL !!!! 0074 int SUB 0072 [0x40002498]

sergos commented 5 years ago

The culprit is LJFOLDF(kfold_strref_snew) Apparently, there should be a check that the strref (new_string, b) is the only use of the new string. And I'm too rookie to fix this at the moment :)

sergos commented 5 years ago

Copied into https://github.com/LuaJIT/LuaJIT/issues/505 Proposed fix is:


diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
index 849d7a2..e879c2a 100644
--- a/src/lj_ffrecord.c
+++ b/src/lj_ffrecord.c
@@ -949,8 +949,9 @@ static void LJ_FASTCALL recff_string_find(jit_State *J, RecordFFData *rd)
                    str->len-(MSize)start, pat->len)) {
       TRef pos;
       emitir(IRTG(IR_NE, IRT_PGC), tr, trp0);
-      pos = emitir(IRTI(IR_SUB), tr, emitir(IRT(IR_STRREF, IRT_PGC), trstr, tr0));
-      J->base[0] = emitir(IRTI(IR_ADD), pos, lj_ir_kint(J, 1));
+      /* Caveat: can't use STRREF trstr 0 here because that might be pointing into a wrong string due to folding. */
+      pos = emitir(IRTI(IR_SUB), tr, trsptr);
+      J->base[0] = emitir(IRTI(IR_ADD), pos, emitir(IRTI(IR_ADD), trstart, lj_ir_kint(J, 1)));
       J->base[1] = emitir(IRTI(IR_ADD), pos, trplen);
       rd->nres = 2;
     } else {
agentzh commented 5 years ago

Thank you all! We'll look into this.

piotrp commented 3 years ago

Seems to be fixed in current version (from OpenResty 1.19.3.1), or at least my test case passes now.