luvit / luv

Bare libuv bindings for lua
Apache License 2.0
823 stars 185 forks source link

segfault on new_thread and worker.queue #636

Closed Bilal2453 closed 1 year ago

Bilal2453 commented 1 year ago

The following is consistently segfaulting on my machine when used with Luajit:

local uv = require 'luv'

uv.new_thread(function() end)

uv.run()

The same code is not segfaulting with Luvit's luv although. But this is:

local uv = require'uv'

local tbl = {}
for i = 1, 25 do
  tbl[i] = i
end

uv.new_thread(function() end, unpack(tbl))

uv.run()

I am not sure if the first provided code is segfaulting because of a bad build I have of Luv or what is the deal exactly with that but if you can' reproduce the first then ignore it. The main focus is the second one:

image

This is running Luvit with that code multiple times, sometimes it works fine, most of the time it segfaults, and sometimes it prints this Error: thread arg not support type [random] at 10Uncaught Error: attempt to call a [random] value.

I tried backtracing this with no luck, the second thread is sent a segfault and its backtrace was just random when trying to use gdb.

Bilal2453 commented 1 year ago

Note that there may actually be multiple things going wrong here, the first real issue is that Luv failing to enforce the LUV_THREAD_MAXNUM_ARG limit (which is 9 in this case) possibly overflowing some struct somewhere. Also a possible race condition may be happening which could actually be causing this segfault.

erw7 commented 1 year ago

https://github.com/luvit/luv/blob/b9c9b65afb71e5654b7aeec8d6ab18b35e6a1a27/src/thread.c#L68-L70

zhaozg commented 1 year ago
    local tbl = {}
    for i = 1, 25 do
      tbl[i] = i
    end

    uv.new_thread(function(...)
      local a = {...}
      io.write(#a)
      io.write("\n")
      assert(#a == 9)
    end, unpack(tbl)):join()
    uv.run()
zhaozg commented 1 year ago

Your thread exist after main uv exit, that makes mem corrupt. use thread:join() to avoid that.

Bilal2453 commented 1 year ago

Do I understand from this that blocking the main thread is required in order to let the rest work? I am not using :join here to explicitly not block the main thread, hence using threads.

erw7 commented 1 year ago

@zhaozg Your comment may be right about the first example of https://github.com/luvit/luv/issues/636#issue-1609293745. However, the part of the code I showed has the typical out of bounds problem.

For simplicity, consider the case idx=0, LUV_THREAD_MAXNUM_ARG <= top. If we modify the code in a way that is easier to understand, the relevant code will eventually look like this

  i = 0;
  while (i <= LUV_THREAD_MAXNUM_ARG)
  {
    luv_val_t *arg = args->argv[i];
    …
    i++;
  }

Obviously the last arg refers to args->argv[LUV_THREAD_MAXNUM_ARG].

By executing the following code, you can see that an impermissible tenth argument is passed to the entry function.

local uv = require'luv'

local tbl = {}
-- for i = 1, tonumber(arg[1]) do
for i = 1, 10 do
  tbl[i] = i
end

local thread = uv.new_thread(function(...)
  -- print(#{...})
  assert(#{...} <= 9)
end, unpack(tbl))
uv.thread_join(thread)

thread = uv.new_thread({stack_size = 2048}, function(...)
  -- print(#{...})
  assert(#{...} <= 9)
end, unpack(tbl))
uv.thread_join(thread)
zhaozg commented 1 year ago

while (i <= LUV_THREAD_MAXNUM_ARG) should be <, my fault, I will fix this.

Bilal2453 commented 1 year ago

I have actually already noticed this out-of-bound access, sorry I seem to have completely forgot about it as I went deeper into the code. I am actually still not sure if this is all there is into this; at least that is what I remember making me continue digging into the code. Will test this fix once I can