Closed houqp closed 4 years ago
Reduced testsuite reproducing the issue: put this into spec/1_spec.lua and spec/2_spec.lua, run busted.
for i = 1, 1000 do
loadstring([[
for j = 1, 20 do
local x = {0LL}
end
]])()
end
But this looks like a bug in LuaJIT, not busted...
Interesting, It could be a bug in LuaJIT. But we have been running the same code on many real devices for a long time any never ran into this issue. It only happens quiet frequently when running with busted.
But yeah, if it works for lua5.1 and assuming busted is not using any special magic to run those tests, then I would agree it's more likely a bug in LuaJIT.
we have seen this as well on Travis and CircleCI, where tests crashed at the end (busted with LuaJit). Probably warrants a bug report...
Sometimes the tests even crash during the execution. But this happens way less frequent.
:+1: I am experiencing this issue as well. In my case it occurs maybe 20% of the time.
did someone report this on the LuaJit mailing list?
I can report it to the list. @Tieske , just to confirm, busted is not doing multi-threading right?
No, it isn't. Please do report, otherwise I will, just wanted to make sure we don't report it multiple times.
I have sent the report to the mailing list.
:+1:
For the record, the reported issue; http://www.freelists.org/post/luajit/random-segfaults-when-running-tests-with-busted
Actually, given that I can reproduce the issue with busted 2.0.rc7 but not 2.0.rc6 this could be an issue in lua-term, a C dependency added in rc7. I'll bisect for exact commit introducing the issue.
Nevermind, commenting out lua-term import doesn't fix the problem, bisect shows that segfault is introduced in @b3993e6.
@mpeterv what version of luajit are you using? EDIT: nvm, I was able to reproduce with the latest LuaJIT.
@houqp I reproduced the issue using busted with testcase I posted above with luajit 2.0.0, 2.0.4, head of master branch, head of v2.1 branch. I've also reduced the testcase (the whole blob with busted, luarocks, penlight and other dependencies) to one file. I'll finish some more testing with different compilation flags and post it to luajit list.
So, as expected, this is a duplicate of #369. Some workarounds are posted there.
Hm... I think the problem now is the discussion in #369 is not documented. @Tieske is it possible to get this documented on the official documentation page?
For us, the workaround does not work. Most of our modules requires ffi, and it's not easier to put all of them into a helper.lua script. It's probably easier for neovim since their application code does not depend on luajit ffi. I guess a cleaner way to fix it is to run each test in its own process...
@houqp it should be enough to require just the FFI module in helper.lua, ignoring any other modules depending on it.
@mpeterv I tried that, but got a bunch of redefined errors in the cdef call (link), which is imported by some of the tests.
Error message:
Error → ffi/util.lua @ 54
suite ./spec/front/unit/nickel_conf_spec.lua
ffi/util.lua:54: attempt to redefine '_FILETIME' at line 11
stack traceback:
ffi/util.lua:54: in main chunk
@houqp so busted drops and reloads that module, too. So, yes, you'll need to add all modules that can't be safely reloaded in helper.lua.
That's why I said this workaround is not practical for us ;) Basically all our modules depends on FFI, that means we will have to put all of them in helper.lua...
As a workaround, I've had good results running busted with the option --no-auto-insulate
, e.g.
busted --lua=luajit --no-auto-insulate tests
This does sacrifice the test insulation feature of busted, but it seems to prevent the crash caused by the ffi
module being unanchored between tests.
@vsergeev thanks! I can also confirm adding --no-auto-insulate
fixes all our random segfaults :)
I suggest we add this to the documentation.
Alongside the documentation, I suggest we add a list of default preloaded libraries. We add common things that don't like to be reloaded, like ffi, by default, and allow users to add modules either in helper.lua or at the command line.
yep, that will work too, as long as the behavior is documented :)
for reference, a possible fix: https://github.com/Olivine-Labs/luassert/pull/150
closing this, reloading ffi has been resolved
Hi,
We have been having this issue for awhile. 90% of the time, it segfaults at the end of the test, inside luajit's lua_close function (most of the time it dies at
lj_gc_freeall
). It kind smells like a race condition. Does busted do multi-threading at all?