lunarmodules / luasocket

Network support for the Lua language
http://lunarmodules.github.io/luasocket/
MIT License
1.86k stars 633 forks source link

Segfault in timeout_markstart #179

Open TheCycoONE opened 8 years ago

TheCycoONE commented 8 years ago

https://github.com/CorsixTH/CorsixTH/issues/1091 was posted in CorsixTH today that looks to originate in luasocket (as packaged for Debian, awaiting confirmation on the version).

CorsixTH Code:

https://github.com/CorsixTH/CorsixTH/blob/b6fc9f9e0c6893720200088d21da4e2e14f0d319/CorsixTH/Lua/app.lua#L1448

Stacktrace:

#0  0x00007fffd40e01e1 in timeout_markstart () from /usr/lib/x86_64-linux-gnu/lua/5.2/socket/core.so
#1  0x00007fffd40e0599 in buffer_meth_send () from /usr/lib/x86_64-linux-gnu/lua/5.2/socket/core.so
#2  0x00007ffff7688c4d in luaD_precall (L=L@entry=0x745b380, func=<optimized out>, func@entry=0x7392af0, nresults=nresults@entry=-1) at ldo.c:319
#3  0x00007ffff769483d in luaV_execute (L=L@entry=0x745b380) at lvm.c:709
#4  0x00007ffff7688fc0 in luaD_call (L=L@entry=0x745b380, func=<optimized out>, nResults=nResults@entry=-1, allowyield=allowyield@entry=1) at ldo.c:402
#5  0x00007ffff768516d in lua_pcallk (L=0x745b380, nargs=<optimized out>, nresults=-1, errfunc=<optimized out>, ctx=<optimized out>, k=<optimized out>) at lapi.c:962
#6  0x00007fffd40e479f in ?? () from /usr/lib/x86_64-linux-gnu/lua/5.2/socket/core.so
#7  0x00007ffff7688c4d in luaD_precall (L=L@entry=0x745b380, func=<optimized out>, func@entry=0x73929d0, nresults=nresults@entry=3) at ldo.c:319
#8  0x00007ffff769483d in luaV_execute (L=L@entry=0x745b380) at lvm.c:709
#9  0x00007ffff7688e40 in unroll (L=0x745b380, ud=<optimized out>) at ldo.c:442
#10 0x00007ffff76885cf in luaD_rawrunprotected (L=L@entry=0x745b380, f=f@entry=0x7ffff7688e60 <resume>, ud=0x2258060) at ldo.c:131
#11 0x00007ffff7689027 in lua_resume (L=0x745b380, from=<optimized out>, nargs=<optimized out>) at ldo.c:543
#12 0x0000000000451760 in luaT_resume (L=0x745b380, f=0x745b380, n=1) at CorsixTH/CorsixTH/Src/th_lua.h:114
#13 0x0000000000451103 in l_mainloop (L=0x700db0) at CorsixTH/CorsixTH/Src/sdl_core.cpp:260
#14 0x00007ffff7688c4d in luaD_precall (L=L@entry=0x700db0, func=<optimized out>, func@entry=0x7160f50, nresults=nresults@entry=2) at ldo.c:319
#15 0x00007ffff769483d in luaV_execute (L=L@entry=0x700db0) at lvm.c:709
#16 0x00007ffff7688f8e in luaD_call (L=L@entry=0x700db0, func=<optimized out>, nResults=nResults@entry=-1, allowyield=allowyield@entry=0) at ldo.c:402
#17 0x00007ffff768502b in lua_callk (L=0x700db0, nargs=<optimized out>, nresults=-1, ctx=<optimized out>, k=<optimized out>) at lapi.c:905
#18 0x0000000000452ac4 in CorsixTH_lua_main (L=0x700db0) at CorsixTH/CorsixTH/Src/main.cpp:179
grote commented 8 years ago

The segfault occurs with this version 3.0~rc1+git+321c0c9-1 as made available here: https://packages.debian.org/sid/lua-socket

diegonehab commented 8 years ago

Can you narrow the problem down to something smaller?

TheCycoONE commented 8 years ago

How would you suggest doing that?

diegonehab commented 8 years ago

By creating a .lua file with no dependencies that I can run and that crashes. I don't know what CorsixTH is or how it uses Lua or LuaSocket.

TheCycoONE commented 8 years ago

@grote would you be able to see if you can reproduce this, copying just the code from App:checkForUpdates relevant to luasocket and running it with lua on your system?

Something like:

  local update_url = 'http://www.corsixth.com/check-for-updates'

  local success, socket = pcall(require, "socket")
  local http = require "socket.http"
  local url = require "socket.url"

  local update_body, status, headers = http.request(update_url)
  url.parse('http://corsixth.com/downloads.html')
grote commented 8 years ago

I don't know nothing about lua, but if I put this code into a file and run it with lua test.lua I get no output.

TheCycoONE commented 8 years ago

Ok, I don't know then. I guess I'll have to try out a debian vm and see if I can reproduce; and if I can, whether I can reproduce with debug builds so we can tell what's going on. CorsixTH doesn't do anything fancy with luasocket, it just downloads the table at that url, sanity checks it, and displays the result if necessary.

diegonehab commented 8 years ago

Don't you have to print something to see something? url.parse doesn't print anything. Try

local update_url = 'http://www.corsixth.com/check-for-updates'

local success, socket = pcall(require, "socket")
local http = require "socket.http"
local url = require "socket.url"

local update_body, status, headers = http.request(update_url)
print(update_body)
print(status)
for i,v in pairs(headers) do print(i,v) end
grote commented 8 years ago

No problem here:

return {
  major = 0,
  minor = 6,
  revision = 0,
  changelog_en = 'The latest version of CorsixTH features - among other things - an integrated Map Editor, custom campaigns, drug price impact, and as usual a lot of bug fixes.',
  download_url = 'http://corsixth.com/downloads.html',
}

200
etag    "a42322-11f-8255d80"
content-type    text/plain; charset=UTF-8
content-length  287
last-modified   Wed, 01 Jun 2016 15:20:38 GMT
connection      close
date    Sat, 18 Jun 2016 16:04:01 GMT
set-cookie      __cfduid=d03bf3bb0b4374095f7fbabed2d336c941466265840; expires=Sun, 18-Jun-17 16:04:00 GMT; path=/; domain=.corsixth.com; HttpOnly
server  cloudflare-nginx
accept-ranges   bytes
cf-ray  2b4ff98143484c24-GRU```
emorrp1 commented 8 years ago

I have been able to reproduce this in a Debian sid VM, with lua-socket version 3.0~rc1+git+321c0c9-1 (created 2015-08-14) which appears to have been pulled from https://github.com/diegonehab/luasocket/commit/321c0c9b1f7b6b83cd83b58e7e259f53eca69373 (created 2015-03-03). Reverting to version 3.0~rc1-3 (created 2013-08-12) pulled from https://github.com/diegonehab/luasocket/releases/tag/v3.0-rc1 (created 2013-06-04) replaces the segfault with this error message on update check:

---------------------------------------------------------------

Welcome to CorsixTH v0.60!

---------------------------------------------------------------

This window will display useful information if an error occurs.

---------------------------------------------------------------

Checking for CorsixTH updates...
An error has occurred!
Almost anything can be the cause, but the detailed information below can help the developers find the source of the error.
Running: The movie_over handler.
A stack trace is included below, and the handler has been disconnected.
/usr/share/lua/5.2/socket/http.lua:189: use of undeclared variable 'PROXY'
stack traceback:
        [C]: in function 'request'
        ./Lua/app.lua:1448: in function 'checkForUpdates'
        ./Lua/app.lua:463: in function 'loadMainMenu'
        ./Lua/app.lua:293: in function 'callback_on_destroy_movie'
        ./Lua/movie_player.lua:260: in function '_destroyMovie'
        ./Lua/movie_player.lua:229: in function 'onMovieOver'
        ./Lua/app.lua:1078: in function <./Lua/app.lua:1077>
        (...tail calls...)
        ./Lua/app.lua:901: in function <./Lua/app.lua:896>

Again, I don't know anything about lua, but I can run any tests you like.

TheCycoONE commented 8 years ago

The PROXY error is known and happens on all platforms. Could you try with latest master instead.

emorrp1 commented 8 years ago

Ok, so I tried with latest master, got segfault on CorsixTH's update check, then I bisected:

1f9ccb2b586c3a7e29db3c99a23ac1cee6907cf2 is the first bad commit
commit 1f9ccb2b586c3a7e29db3c99a23ac1cee6907cf2
Author: Pierre Chapuis <catwell@archlinux.us>
Date:   Fri Jul 5 18:00:29 2013 +0200

    http: look for PROXY in _M, not as a global

:040000 040000 6f8b5a2fb0660ef15703297b8dee22eddd54d178 8331b987f3702d4ce0053625e4b5a94549df62c7 M  src

I know it's an ancient version, but hope that helps.

diegonehab commented 8 years ago

What was the conclusion here?

TheCycoONE commented 8 years ago

I'm not sure, that's a strange bisect - we weren't looking for the PROXY error (which is already fixed here), we were looking for the segfault error.

ecrips commented 8 years ago

This seems to be caused by a conflict between the buffer_init function of libtwolame and the one provided by luasocket.

Putting a breakpoint on buffer_init I see the following backtrace:

#0  0x00007fffed3e0420 in buffer_init ()
   from /usr/lib/x86_64-linux-gnu/libtwolame.so.0
#1  0x00007fffd40ccfb5 in ?? ()
   from /usr/lib/x86_64-linux-gnu/lua/5.2/socket/core.so
#2  0x00007ffff7687c75 in ?? () from /usr/lib/x86_64-linux-gnu/liblua5.2.so.0

Here you can see luasocket calling into buffer_init from libtwolame. A simple hack to src/buffer.h makes the problem go away:

--- luasocket-3.0~rc1+git+321c0c9.orig/src/buffer.h
+++ luasocket-3.0~rc1+git+321c0c9/src/buffer.h
@@ -34,6 +34,8 @@ typedef struct t_buffer_ {
 } t_buffer;
 typedef t_buffer *p_buffer;

+#define buffer_init foo_bar_buffer_init
+
 int buffer_open(lua_State *L);
 void buffer_init(p_buffer buf, p_io io, p_timeout tm);
 int buffer_meth_send(lua_State *L, p_buffer buf);

The root cause looks like a patch that Debian has added. Dropping that patch also seems to have fixed the problem.

diegonehab commented 8 years ago

This is odd. If the library has been built correctly, the reference to bufferinit should have been resolved internally, no? Or is that function marked for export because some other dependency (like LuaSec) needs it? Either way, I could try to prefix every symbol with ls. This would break LuaSec, though.

ecrips commented 8 years ago

I'm not familiar with how LuaSec integrates with luasocket. Debian have a patch they apply which modifies various functions (including buffer_init) to have different linkage. This would allow e.g. LuaSec to use (or override) the implementation of those functions. However it looks like LuaSec contains it's own implementation of those functions so I've no idea why the patch is needed (or indeed how this works on distributions other than Debian which don't seem to include a similar patch). Prefixing the function names would reduce the possibility of conflicts but would obviously cause compatibility problems with any other code using those names, but if there is a good reason for the linkage change that would seem to be the best option.

I've commented on the Debian bug (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=822825) - hopefully the Debian maintainer will be able to shed some light on the purpose of the patch currently applied.

diegonehab commented 8 years ago

Nothing there yet, eh? :(

ewestbrook commented 5 years ago

Any news, new information, or anything to be done here?

We're reviewing open items in preparation for a release. If action is needed here, please add a comment. Otherwise, this issue will be closed on or after 24-Feb-2019.

Thanks!

TheCycoONE commented 5 years ago

I haven't heard any complaints about this in years, but I also don't use Debian and the CorsixTH dpkg disables luasocket.

As far as I'm concerned this isn't your bug anyway.

Manawyrm commented 4 years ago

I think I stepped on this bug again today :disappointed: When using lighttpd (a common webserver) with mod_magnet (basically lua CGI) this can be reproduced very easily.

This is the only code needed to trigger the bug:

http = require("socket.http")

r, c = http.request {
    url = "https://tbspace.de",
}

lighty.content = { r }
return 200

I'll go ask in the Debian Bugtracker for more info. It still happens in Debian sid at 10 Feb 2020 and in the current stable release (currently buster).

diegonehab commented 4 years ago

The solution to the problem remains the same. I don't mind that people modify LuaSocket to export internal symbols I never intended to be exported. However, if you are going to do that, please make sure the LuaSocket implementation uses the internal symbols it exported, rather than importing them from somewhere else, either by prefixing the symbols with something unique or, say, by using -Bsymbolic when linking LuaSocket.