Open ethnext opened 8 years ago
@ethnext Are you using any 3rd-party Lua C modules or 3rd-party nginx C modules that are not maintained by OpenResty? It seems like a memory corruption of Lua string objects.
Also, please ensure you are using the latest OpenResty version.
we are using:
I believe everything else is lua.
however, we are on an older resty-redis version (0.20). I will update and see if this persists.
cjson seems like a potential cause, being a C extension that is applied to the get result (I haven't checked whether the string is altered before or after the cjson call).
@ethnext You know that the socket.http module from the LuaSocket library is blocking horribly in the context of OpenResty?
@ethnext It would be much easier if you can provide a minimal and standalone example that can (relatively) reliably reproduce the issue on our side :) Please see http://openresty.org/en/faq.html#how-should-i-report-a-problem
@ethnext Also, please always provide the version numbers of the related software you're using (OpenResty, nginx, ngx_lua, lua-resty-redis, operating system, etc.).
socket.http is used asynchronously in a ngx.timer call ... not sure if its blockingness is still a problem in that context. still, that shouldn't affect this current issue, should it?
resty redis is 0.20
$ resty -V
resty 0.01
nginx version: openresty/1.7.7.2
built by gcc 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
OS is Ubuntu 12.04.4 LTS
I will try to find a minimal way to reproduce, though previous attempts in that direction (making the same request that is known to fail on occasion over and over) have seemed to cause heisenbug behavior, and it just succeeds every time.
socket.http always blocks even in ngx.timer.at.
If socket.http has memory corruption bugs, it can surely lead to the problem you are seeing now.
my code looks something like
intermittently, the redis:get call will return a string in which the beginning is replaced by a substring that appears later in the string. that is, if the correct result is:

one two three four five six seven eight nine ten

the corrupted result might look like:

six seven eight nineive six seven eight nine ten

the resulting string is the same length as the correct string.

the actual result is much longer than this conceptual example. in the instance I have been able to concretely debug, the size of the result is 10078 bytes. the first 4076 bytes have been replaced by a duplicate of bytes 4084 to 8159 of the correct result. bytes 4076 to the end (10077) are correct.
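
To check whether other corrupted responses follow the same pattern, a small offline script can compare a known-good value against a corrupted one and report how long the overwritten prefix is and where in the correct value it was copied from. The following is only a sketch under assumptions: the helper names `apply_overlay` and `find_overlay` are hypothetical, and it assumes the corruption always has the shape described above (same length, overwritten prefix, intact tail).

```python
def apply_overlay(correct, dst_len, src_start):
    """Simulate the reported pattern: the first dst_len bytes are
    replaced by a copy of correct[src_start:src_start + dst_len]."""
    return correct[src_start:src_start + dst_len] + correct[dst_len:]


def find_overlay(correct, corrupted):
    """Return (prefix_len, src_start) for a corrupted value, or None
    if the two values are identical.

    prefix_len is the number of leading bytes that differ from the
    intact tail; src_start is the offset in `correct` where those
    bytes also appear, or -1 if they are not a copy of a later span.
    Works for both str and bytes.
    """
    if len(correct) != len(corrupted):
        raise ValueError("values differ in length; not the pattern described")

    # Walk back from the end while both values still agree.
    i = len(correct)
    while i > 0 and correct[i - 1] == corrupted[i - 1]:
        i -= 1
    if i == 0:
        return None

    # Look for the overwritten prefix elsewhere in the correct value.
    return i, correct.find(corrupted[:i])


# The conceptual example from this report.
good = "one two three four five six seven eight nine ten"
bad = "six seven eight nineive six seven eight nine ten"
print(find_overlay(good, bad))  # (20, 24)
```

For the conceptual example this reports a 20-character prefix copied from offset 24. Applied to the real 10078-byte value, the same check would be expected to report a 4076-byte prefix copied from offset 4084 if the description above holds for that instance.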
I am not certain how generally this description holds. I will try to gather more data from corrupted calls. the intermittent nature of this complicates things slightly.