Closed georgecao closed 7 years ago
I did another simple and isolated test, and I found that if we first execute self.shm:safe_set(key,value)
then reload the nginx sbin/nginx -s reload
, the error shows up.
This time, the content of test.lua as follow:
local ffi = require 'ffi'
local C = ffi.C
local ffi_cast = ffi.cast
local ffi_str = ffi.string
local M = {}
ffi.cdef [[
size_t strlen(const char *str);
struct peer_rec {
uint64_t reconn_delay;
uint64_t unhealthy_at;
char *data_center;
char *release_version;
};
]]
local str_const = ffi.typeof('char *')
local rec_peer_const = ffi.typeof('const struct peer_rec*')
local rec_peer_size = ffi.sizeof('struct peer_rec')
local rec_peer_cdata = ffi.new('struct peer_rec')
local shared = ngx.shared
local dict = shared.ldict
function M:set(key, reconn_delay, unhealthy_at, data_center, release_version)
-- host info
rec_peer_cdata.reconn_delay = reconn_delay
rec_peer_cdata.unhealthy_at = unhealthy_at
rec_peer_cdata.data_center = ffi_cast(str_const, data_center)
rec_peer_cdata.release_version = ffi_cast(str_const, release_version)
local ok , err = dict:safe_set(key, ffi_str(rec_peer_cdata,rec_peer_size))
if not ok then
print(err)
end
end
function M:rm(key)
dict:delete(key)
end
function M:get(key)
local rec_v, err = dict:get(key)
if err then
return nil, 'Not' .. err
elseif not rec_v then
return nil, 'Key' .. key
elseif type(rec_v) ~= 'string' or #rec_v ~= rec_peer_size then
return nil, 'Bad'
end
ngx.log(ngx.ERR, 'key', key)
local peer = ffi_cast(rec_peer_const, rec_v)
-- do return end
--local len = C.strlen(peer.data_center)
local data_center = ffi_str(peer.data_center, C.strlen(peer.data_center))
local release_version = ffi_str(peer.release_version, C.strlen(peer.release_version))
local reconn_delay = tonumber(peer.reconn_delay)
local unhealthy_at = tonumber(peer.unhealthy_at)
return data_center, release_version, reconn_delay, unhealthy_at
end
return M
Then I can reproduce this by the following code and a sbin/nginx -s reload
command :
local m = require 'test'
local key = 'the_key_does_not_matter'
local dc, rv = m:get(key)
m:set(key, 100,100,'ct','1.11')
ngx.say('{"ffi":1, "dc":"'..tostring(dc)..'"}')
Hi!
Thanks for investigating this and sorry for the lack of response. I currently don't have time to look into it but will in the coming days or weeks, sorry!
glibc-2.12-1.192.el6.x86_64 contos6.5
Gave some thought to that, and I think that we should get it of this strlen
call and instead, store the length of the string values we store into the C struct, just like the ngx_string
type in the Nginx codebase.
I'm not sure when I'll be able to do that, but would welcome a contribution for it!
local cdata = {} cdata.reconn_delay = reconn_delay cdata.unhealthy_at = unhealthy_at cdata.data_center = data_center cdata.release_version = release_version ok, err = self.shm:safe_set(_rec_key .. host, json.encode(cdata))
@thibaultcha @swordkee In my local repo, I implement a serde interface to make this configurable. Currently support ffi and plain table concat. And use table concat by default, So the problem is not solved, it's just a workaround.
return setmetatable({
shm = shared[dict_name],
dict_name = dict_name,
prepared_ids = {},
peers_opts = peers_opts,
keyspace = opts.keyspace,
default_port = opts.default_port or 9042,
contact_points = opts.contact_points or { '127.0.0.1' },
timeout_read = opts.timeout_read or 2000,
timeout_connect = opts.timeout_connect or 1000,
retry_on_timeout = opts.retry_on_timeout == nil and true or opts.retry_on_timeout,
max_schema_consensus_wait = opts.max_schema_consensus_wait or 10000,
lock_opts = lock_opts,
logging = not opts.silent,
lb_policy = opts.lb_policy
or require('resty.cassandra.policies.lb.rr').new(),
reconn_policy = opts.reconn_policy
or require('resty.cassandra.policies.reconnection.exp').new(1000, 60000),
retry_policy = opts.retry_policy
or require('resty.cassandra.policies.retry.simple').new(3),
serde = opts.serde
or require('resty.cassandra.serde.ffi').new()
}, _Cluster)
end
Use cjson is another option as @swordkee said.
And FFI struct has any advantages over table or JSON string?
the cjson is of generality
Invalid message version. Got 4 but previous messages on this connection had version 3
local function check_peer_health(self, host, coordinator_options, retry) ... local peer, err = spawn_peer(host, self.default_port, keyspace, self.peers_opts) ngx.say(peer.protocol_version) ... end peer.protocol_version=4 True version =2.18
return setmetatable({ shm = shared[dict_name], ..... init = false, protocol_version = nil or opts.protocol_version,-- Force version number .... }, _Cluster) end
function _Cluster:refresh() .... local coordinator, err = first_coordinator(self) if not coordinator then return nil, err end
if self.protocol_version ~= nil and self.protocol_version ~= coordinator.protocol_version then
coordinator.protocol_version = self.protocol_version
end
......
This should be taken care of in #97, thanks for the report!
I am using lua-cassandra 1.1.1 with openresty
And in cluster.lua file,
The code around line 103 C.strlen(peer.data_center) cause a nginx error, which says:
2017/04/05 10:07:11 [alert] 127063#0: worker process 84466 exited on signal 11 (core dumped)
I am use
do return end
narrow down to the code. And this happens every time.Then I did some test, the same way I load lua-cassandra client.
And the content of test.lua as follow:
This time nginx works fine, and log Length Type: cdata as expected.
Any help?