arcapos / luapgsql

Lua binding for PostgreSQL
108 stars 24 forks source link

Support binary data #37

Open yurevich1 opened 8 years ago

yurevich1 commented 8 years ago

It will be very usefull if you would add support binary data to retrieve, for instance, number values. I found the code below useful for me.

In this case I use getvalue_binary to retrieve long long value from database. It is needed to add support for simple int, float, double values, etc. This is just a scatch. Futher more, update static int conn_execParams(lua_State *L) method. I have not a lot of time to unserstand how you feed the data to this method.

static int
conn_exec_binary(lua_State *L)
{
    PGresult **res;

    res = lua_newuserdata(L, sizeof(PGresult *));
    *res = PQexecParams(pgsql_conn(L, 1), luaL_checkstring(L, 2),
                       0,       /* one param */
                       NULL,    /* int8[] OID */
                       NULL,
                       NULL,
                       NULL,
                       1);      /* ask for binary results */
    luaL_getmetatable(L, RES_METATABLE);
    lua_setmetatable(L, -2);
    return 1;
}

use

{ "exec", conn_exec },
{ "exec_binary", conn_exec_binary },
{ "getvalue_binary", res_getvalue_binary },

instead of

{ "exec", conn_exec },

and


uint64_t
ntoh64(const uint64_t *input)
{
    uint64_t rval;
    uint8_t *data = (uint8_t *)&rval;

    data[0] = *input >> 56;
    data[1] = *input >> 48;
    data[2] = *input >> 40;
    data[3] = *input >> 32;
    data[4] = *input >> 24;
    data[5] = *input >> 16;
    data[6] = *input >> 8;
    data[7] = *input >> 0;

    return rval;
}

uint64_t
hton64(const uint64_t *input)
{
    return (ntoh64(input));
}
static int
res_getvalue_binary(lua_State *L)
{
    lua_pushnumber(L, ntoh64(((uint64_t *)
        PQgetvalue(*(PGresult **)luaL_checkudata(L, 1, RES_METATABLE),
        luaL_checkinteger(L, 2) - 1, luaL_checkinteger(L, 3) - 1))));
    return 1;
}

An example of usage:

local = sql = string.format(" SELECT d1, d2, type FROM ddates WHERE d1 < %s ORDER BY id DESC LIMIT 1 " , tostring(os.time() * 1000))
res = conn:exec_binary(sql);
local t = res:getvalue_binary(1,1);

For small amount of rows it doesn't matter to use binary of text data. But for a lot of tons of content it can give emprovment.

mbalmer commented 8 years ago

It will be very usefull if you would add support binary data to retrieve, for instance, number values. I found the code below useful for me.

I understand what you want. While your code demonstrates the idea, it is certainly not fit to be included in luapgsql in this form.

In this case I use getvalue_binary to retrieve long long value from database. It is needed to add support for simple int, float, double values, etc. This is just a scatch. Futher more, update static int conn_execParams(lua_State *L) method. I have not a lot of time to unserstand how you feed the data to this method.

static int conn_exec_binary(lua_State _L) { PGresult *_res;

res = lua_newuserdata(L, sizeof(PGresult *));
*res = PQexecParams(pgsql_conn(L, 1), luaL_checkstring(L, 2),
                   0,       /* one param */
                   NULL,    /* int8[] OID */
                   NULL,
                   NULL,
                   NULL,
                   1);      /* ask for binary results */
luaL_getmetatable(L, RES_METATABLE);
lua_setmetatable(L, -2);
return 1;

} use

{ "exec", conn_exec }, { "exec_binary", conn_exec_binary }, { "getvalue_binary", res_getvalue_binary }, instead of

{ "exec", conn_exec }, and

uint64_t ntoh64(const uint64_t input) { uint64_t rval; uint8_t data = (uint8_t *)&rval;

data[0] = *input >> 56;
data[1] = *input >> 48;
data[2] = *input >> 40;
data[3] = *input >> 32;
data[4] = *input >> 24;
data[5] = *input >> 16;
data[6] = *input >> 8;
data[7] = *input >> 0;

return rval;

}

uint64_t hton64(const uint64_t _input) { return (ntoh64(input)); } static int res_getvalue_binary(lua_State L) { lua_pushnumber(L, ntoh64(((uint64_t ) PQgetvalue(_(PGresult **)luaL_checkudata(L, 1, RES_METATABLE), luaL_checkinteger(L, 2) - 1, luaL_checkinteger(L, 3) - 1)))); return 1; } An example of usage:

local = sql = string.format(" SELECT d1, d2, type FROM ddates WHERE d1 < %s ORDER BY id DESC LIMIT 1 " , tostring(os.time())) res = conn:exec_binary(sql); local t = res:getvalue_binary(1,1); For small amount of rows it doesn't matter to use binary of text data. But for a lot of tons of content it can give emprovment.

Do you have any numbers or benchmarks that back this claim?

daurnimator commented 8 years ago

I guess the correct solution would be to use PQftype followed by the correct lua_push* function.

mbalmer commented 8 years ago

Am 17.05.2016 um 12:53 schrieb daurnimator notifications@github.com:

I guess the correct solution would be to use PQftype followed by the correct lua_push* function.

Exactly. getvalue() and __index() could check the type, and somehow execParams must accept whether to accept binary results or not. And this is the hard part...

— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/arcapos/luapgsql/issues/37#issuecomment-219684283

daurnimator commented 8 years ago

Why shouldn't it always be binary? On 17 May 2016 20:55, "Marc Balmer" notifications@github.com wrote:

Am 17.05.2016 um 12:53 schrieb daurnimator notifications@github.com:

I guess the correct solution would be to use PQftype followed by the correct lua_push* function.

Exactly. getvalue() and __index() could check the type, and somehow execParams must accept whether to accept binary results or not. And this is the hard part...

— You are receiving this because you commented. Reply to this email directly or view it on GitHub < https://github.com/arcapos/luapgsql/issues/37#issuecomment-219684283>

— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/arcapos/luapgsql/issues/37#issuecomment-219684635

mbalmer commented 8 years ago

Am 17.05.2016 um 13:08 schrieb daurnimator notifications@github.com:

Why shouldn't it always be binary?

In many cases, binary data can actually be larger, I think.

E.g. numbers with less digits than 8, small strings etc.

I first want to see some figures to see if binary results are actually significantly faster before hacking that up.

And then, execParams() would beed a change that is incompatible with previous versions if we had to support a „binary results“ flag. I’d rather avoid that, if possible.

yurevich1 commented 8 years ago

This simple benchmark

require ('tools') -- for my epochtime() milliseconds C-function
local DELAY_RISE = 60000
local DELAY_TEST = 1000
local pgsql = require ('pgsql')
local conn = pgsql.connectdb('user=muser password=pwd dbname = dbtest')
if conn:status() == pgsql.CONNECTION_OK then
    print('connection is ok')
else
    print('connection is not ok')
    print(conn:errorMessage())
end
local MAX = 10000
--CREATE TABLE dates(id bigserial NOT NULL, d1 bigint, d2 bigint, sign int);
local sql = string.format("SELECT d1 FROM dates LIMIT 1000")
local t0 = epochtime()
local res = conn:exec(sql)
local t0_0 = epochtime()
local nfields = res:nfields()
local nrows = res:ntuples()
for n=1, MAX do
    for j=1, nrows do
        for i=1, nfields do
            local val = res:getvalue(j,i)
        end
    end
end
local t1 = epochtime()
print("TEXT RESULTS ", (t1 - t0_0))

res:__gc()

local t2 = epochtime()
local resb = conn:exec_binary(sql);
local t2_t = epochtime()
local nfieldsb = resb:nfields()
local nrowsb = resb:ntuples()
for n=1, MAX do
    for j=1, nrowsb do
        for i=1, nfieldsb do
            local val = resb:getvalue_binary(j,i)
        end
    end
end
local t3 = epochtime()
print("BINARY RESULTS ", (t3 - t2_t))
resb:__gc()

local t0_s = epochtime()
for n=1, MAX do
    local res = conn:exec(sql)
    res:__gc()
end
local t1_s = epochtime()
print("TEXT REQUEST ", (t1_s - t0_s))

local t2_s = epochtime()
for n=1, MAX do
    local resb_s = conn:exec_binary(sql);
    resb_s:__gc()
end
local t3_s = epochtime()
print("BINARY REQUEST ", (t3_s - t2_s))

Gives results in milliseconds. The lesser the better:

TEXT RESULTS    1710
BINARY RESULTS  1391
TEXT REQUEST    4164
BINARY REQUEST  4064

That is, binary is 20% faster while data retrieving and no difference between text and binary request.

mbalmer commented 7 years ago

Maybe we need a control function on the connection object, e.g. conn:requestBinaryResults(boolean)?

yurevich1 commented 7 years ago

@mbalmer , may be. Imho, this is just a codding style. But I think that easy-to-read code is a code where a programmer directly calls a function names 'ShowMeBinaryResults' rather than previously set up option 'binary'.

mbalmer commented 7 years ago

I discussed this on IRC with @daurnimator, we bot think the best solution is to keep the current function signature, but slightly extend it:

The arguments to execParams and friends are normall the SQL statetement followed by the parameters. If we extend that so that the parameters can be passed as a table, then a after that table a boolean can passed as third argument, indicating binary or text.

res = db:execParams('select ... $1 ..., $2 ...', a, b)

or then new form

res = db:execParams('select ... $1 ..., $2 ...', {a, b}) -- default to text

then request binary results:

res = db:execParams('select ... $1 ..., $2 ...', {a, b}, true)

daurnimator commented 7 years ago

I'm not so sure any more.... I'm now thinking some sort of 'switch' is in order. e.g.

-- by default it returns text for compat
db:returnStyle("text")
db:returnStyle("binary") -- now always returns binary
db:returnStyle("arg") -- now 3rd arg of execParams is "text" vs "binary"
db:returnStyle(nil) -- set back to default
mbalmer commented 7 years ago

and yes, I think the binary indication should be part of the command execution functions, not a switch. and the way I proposed (table plus boolean) is easy to implement and remains compatible with the current API.