estraier / tkrzw

a set of implementations of DBM
Apache License 2.0

tkrzw_langc.h use in pure C app segfaults on calling tkrzw_dbm_open() #47

Closed l8gravely closed 5 months ago

l8gravely commented 5 months ago

Hi, I'm trying to use the C language interface to tkrzw on Debian Linux, using the packaged version for Debian 12 (Bookworm). I keep getting a segfault in my code no matter what I do. Basically, I have a function that takes a DB name, and then I call:

    struct db *db;
    int compress = 0;

    db->hdb = tkrzw_dbm_open(
        path_db, true, "dbm=HashDBM,truncate=true");
    printf("created empty %s db\n",path_db);

    size_t vall;
    char *version = db_get(db, "duc_db_version", 14, &vall);
    if (version) {
            if(strcmp(version, DUC_DB_VERSION) != 0) {
                    *e = DUC_E_DB_VERSION_MISMATCH;
                    goto err2;
            }
            free(version);
    } else {
            db_put(db, "duc_db_version", 14, DUC_DB_VERSION, strlen(DUC_DB_VERSION));
    }

    return db;

But it keeps crashing. Any suggestions on what I can do here? Do I need to fix my linking and building of the tool? I'm using the following libtkrzw versions:

    $ dpkg-query -l | grep tkrzw
    ii  libtkrzw-dev:amd64  1.0.25-1  amd64  set of implementations of DBM - development headers
    ii  libtkrzw1:amd64     1.0.25-1  amd64  set of implementations of DBM - shared library
    ii  tkrzw-doc           1.0.25-1  all    set of implementation of DBM - docs
    ii  tkrzw-utils         1.0.25-1  amd64  set of implementations of DBM - utilities

When I try to run the code under GDB, I get a back trace which doesn't really help me:

    gdb --args ./duc index -d test.tkh .
    GNU gdb (Debian 13.1-3) 13.1
    Copyright (C) 2023 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    https://www.gnu.org/software/gdb/bugs/.
    Find the GDB manual and other documentation resources online at:
    http://www.gnu.org/software/gdb/documentation/.

    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from ./duc...
    (gdb) run
    Starting program: /home/john/src/duc-2020/duc/duc index -d test.tkh .
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

    Program received signal SIGSEGV, Segmentation fault.
    0x000055555555ba62 in db_open (path_db=path_db@entry=0x55555559c730 "test.tkh", flags=flags@entry=6, e=e@entry=0x555555594478) at src/libduc/db-tkrzw.c:47
    47          db->hdb = tkrzw_dbm_open(
    (gdb) bt
    #0  0x000055555555ba62 in db_open (path_db=path_db@entry=0x55555559c730 "test.tkh", flags=flags@entry=6, e=e@entry=0x555555594478) at src/libduc/db-tkrzw.c:47
    #1  0x000055555555c75b in duc_open (duc=duc@entry=0x555555594470, path_db=<optimized out>, flags=flags@entry=(DUC_OPEN_RW | DUC_OPEN_COMPRESS)) at src/libduc/duc.c:127
    #2  0x0000555555565f1f in index_main (duc=0x555555594470, argc=1, argv=0x7fffffffe4a8) at src/duc/cmd-index.c:94
    #3  0x000055555555ad25 in main (argc=, argv=) at src/duc/main.c:179
    (gdb)

l8gravely commented 5 months ago

So I've just used the example langc_ex1.c from the latest git repo, but all I did to compile was:

gcc langc_ex1.c -ltkrzw

on my Debian box and it works great. So I'm obviously doing something stupid in my setup above, but I don't see where since I don't pass in anything strange, unless I need to give the path name in a null terminated string?

Also, when I do get the segfault, the file is created, but I see the following:

    $ tkrzw_dbm_util inspect --dbm hash test.db
    Inspection:
      class=HashDBM
      healthy=false
      auto_restored=false
      path=test.db
      cyclic_magic=2
      pkg_major_version=1
      pkg_minor_version=0
      static_flags=1
      offset_width=4
      align_pow=3
      closure_flags=0
      num_buckets=1048583
      num_records=0
      eff_data_size=0
      file_size=4198400
      timestamp=1715289475.391163
      db_type=0
      max_file_size=34359738368
      record_base=4198400
      update_mode=in-place
      record_crc_mode=none
      record_comp_mode=none
    Actual File Size: 4198400
    Number of Records: 0
    Healthy: false
    Should be Rebuilt: false

As you can see, I'm passing in the explicit DBM type when creating DBs in my C code, since I have no control over what name the end user will choose for the DB I'm creating. And it does create the file, but then it craps out before it writes to it to make sure it's healthy. Do I need to do something silly like sync it? Or pass in a specific option to sync it on creation? I'll keep poking at this.

estraier commented 5 months ago

In the code, you assign the database object to "db->hdb". However, you use "db" as the database object.

l8gravely commented 5 months ago

> In the code, you assign the database object to "db->hdb". However, you use "db" as the database object.

So I'm trying to add this library into 'duc' to act as a backend storage, since Tokyocabinet and Kyotocabinet are all fairly dead projects. And since 'duc' supports multiple backends for storage, it's abstracted away some of the DB handling.

So the code looks like this, where we pass around the 'db' struct with one or more members for dealing with the various DB backends. So I'm obviously messing up somewhere, but I'm so out of touch with C programming lately... :-)

    struct db {
        TkrzwDBM* hdb;
    };

    struct db *db_open(const char *path_db, int flags, duc_errno *e)
    {
        struct db *db;
        int compress = 0;
        int truncate = 1;
        char options[] = "dbm=HashDBM,file=StdFile,truncate=true";

        TkrzwDBM* dbm = tkrzw_dbm_open(path_db, truncate, options);
        printf("created empty %s db\n",path_db);
        db->hdb = dbm;

        size_t vall;
        char *version = db_get(db, "duc_db_version", 14, &vall);
        if (version) {
            if(strcmp(version, DUC_DB_VERSION) != 0) {
                *e = DUC_E_DB_VERSION_MISMATCH;
                goto err2;
            }
            free(version);
        } else {
            db_put(db, "duc_db_version", 14, DUC_DB_VERSION, strlen(DUC_DB_VERSION));
        }

        return db;

    err2:
        tkrzw_dbm_close(db->hdb);
    err1:
        free(db);
        return NULL;
    }

    void db_close(struct db *db)
    {
        tkrzw_dbm_close(db->hdb);
        free(db);
    }

l8gravely commented 5 months ago

Hmm... so it looks like assigning the return value of tkrzw_dbm_open() to a struct member isn't working, since when I hack the code to look like this, it bombs on the first access, whereas in my attempt above it bombs later. So obviously I've set things up incorrectly here.

    struct db {
        TkrzwDBM* hdb;
    };

    struct db *db_open(const char *path_db, int flags, duc_errno *e)
    {
        struct db *db;
        int compress = 0;
        int truncate = 1;
        char options[] = "dbm=HashDBM,file=StdFile";

        db->hdb = tkrzw_dbm_open(path_db, truncate, options);     // BOMBS HERE
        //TkrzwDBM* dbm = tkrzw_dbm_open(path_db, truncate, options);
        printf("created empty %s db\n",path_db);
        tkrzw_dbm_close(db->hdb);
        db->hdb = tkrzw_dbm_open(path_db, NULL, options);

l8gravely commented 5 months ago

So I went and took the example from the docs about calling Tkrzw from C code and ran it. No problems. Then I added a struct db at the top like the following, and now it segfaults on exit. WTF?

    #include <stdio.h>
    #include "tkrzw_langc.h"

    struct db {
        TkrzwDBM* hdb;
    };

    // Main routine.
    int main(int argc, char** argv) {
        struct db *db;

        // Opens the database file.
        TkrzwDBM* dbm = tkrzw_dbm_open(
            "casket.tkh", true, "truncate=true,num_buckets=100");

        db->hdb = dbm;

        // Stores records.
        tkrzw_dbm_set(dbm, "foo", -1, "hop", -1, true);
        tkrzw_dbm_set(dbm, "bar", -1, "step", -1, true);
        tkrzw_dbm_set(dbm, "baz", -1, "jump", -1, true);

        // Retrieves a record.
        char* value_ptr = tkrzw_dbm_get(dbm, "foo", -1, NULL);
        if (value_ptr) {
            puts(value_ptr);
            free(value_ptr);
        }

        // Traverses records.
        TkrzwDBMIter* iter = tkrzw_dbm_make_iterator(dbm);
        tkrzw_dbm_iter_first(iter);
        while (true) {
            char* key_ptr = NULL;
            if (!tkrzw_dbm_iter_get(iter, &key_ptr, NULL, &value_ptr, NULL)) {
                break;
            }
            printf("%s:%s\n", key_ptr, value_ptr);
            free(key_ptr);
            free(value_ptr);
            tkrzw_dbm_iter_next(iter);
        }
        tkrzw_dbm_iter_free(iter);

        // Closes the database file.
        printf("\nnow closing db->hdb\n");
        //tkrzw_dbm_close(db->hdb);
        tkrzw_dbm_close(dbm);

        return 0;
    }

and I compiled it with just a simple:

gcc -o tkrzw2 tkrzw2.c -ltkrzw

Next is to try this with the latest version of the library since 1.0.25 seems to be busted.

l8gravely commented 5 months ago

Still crashes with a segfault when I call from C code using the latest 1.0.29 release from the repo. I don't know what I'm doing wrong here.

rafal98 commented 5 months ago

Hi John,

struct db *db is a pointer to a struct, not a struct, so assigning a value to db->hdb is illegal: the pointer is uninitialized and does not point at any storage. db must be allocated (on stack or heap). A possible fix is: struct db db; db.hdb = dbm;

You probably should close this issue, since it's a basic question about C, not about tkrzw.
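
For a function like db_open() that returns the struct to its caller, the heap variant is the one that applies, because the address of a stack variable is no longer valid after the function returns. The following is only a minimal sketch of that allocation pattern, not the actual duc code. So that it compiles without libtkrzw, FakeDBM, fake_dbm_open() and fake_dbm_close() are hypothetical stand-ins for TkrzwDBM, tkrzw_dbm_open() and tkrzw_dbm_close(), and db_open() is reduced to a single parameter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for TkrzwDBM, so the sketch builds without libtkrzw. */
typedef struct FakeDBM {
    const char *path;
} FakeDBM;

/* Hypothetical stand-in for tkrzw_dbm_open(): returns NULL on failure. */
static FakeDBM *fake_dbm_open(const char *path)
{
    FakeDBM *dbm = malloc(sizeof *dbm);
    if (dbm != NULL) {
        dbm->path = path;
    }
    return dbm;
}

/* Hypothetical stand-in for tkrzw_dbm_close(). */
static void fake_dbm_close(FakeDBM *dbm)
{
    free(dbm);
}

struct db {
    FakeDBM *hdb;
};

struct db *db_open(const char *path_db)
{
    /* The step missing from the code in this thread: allocate storage
     * for the wrapper struct before writing through the pointer. */
    struct db *db = malloc(sizeof *db);
    if (db == NULL) {
        return NULL;
    }

    db->hdb = fake_dbm_open(path_db);
    if (db->hdb == NULL) {
        free(db);   /* do not leak the wrapper when the open fails */
        return NULL;
    }
    return db;
}

void db_close(struct db *db)
{
    assert(db != NULL);
    fake_dbm_close(db->hdb);
    free(db);
}
```

A caller would pair db_open("test.tkh") with db_close() on the returned pointer. The stack variant (struct db db; db.hdb = dbm;) is fine when the struct never outlives the function that declares it, as in the main() example above.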

l8gravely commented 5 months ago

> struct db *db is a pointer to a struct, not a struct, so assigning a value to db->hdb is illegal. db must be allocated (on stack or heap).

Yeah, my C-fu is weak since I rarely program in it. I'll take this offline and teach myself better coding skills. I hope! :-)

> A possible fix is: struct db db; db.hdb = dbm
>
> You probably should close this issue, since it's a basic question about C, not about tkrzw.

Yeah, I'm mostly trying to adapt Tkrzw into an existing C program and feeling stupid.

Thanks for all your patience with someone who's forgotten most of his C knowledge.

l8gravely commented 5 months ago

I'm a moron... I wasn't doing an allocation of 'db' before trying to add stuff (and, more importantly, access pointers). Duh!