I see that there is a note on tkvdb not being multithreaded, but can it be multi-process when writing to disk? I.e., if several processes open the same "data.tkvdb" file, is it guaranteed that they will wait on each other to put, commit & close?

To be more specific: I've created a command-line tool that opens a tkvdb database, writes to it, and exits. When I run 4 of these processes in parallel, it seems to work; however, that surprises me because of the multithreading caveat, and I'm wondering if it isn't really working as I intend.
No, there are no guarantees about that. File locking is highly OS-specific, and it's hard to implement locks correctly even on Linux; see http://0pointer.de/blog/projects/locking.html for example.
It looks like the correct writes happened accidentally. We write each transaction to disk as one chunk, using a single write() syscall, so the OS probably scheduled the writes in the correct order and performed them atomically.
If you use your OS's synchronization primitives, everything should be fine. tkvdb does not use global variables, so a mutex in shared memory should be enough for multi-process locking. For POSIX it may look like this:
```c
/* set up a mutex in shared memory with the
   pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) attribute */

pthread_mutex_lock(&mtx);            /* one writer at a time */
transaction->begin(transaction);
/* ... get/put/cursor operations ... */
transaction->commit(transaction);
pthread_mutex_unlock(&mtx);
```
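Fleshed out, that might look like the following untested sketch. It assumes the tkvdb API as shown in the README (`tkvdb_open`, `tkvdb_tr_create`, `tkvdb_datum`); the shared-memory name `/tkvdb_lock` and the key/value contents are made up for illustration. Error handling is omitted, and the first-run initialization is simplified: a real program would have to close the race between one process creating the segment and another mapping the still-uninitialized mutex (e.g. with a named semaphore).

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

#include "tkvdb.h"

/* map (and, in the creating process, initialize) a process-shared mutex */
static pthread_mutex_t *shared_mutex(void)
{
	pthread_mutex_t *mtx;
	int fd, creator;

	/* O_EXCL succeeds only for the first process; it does the init */
	fd = shm_open("/tkvdb_lock", O_CREAT | O_EXCL | O_RDWR, 0600);
	creator = (fd >= 0);
	if (!creator) {
		fd = shm_open("/tkvdb_lock", O_RDWR, 0600);
	}
	ftruncate(fd, sizeof(pthread_mutex_t));
	mtx = mmap(NULL, sizeof(pthread_mutex_t),
		PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);

	if (creator) {
		pthread_mutexattr_t attr;
		pthread_mutexattr_init(&attr);
		pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
		pthread_mutex_init(mtx, &attr);
	}
	return mtx;
}

int main(void)
{
	pthread_mutex_t *mtx = shared_mutex();
	tkvdb *db = tkvdb_open("data.tkvdb", NULL);
	tkvdb_tr *tr = tkvdb_tr_create(db, NULL);
	tkvdb_datum key, val;

	key.data = "hello"; key.size = sizeof("hello") - 1;
	val.data = "world"; val.size = sizeof("world") - 1;

	pthread_mutex_lock(mtx);   /* serialize the whole transaction */
	tr->begin(tr);
	tr->put(tr, &key, &val);
	tr->commit(tr);            /* written to disk as one chunk */
	pthread_mutex_unlock(mtx);

	tr->free(tr);
	tkvdb_close(db);
	return 0;
}
```

Compile with `-pthread` (and `-lrt` on older Linux systems for `shm_open`).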
There is no need for locks around `open()` and `close()`.
If you want to increase DB performance with multiple processes, locks will not help. There is one resource (the disk file) shared between multiple processes, so access performance will be even slightly lower than with a single process.
It's possible to speed up the DB part of the program if each process (or thread) fills its own small database and a master process then merges them into one bigger database.
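As an illustration of that fan-in pattern, here is another untested sketch, again assuming the cursor API from the README (`tkvdb_cursor_create`, `first`/`next`, `key`/`keysize`, `val`/`valsize`). The file names and the worker count are made up:

```c
#include <stdio.h>

#include "tkvdb.h"

/* copy every key/value pair from one small worker DB into the big one */
static void merge_one(const char *path, tkvdb_tr *big_tr)
{
	tkvdb *small = tkvdb_open(path, NULL);
	tkvdb_tr *tr = tkvdb_tr_create(small, NULL);
	tkvdb_cursor *c;
	TKVDB_RES r;

	tr->begin(tr);
	c = tkvdb_cursor_create(tr);

	/* walk the small database in key order and re-insert */
	for (r = c->first(c); r == TKVDB_OK; r = c->next(c)) {
		tkvdb_datum key, val;
		key.data = c->key(c);  key.size = c->keysize(c);
		val.data = c->val(c);  val.size = c->valsize(c);
		big_tr->put(big_tr, &key, &val);
	}

	c->free(c);
	tr->rollback(tr);   /* read-only pass, nothing to commit */
	tr->free(tr);
	tkvdb_close(small);
}

int main(void)
{
	tkvdb *big = tkvdb_open("merged.tkvdb", NULL);
	tkvdb_tr *big_tr = tkvdb_tr_create(big, NULL);
	char path[32];
	int i;

	big_tr->begin(big_tr);
	for (i = 0; i < 4; i++) {
		snprintf(path, sizeof(path), "worker%d.tkvdb", i);
		merge_one(path, big_tr);
	}
	big_tr->commit(big_tr);   /* one big write at the end */

	big_tr->free(big_tr);
	tkvdb_close(big);
	return 0;
}
```

The workers never contend for a lock this way, and the master's single commit still goes to disk as one chunk.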
We have long-term plans for multithreading and multiprocessing, but currently they are just plans.