Closed: andytill closed this 2 years ago
Is this something for which it is even possible to create a Riak test?
Oh yeah, and if you do merge it, please close the related issues.
Is there a Riak top level project branch somewhere which is already configured to use your branch @andytill? The code looks +1 but, just to be sure, I would like to run it against some data if possible.
Sorry no, it's all linked in our internal riak_ee.
Might it be worth considering fcntl instead of flock for portability's sake? https://www.perkin.org.uk/posts/solaris-portability-flock.html is a good summary of the topic.
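For illustration, a minimal sketch of what an fcntl-based exclusive lock could look like (an assumption about how such a shim might be written, not code from this PR or from eleveldb; the helper name is made up):

#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: take an exclusive, non-blocking lock on fd using
 * POSIX fcntl record locks instead of BSD flock. Returns 0 on success,
 * -1 if another process already holds the lock (errno EACCES/EAGAIN). */
static int try_write_lock(int fd)
{
    struct flock fl = {
        .l_type   = F_WRLCK,   /* exclusive (write) lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,         /* from the start of the file... */
        .l_len    = 0          /* ...to the end: lock the whole file */
    };
    return fcntl(fd, F_SETLK, &fl);  /* F_SETLK fails instead of blocking */
}

The appeal is portability: F_SETLK is plain POSIX, whereas flock behaviour differs on Solaris-family systems, as the linked article discusses.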
Thanks @Licenser. You linked to https://github.com/basho/leveldb/blob/44cb7cbc85590280c9a73856470d5880f4015927/util/env_posix.cc#L493 in Slack; I'm just commenting here to bring that link into this discussion.
Very difficult, I think, to reproduce the OS pid on the restarted Riak to provoke the original bug. With the change it is possible to trigger the locked error by starting Riak, performing one write, then starting a second Riak and performing a write with it, which will return an error.
Is this something for which it is even possible to create a Riak test?
Oh yes, sorry @russelldb, I should have added that too.
I’m happy without a riak_test, if there is no test to show the fault pre-fix.
I'd definitely echo the suggestion to look at fcntl; the link Heinz posted even has some compatibility shims in addition to the eleveldb one.
On the CI server (my laptop) the quickcheck tests fail - output attached.
It would be brilliant if we could get the tests to pass - then +1 from me.
With the current implementation I caused data loss (last year) when opening a bitcask data directory twice, even though the second open (via a second beam) was flagged as read_only, so any attention to this area feels useful.
Imho fcntl is only a consideration for Solaris users - @heinz is not using Bitcask on Solaris, which means there are zero stakeholders I know of for whom this would cause direct inconvenience.
wrt above comment - emfile - I've increased file descriptors and rerun the quickcheck tests.
@bryanhuntesl that failure is emfile, are you sure you have your ulimit high enough?
Increased the file descriptor limit to 10000; tests still failing - output attached.
@bryanhuntesl can you add some info (erl + eqc version, platform)? I ran this 10 times today without failure on my MacBook.
Certainly - Macbook - i7/16GB
Erlang R16B02-basho1 (erts-5.10.3) [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
▷ echo $ERL_LIBS (re-formatted)
:/Users/bryanhunt/quickcheck-1.36.1:
/Users/bryanhunt/quickcheck-1.36.1/eqc-1.36.1:
/Users/bryanhunt/quickcheck-1.36.1/pulse-1.36.1:
/Users/bryanhunt/quickcheck-1.36.1/pulse_otp-1.36.1
Not gonna moan about basho1 (should be basho10). Interesting. All the same as me. Ah well. Good luck someone with that one!
Just to confirm - I am running 16b02-basho10 - seems like it's been truncated in the erl console output.
I am using bitcask on SmartOS. But that aside, I see little reason not to use a POSIX-compliant function when there is one.
Sorry, my mistake, I thought you were doing everything with raw files/Postgres.
So many systems ;) In dalmatinerDB, yes, it's files/postgres (and leveldb, sadly that's tied to _core); snarl, my AAA system, uses bitcask for the timeouts. I'm happy to make a pull request on the pull request to move it to fcntl, it seems to be just one place that needs changes.
Thanks for clarification @Licenser
@Licenser thanks for offering to make this Solaris compatible, I'd welcome this change to the PR.
The lock test does seem to fail on OSX (unmodified code):
bitcask_lockops_tests: lock_cannot_be_obtained_on_already_locked_file_within_same_os_process_test...*failed*
in function bitcask_lockops_tests:'-lock_cannot_be_obtained_on_already_locked_file_within_same_os_process_test/0-fun-0-'/1 (test/bitcask_lockops_tests.erl, line 34)
**error:{assertEqual,[{module,bitcask_lockops_tests},
{line,34},
{expression,"bitcask_lockops : acquire ( write , Dir )"},
{expected,{error,locked}},
{value,{ok,<<>>}}]}
output:<<"">>
Odd, this did work on the second run; not sure why.
The eqc tests segfault :(
bitcask_nifs: keydir_get_put_test_...........................................................................................make: *** [eunit_nif] Segmentation fault: 11
the call stack would be:
Process 22623 stopped
* thread #16, name = '2_scheduler', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x00000000147dbea8 beam.smp`enif_send(env=0x00000000b06ba868, to_pid=<unavailable>, msg_env=0x0000000015f40818, msg=566859) at erl_nif.c:324 [opt]
321
322 if (env != NULL) {
323 c_p = env->proc;
-> 324 if (receiver == c_p->common.id) {
325 rp_locks = ERTS_PROC_LOCK_MAIN;
326 flush_me = 1;
327 }
Target 0: (beam.smp) stopped.
(lldb) bt
* thread #16, name = '2_scheduler', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x00000000147dbea8 beam.smp`enif_send(env=0x00000000b06ba868, to_pid=<unavailable>, msg_env=0x0000000015f40818, msg=566859) at erl_nif.c:324 [opt]
frame #1: 0x0000000014aea28a bitcask.so`msg_pending_awaken(env=0x00000000b06ba868, keydir=0x00007fa0dbf0ba10, msg=566859) at bitcask_nifs.c:2752
frame #2: 0x0000000014ae5c4f bitcask.so`merge_pending_entries(env=0x00000000b06ba868, keydir=0x00007fa0dbf0ba10) at bitcask_nifs.c:2826
frame #3: 0x0000000014ae5abd bitcask.so`itr_release_internal(env=0x00000000b06ba868, handle=0x0000000015ac1850) at bitcask_nifs.c:1995
frame #4: 0x0000000014ae6449 bitcask.so`bitcask_nifs_keydir_resource_cleanup(env=0x00000000b06ba868, arg=0x0000000015ac1850) at bitcask_nifs.c:2920
frame #5: 0x00000000147de2d2 beam.smp`nif_resource_dtor(bin=<unavailable>) at erl_nif.c:1420 [opt]
frame #6: 0x00000000147b2173 beam.smp`sweep_off_heap [inlined] erts_bin_free(bp=0x0000000015ac1828) at erl_binary.h:331 [opt]
frame #7: 0x00000000147b2165 beam.smp`sweep_off_heap(p=0x00000000161824b0, fullsweep=0) at erl_gc.c:2371 [opt]
frame #8: 0x00000000147b04ec beam.smp`erts_garbage_collect [inlined] do_minor(p=<unavailable>, new_sz=<unavailable>, objv=<unavailable>, nobj=<unavailable>) at erl_gc.c:1180 [opt]
frame #9: 0x00000000147af16f beam.smp`erts_garbage_collect at erl_gc.c:880 [opt]
frame #10: 0x00000000147aeef2 beam.smp`erts_garbage_collect(p=<unavailable>, need=<unavailable>, objv=0x00000000b06bac78, nobj=1) at erl_gc.c:452 [opt]
frame #11: 0x00000000147ada07 beam.smp`erts_gc_after_bif_call(p=0x00000000161824b0, result=<unavailable>, regs=<unavailable>, arity=<unavailable>) at erl_gc.c:373 [opt]
frame #12: 0x00000000146b3418 beam.smp`process_main at beam_emu.c:2881 [opt]
frame #13: 0x00000000147450a1 beam.smp`sched_thread_func(vesdp=0x00000000151782c0) at erl_process.c:8118 [opt]
frame #14: 0x000000001486ea92 beam.smp`thr_wrapper(vtwd=0x00007ffeeb55ae50) at ethread.c:114 [opt]
frame #15: 0x00007fff717ab6c1 libsystem_pthread.dylib`_pthread_body + 340
frame #16: 0x00007fff717ab56d libsystem_pthread.dylib`_pthread_start + 377
frame #17: 0x00007fff717aac5d libsystem_pthread.dylib`thread_start + 13
Still waiting to get my eqc license approved by work, then I'll try to reproduce.
With fcntl(…, F_SETLK, …) (the only thing POSIX has), any close would lead to the process losing all its locks on that file.
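A small illustration of that pitfall (not code from this PR; the file name is assumed and error handling is omitted). A record lock taken with F_SETLK through one descriptor is silently dropped as soon as the process closes any descriptor for the same file:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* take an exclusive record lock on the lock file via one descriptor */
    int lock_fd = open("bitcask.write.lock", O_RDWR | O_CREAT, 0644);
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    fcntl(lock_fd, F_SETLK, &fl);

    /* some other code in the same process opens the same file... */
    int other_fd = open("bitcask.write.lock", O_RDONLY);

    /* ...and closing that unrelated descriptor releases the lock held
     * through lock_fd, per POSIX record-lock semantics */
    close(other_fd);

    close(lock_fd);
    return 0;
}

flock, by contrast, ties the lock to the open file description, so an unrelated close elsewhere in the emulator does not release it.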
Prior to this change, the OS pid was used to lock the bitcask database and prevent two bitcask instances from performing write operations at the same time. If the pid in the lock file is currently taken by an OS process then the bitcask DB may not be written to. If Bitcask was not able to clean up the write lock because of a hard shutdown, it may not be able to delete the "stale" lock file and obtain a new one, because the pid recorded in it may since have been taken by an unrelated process.
This has been raised in issues #72, #226 and basho/riak#535.
This change uses the flock system call to obtain exclusive write access to the file across OS processes, as long as the other processes use flock to obtain the lock.
The change maintains the previous behaviour where the DB can be opened even if a write lock cannot be obtained, but returns an error on write operations.
The O_EXCL flag has been removed from the call to open in the NIFs because, if the lock file still exists, it returns the eexist error; now the proof of the write lock is not the file existing but the call to flock.
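To make the description above concrete, a rough sketch of the locking pattern it describes (a simplified illustration, not the actual NIF code; the function name, file name and error handling are assumptions):

#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

/* Sketch: open (or create) the lock file without O_EXCL, then let a
 * non-blocking exclusive flock be the proof of the write lock.
 * Returns the descriptor on success, -1 if another process holds the lock. */
int acquire_write_lock(const char *lockfile)
{
    int fd = open(lockfile, O_RDWR | O_CREAT, 0600);  /* no O_EXCL: an existing file is fine */
    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        /* another OS process holds the lock; the caller can surface this
         * as an error on write operations while still allowing the DB to open */
        close(fd);
        return -1;
    }
    return fd;  /* the lock is held for as long as fd stays open */
}

Because the flock is released automatically when the owning OS process exits, a hard shutdown no longer leaves behind a lock that a later Riak instance cannot claim.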