basho / bitcask

because you need another key/value storage engine

Deleted keys can be resurrected after restart #82

Closed joecaswell closed 10 years ago

joecaswell commented 11 years ago

A deleted key can be resurrected when the following conditions are met:

  1. The key is written to bitcask data file 1.
  2. Data file 1 is closed and a new write file is started.
  3. The key is deleted, which writes a tombstone to data file 2 and removes the key from the keydir.
  4. Data file 2 is closed and a new write file is started.
  5. Bitcask merges. Data file 2 meets the merge threshold but data file 1 does not, so the merge removes the tombstone while the original value stays in data file 1.
  6. Bitcask is restarted and reopens the directory. The previously deleted value is found while scanning the data files to build the keydir, and no tombstone is found to cause its removal.
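
The sequence above can be sketched with a toy model. This is not Bitcask code and uses no Bitcask API: data files are modeled as plain lists of `{Key, Value}` entries, the keydir as a key/value list, and the names `rebuild_keydir/1`, `merge/1` and `lookup/2` are hypothetical helpers invented for the illustration. The merge is assumed to drop tombstones from the file it processes without checking older files, which is the behavior step 5 describes.

```erlang
#!/usr/bin/env escript
%% Toy model of the scenario above. NOT Bitcask code: files are lists of
%% {Key, Value} entries and the keydir is a key/value list (assumptions).

main(_) ->
    Key = <<"key_to_delete">>,
    File1 = [{Key, <<"tr0ll">>}],     %% step 1: the value lands in file 1
    File2 = [{Key, tombstone}],       %% step 3: the tombstone lands in file 2
    Before = rebuild_keydir([File1, File2]),
    io:format("restart before merge: ~p~n", [lookup(Key, Before)]),
    Merged2 = merge(File2),           %% step 5: only file 2 is merged
    After = rebuild_keydir([File1, Merged2]),
    io:format("restart after merge:  ~p~n", [lookup(Key, After)]).

%% Replay files oldest-first, as a restart would: a later entry for a key
%% overrides an earlier one, and a tombstone removes the key.
rebuild_keydir(Files) ->
    lists:foldl(fun apply_entry/2, [], lists:append(Files)).

apply_entry({Key, tombstone}, KeyDir) ->
    lists:keydelete(Key, 1, KeyDir);
apply_entry({Key, Value}, KeyDir) ->
    lists:keystore(Key, 1, KeyDir, {Key, Value}).

%% Naive single-file merge: keep live values and drop every tombstone,
%% without checking whether an older file still holds a value for the key.
merge(File) ->
    [Entry || {_, Value} = Entry <- File, Value =/= tombstone].

lookup(Key, KeyDir) ->
    case lists:keyfind(Key, 1, KeyDir) of
        {Key, Value} -> {ok, Value};
        false -> not_found
    end.
```

In this model, a restart before the merge reports `not_found`, because the tombstone in file 2 cancels the value in file 1. After the merge has dropped the tombstone, the same replay finds only the old value in file 1 and returns it.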

see also https://basho.zendesk.com/agent/#/tickets/4030

slfritchie commented 11 years ago

@gburd @jonmeredith @justinsheehy @jtuple Opinions?

Usage:

./this-script 1
./this-script 2

Run this script from the top of a compiled bitcask source tree (see line 3 of the script for the -pz path). If you don't see the "Merged" info report before phase 1 finishes, increase the sleep time beyond the hardcoded 7 seconds.

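#!/usr/bin/env escript
%%
%%! -pz ./ebin

%% Phase 1: write a key, delete it, then merge only the file holding the tombstone.
main(["1"]) ->
    io:format("Deleting...\n"),
    os:cmd("rm -rf data"),
    io:format("Writing...\n"),
    Reference = bitcask:open("data", [read_write | opts()]),
    bitcask:put(Reference, <<"key_to_delete">>, <<"tr0ll">>),
    %% Write enough data to push past max_file_size so that a new data file is started.
    [ bitcask:put(Reference, term_to_binary(X), <<1:(8 * 1024 * 100)>>) || X <- lists:seq(1, 3000)],
    %% The tombstone should now land in a later data file than the original value.
    bitcask:delete(Reference, <<"key_to_delete">>),
    %% Write enough again to close the data file that holds the tombstone.
    [ bitcask:put(Reference, term_to_binary(X), <<1:(8 * 1024 * 100)>>) || X <- lists:seq(1, 3000)],
    timer:sleep(1000 + 1000),
    %% Merge only the data file that contains the tombstone.
    bitcask_merge_worker:merge("data", opts(), ["data/2.bitcask.data"]),
    io:format("Sleeping...\n"),
    timer:sleep(7*1000),
    io:format("\nIf you see 'Merged' INFO REPORT, run this script with '2' arg\n");
%% Phase 2: reopen the directory, as a restart would, and read the deleted key.
main(["2"]) ->
    Reference = bitcask:open("data", [read_write | opts()]),
    io:format("Read zombie: ~p\n", [bitcask:get(Reference, <<"key_to_delete">>)]).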

opts() ->
    [{max_file_size, 268435456},
     {dead_bytes_threshold, 89478485},
     {dead_bytes_merge_trigger, 178956970}].
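
The sizes in the script appear to be chosen so that each batch of 3000 puts overflows `max_file_size` and forces a new data file. The following is a back-of-the-envelope check, not a description of Bitcask's exact rollover rule: it assumes per-entry header and key bytes can be ignored and counts only the value bytes.

```erlang
#!/usr/bin/env escript
%% Rough size check for the reproduction script.
%% Assumption: per-entry header and key bytes are ignored.

main(_) ->
    ValueBytes = byte_size(<<1:(8 * 1024 * 100)>>),  %% size of one value written by the script
    MaxFileSize = 268435456,                          %% max_file_size from opts()
    BatchBytes = 3000 * ValueBytes,
    io:format("bytes per value: ~p~n", [ValueBytes]),
    io:format("bytes per batch of 3000 puts: ~p~n", [BatchBytes]),
    io:format("batch exceeds max_file_size: ~p~n", [BatchBytes > MaxFileSize]),
    io:format("whole values that fit in one file: ~p~n", [MaxFileSize div ValueBytes]).
```

Each value is 102400 bytes, so a batch is 307200000 bytes, which is more than the 268435456-byte limit. Under these assumptions the first data file fills during the first batch, the tombstone is written to the second file, and the second batch fills that file in turn, which is consistent with the script merging `data/2.bitcask.data`.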

gburd commented 11 years ago

gburd@carbon:~/eng/bitcask$ ./this-script 1
Deleting...
Writing...
Sleeping...

=INFO REPORT==== 12-Mar-2013::21:31:33 ===
Merged ["data", [{max_file_size,268435456}, {dead_bytes_threshold,89478485}, {dead_bytes_merge_trigger,178956970}], ["data/2.bitcask.data"]] in 4.807972 seconds.

If you see 'Merged' INFO REPORT, run this script with '2' arg
gburd@carbon:~/eng/bitcask$ ./this-script 2
Read zombie: {ok,<<"tr0ll">>}

slfritchie commented 10 years ago

Possible fixes are brewing in https://github.com/basho/bitcask/compare/slf-merge-panopticon. Sorry about the delay; unfortunately, this is a thorny problem full of race conditions with merge operations. There is no guarantee yet about when this code might be ready.