basho / riak_kv

Riak Key/Value Store
Apache License 2.0
650 stars 233 forks source link

riak_kv_crdt:from_map is broken for map #967

Closed rzezeski closed 10 years ago

rzezeski commented 10 years ago

Recently this failure occurred in Yokozuna eunit with no changes made to Yokozuna code:

module 'yz_dt_extractor'
  module 'yz_dt_extractor_tests'
    yz_dt_extractor_tests: counter_test...[0.033 s] ok
    yz_dt_extractor_tests: set_test...[0.021 s] ok
    yz_dt_extractor_tests: map_test...*failed*
in function yz_dt_extractor:extract_fields/4 (src/yz_dt_extractor.erl, line 95)
  called as extract_fields({undefined,<<"name">>},undefined,<<"Ryan Zezeski">>,{state,[{<<"page_views_counter">>,<<"1502">>},
        {<<"phones_set">>,<<"867-5309">>},
        {<<"phones_set">>,<<"555-5555">>}],
       <<".">>})
in call from lists:foldl/3 (lists.erl, line 1248)
in call from yz_dt_extractor:extract_fields/2 (src/yz_dt_extractor.erl, line 91)
in call from yz_dt_extractor_tests:map_test/0 (test/yz_dt_extractor_tests.erl, line 28)
**error:function_clause

    yz_dt_extractor_tests: field_separator_test...*failed*
in function yz_dt_extractor:extract_fields/4 (src/yz_dt_extractor.erl, line 95)
  called as extract_fields({undefined,<<"name">>},undefined,<<"Ryan Zezeski">>,{state,[{<<"page_views_counter">>,<<"1502">>},
        {<<"phones_set">>,<<"867-5309">>},
        {<<"phones_set">>,<<"555-5555">>}],
       <<"--">>})
in call from lists:foldl/3 (lists.erl, line 1248)
in call from yz_dt_extractor:extract_fields/2 (src/yz_dt_extractor.erl, line 91)
in call from yz_dt_extractor_tests:field_separator_test/0 (test/yz_dt_extractor_tests.erl, line 41)
**error:function_clause

I reproduced on my local workstation and added a print out of the #crdt record that was causing the failure:

{crdt,riak_dt_map,"application/riak_map",
            {[{<<0>>,1}],
             [{{<<"activated">>,riak_dt_od_flag},
               {[{{<<0>>,1},{[{<<0>>,1}],[{<<0>>,1}],[]}}],{[],[],[]}}},
              {{<<"events">>,riak_dt_map},
               {[{{<<0>>,1},
                  {[{<<0>>,1}],
                   [{{<<"RICON">>,riak_dt_lwwreg},
                     {[{{<<0>>,1},{<<"spoke">>,1401737230382792}}],{<<>>,0}}},
                    {{<<"Surge">>,riak_dt_lwwreg},
                     {[{{<<0>>,1},{<<"attended">>,1401737230382793}}],
                      {<<>>,0}}}],
                   []}}],
                {[],[],[]}}},
              {{<<"name">>,riak_dt_lwwreg},
               {[{{<<0>>,1},{<<"Ryan Zezeski">>,1401737230382783}}],{<<>>,0}}},
              {{<<"page_views">>,riak_dt_pncounter},
               {[{{<<0>>,1},[{<<0>>,1502,0}]}],[]}},
              {{<<"phones">>,riak_dt_orswot},
               {[{{<<0>>,1},
                  {[{<<0>>,1}],
                   [{<<"555-5555">>,[{<<0>>,1}]},{<<"867-5309">>,[{<<0>>,1}]}],
                   []}}],
                {[],[],[]}}}],

Notice the value for mod is riak_dt_map. If you look at the implementation for riak_kv_crdt:from_mod/1 you'll notice it uses ?MOD_MAP which expects map as the key. I think a bug was introduced in basho/riak_kv#e28be05 which seems to have mad some changes around module mapping. The fix seems simple; change the function/mapping to use the module name, but I'm not privy to this code and thought it best to leave it to the CRDT sages @seancribbs and @russelldb.

coderoshi commented 10 years ago

This looks to be popping up in other tests, including the Ruby client:

3) CRDTs maps should allow straightforward map ops
     Failure/Error: expect(subject.registers['first']).to eq('hello')

       expected: "hello"
            got: nil

       (compared using ==)
     # ./spec/integration/riak/crdt_spec.rb:90:in `block (3 levels) in <top (required)>'

  4) CRDTs maps should allow batched map ops
     Failure/Error: expect(subject.registers['condiment']).to eq 'ketchup'

       expected: "ketchup"
            got: nil

       (compared using ==)
     # ./spec/integration/riak/crdt_spec.rb:126:in `block (3 levels) in <top (required)>'

  5) CRDTs maps containing a flag should bubble straightforward flag ops up
     Failure/Error: expect(subject.flags['enable_magic']).to be
     ArgumentError:
       wrong number of arguments (1 for 2)
     # ./lib/riak/crdt/inner_flag.rb:9:in `new'
     # ./lib/riak/crdt/typed_collection.rb:54:in `[]'
     # ./spec/integration/riak/crdt_spec.rb:163:in `block (4 levels) in <top (required)>'

  6) CRDTs maps containing a register should bubble straightforward register ops up
     Failure/Error: expect(subject.registers['hkey_local_machine']).to eq 'registry'

       expected: "registry"
            got: nil

       (compared using ==)
     # ./spec/integration/riak/crdt_spec.rb:155:in `block (4 levels) in <top (required)>'

  7) CRDTs maps containing a map should include inner-map ops in the outer-map batch
     Failure/Error: expect(subject.maps['road'].counters['speedbumps'].value).to eq 4

       expected: 4
            got: 0

       (compared using ==)
     # ./spec/integration/riak/crdt_spec.rb:146:in `block (4 levels) in <top (required)>'

http://doorman.jtuple.com:8081/builders/ruby-client-test/builds/2/steps/make%20test/logs/stdio

russelldb commented 10 years ago

I'll take this. My guess is the @rzezeski issue is because the new riak_kv_crdt:mod_map/1 should be used, rather than ?MOD_MAP. I know that map is the key, but we use lists:keyfind(Mod, 2, ?MOD_MAP), as a reverse look up from mod to type. I guess it is the new embedded counter that is the issue. that riak_dt_pncounter gives it away, we don't embed those anymore.

I'm pretty sure @coderoshi's issue is just that add has been removed from map and the clients are yet to catch up. The client should never EVER know about things like modules that implement types, so it had better not be related to the ?MOD_MAP stuff (cc @bkerley)

Both those are guesses, I haven't looked at the code. Will do it in the UK morning.

seancribbs commented 10 years ago

@coderoshi I think that's unrelated, similar tests in other clients don't barf on beta5.

bkerley commented 10 years ago

I'm pretty sure @coderoshi's issue is just that add has been removed from map and the clients are yet to catch up.

The ruby client has never used add; that's not the problem here:

    # the rspec context
    it 'should allow straightforward map ops' do
      subject.registers['first'] = 'hello'
      expect(subject.registers['first']).to eq('hello')

writing "hello" to a register inside a map"

russelldb commented 10 years ago

But does it use PB? Have you updated the .proto file? It's changed since add is dropped. @bkerley ping me when you get on and teach me how to ruby so I can run these tests, please?

russelldb commented 10 years ago

@coderoshi please open a new issue for the ruby test.

bkerley commented 10 years ago

Ah, I didn't know about the other changes besides not using add. The PB stuff in Ruby has been updated: https://github.com/basho/riak-ruby-client/commit/3af5d7f9cb3e5f55649470c2062748db7fe76f58

russelldb commented 10 years ago

https://github.com/basho/riak-ruby-client/commit/3af5d7f9cb3e5f55649470c2062748db7fe76f58 addresses the ruby client issues so no need for a new issue @coderoshi. Thanks @bkerley. Closing in 3…2…1…