Open wbrown opened 10 years ago
According to @wbrown this occurred with 2.0pre11.
I can't say if it's related (though my gut says no) but there is also a Yokozuna related error at the end. Something to investigate on my end.
Not sure if it's Yokozuna related -- I disabled Yokozuna on the bucket in question, and I'm getting a similar error:
2014-03-03 18:52:35.405 [error] <0.1439.0> gen_server <0.1439.0> terminated with reason: bad argument in call to erlang:byte_size({r_object,<<"obs">>,<<"++7rH3IfSu+XKBTEb5F/co8CJDW304bTzP/NSbUCJl0=:0">>,[{r_content,{dict,5,16,...},...}],...}) in riak_kv_index_hashtree:maybe_throttle_build/4 line 460
2014-03-03 18:52:35.405 [error] <0.1439.0> CRASH REPORT Process <0.1439.0> with 0 neighbours exited with reason: bad argument in call to erlang:byte_size({r_object,<<"obs">>,<<"++7rH3IfSu+XKBTEb5F/co8CJDW304bTzP/NSbUCJl0=:0">>,[{r_content,{dict,5,16,...},...}],...}) in riak_kv_index_hashtree:maybe_throttle_build/4 line 460 in gen_server:terminate/6 line 744
2014-03-03 18:52:35.406 [error] <0.1434.0> Supervisor {<0.1434.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1439.0> exit with reason bad argument in call to erlang:byte_size({r_object,<<"obs">>,<<"++7rH3IfSu+XKBTEb5F/co8CJDW304bTzP/NSbUCJl0=:0">>,[{r_content,{dict,5,16,...},...}],...}) in riak_kv_index_hashtree:maybe_throttle_build/4 line 460 in context child_terminated
@wbrown Do you have entries in the crash.log that correspond to those 'bad argument'/'badarg' errors? That file should have a full stack trace (spread across multiple lines), which would make pinpointing the bad code path a lot easier.
Is this what you need?
2014-03-03 18:52:35 =CRASH REPORT====
crasher:
initial call: riak_core_vnode_worker:init/1
pid: <0.1439.0>
registered_name: []
exception exit: {{badarg,[{erlang,byte_size,[{r_object,<<"obs">>,<<"++7rH3IfSu+XKBTEb5F/co8CJDW304bTzP/NSbUCJl0=:0">>,[{r_content,{dict,5,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[[<<"dot">>|{<<110,34,76,102,83,20,248,80>>,{1,63561108748}}]],[],[],[],[],[],[[<<"content-type">>,97,112,112,108,105,99,97,116,105,111,110,47,106,115,111,110],[<<"X-Riak-VTag">>,54,101,119,76,113,98,107,122,114,57,79,84,120,77,113,84,112,121,102,116,65,78]],[[<<"index">>,{<<"disc_i_int">>,0},{<<"e_md5_s_bin">>,<<"4d8f0d4ea04b84488883d338ea13fef3">>},{<<"e_name_s_bin">>,<<"997D3CF373AFA82224049C4C356851253B9A70B6">>},{<<"e_sha256_s_bin">>,<<"7065f1f53144cdde8caaf578aafd7110ef8c9142fa1dc0e6869269bdbd03ce31">>},{<<"etp_sid_i_bin">>,<<"2017911">>},{<<"etp_sid_i_bin">>,<<"2017912">>},{<<"etp_sid_i_bin">>,<<"2805709">>},{<<"n_dom_s_bin">>,<<"dc-b2b0e285.getapplicationmy.info">>},{<<"n_dom_type_s_bin">>,<<"A">>},{<<"n_ipv4_l_int">>,919197470},{<<"owner_s_bin">>,<<"A7Ifb3azxNWfjVRtShg99kWFPLxXFQKvHFgLIkgJRnA=">>},{<<"r_magic_s_bin">>,<<"PE32 executable for MS Windows (GUI) Intel 80386 32-bit">>},{<<"r_ts_i_int">>,1393881722},{<<"r_ver_i_int">>,0},{<<"src_s_bin">>,<<"Bvn4ba+aeAW7V1wukHqx7cMah9sFa+rqjah/BrrhQtM=">>},{<<"ts_i_int">>,1393889687},{<<"type_s_bin">>,<<"n_dom_lookup">>}]],[],[[<<"X-Riak-Last-Modified">>|{1393,889548,782994}]],[],[]}}},<<"{\"ts_i\": 1393889687, \"type_s\": \"n_dom_lookup\", \"src_s\": \"Bvn4ba+aeAW7V1wukHqx7cMah9sFa+rqjah/BrrhQtM=\", \"owner_s\": \"A7Ifb3azxNWfjVRtShg99kWFPLxXFQKvHFgLIkgJRnA=\", \"disc_i\": 0, \"r_ts_i\": 1393881722, \"e_md5_s\": \"4d8f0d4ea04b84488883d338ea13fef3\", \"etp_sid_I\": [\"2805709\", \"2017912\", \"2017911\"], \"e_sha256_s\": \"7065f1f53144cdde8caaf578aafd7110ef8c9142fa1dc0e6869269bdbd03ce31\", \"r_magic_s\": [\"PE32 executable for MS Windows (GUI) Intel 80386 32-bit\"], \"e_name_s\": [\"997D3CF373AFA82224049C4C356851253B9A70B6\"], \"n_dom_s\": \"dc-b2b0e285.getapplicationmy.info\", \"n_dom_type_s\": \"A\", \"n_IPv4_L\": 919197470, \"r_ver_i\": 0}">>}],[{<<110,34,76,102,83,20,248,80>>,{1,63561108748}}],{dict,1,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[[clean|true]],[]}}},undefined}],[]},{riak_kv_index_hashtree,maybe_throttle_build,4,[{file,"src/riak_kv_index_hashtree.erl"},{line,460}]},{eleveldb,fold_loop,4,[{file,"src/eleveldb.erl"},{line,347}]},{eleveldb,do_fold,4,[{file,"src/eleveldb.erl"},{line,334}]},{riak_kv_eleveldb_backend,'-fold_objects/4-fun-0-',4,[{file,"src/riak_kv_eleveldb_backend.erl"},{line,477}]},{riak_kv_worker,handle_work,3,[{file,"src/riak_kv_worker.erl"},{line,46}]},{riak_core_vnode_worker,handle_cast,2,[{file,"src/riak_core_vnode_worker.erl"},{line,70}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,604}]}]},[{gen_server,terminate,6,[{file,"gen_server.erl"},{line,744}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
ancestors: [<0.1434.0>,<0.1433.0>,<0.1432.0>,<0.1237.0>,riak_core_vnode_sup,riak_core_sup,<0.153.0>]
messages: []
links: [<0.1434.0>,<0.1433.0>]
dictionary: []
trap_exit: false
status: running
heap_size: 2586
stack_size: 27
reductions: 773
neighbours:
2014-03-03 18:52:35 =SUPERVISOR REPORT====
Supervisor: {<0.1434.0>,poolboy_sup}
Context: child_terminated
Reason: {badarg,[{erlang,byte_size,[{r_object,<<"obs">>,<<"++7rH3IfSu+XKBTEb5F/co8CJDW304bTzP/NS
bUCJl0=:0">>,[{r_content,{dict,5,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[[<<"dot">>|{<<110,34,76,102,83,20,248,80>>,{1,63561108748}}]],[],[],[],[],[],[[<<"content-type">>,97,112,112,108,105,99,97,116,105,111,110,47,106,115,111,110],[<<"X-Riak-VTag">>,54,101,119,76,113,98,107,122,114,57,79,84,120,77,113,84,112,121,102,116,65,78]],[[<<"index">>,{<<"disc_i_int">>,0},{<<"e_md5_s_bin">>,<<"4d8f0d4ea04b84488883d338ea13fef3">>},{<<"e_name_s_bin">>,<<"997D3CF373AFA82224049C4C356851253B9A70B6">>},{<<"e_sha256_s_bin">>,<<"7065f1f53144cdde8caaf578aafd7110ef8c9142fa1dc0e6869269bdbd03ce31">>},{<<"etp_sid_i_bin">>,<<"2017911">>},{<<"etp_sid_i_bin">>,<<"2017912">>},{<<"etp_sid_i_bin">>,<<"2805709">>},{<<"n_dom_s_bin">>,<<"dc-b2b0e285.getapplicationmy.info">>},{<<"n_dom_type_s_bin">>,<<"A">>},{<<"n_ipv4_l_int">>,919197470},{<<"owner_s_bin">>,<<"A7Ifb3azxNWfjVRtShg99kWFPLxXFQKvHFgLIkgJRnA=">>},{<<"r_magic_s_bin">>,<<"PE32 executable for MS Windows (GUI) Intel 80386 32-bit">>},{<<"r_ts_i_int">>,1393881722},{<<"r_ver_i_int">>,0},{<<"src_s_bin">>,<<"Bvn4ba+aeAW7V1wukHqx7cMah9sFa+rqjah/BrrhQtM=">>},{<<"ts_i_int">>,1393889687},{<<"type_s_bin">>,<<"n_dom_lookup">>}]],[],[[<<"X-Riak-Last-Modified">>|{1393,889548,782994}]],[],[]}}},<<"{\"ts_i\": 1393889687, \"type_s\": \"n_dom_lookup\", \"src_s\": \"Bvn4ba+aeAW7V1wukHqx7cMah9sFa+rqjah/BrrhQtM=\", \"owner_s\": \"A7Ifb3azxNWfjVRtShg99kWFPLxXFQKvHFgLIkgJRnA=\", \"disc_i\": 0, \"r_ts_i\": 1393881722, \"e_md5_s\": \"4d8f0d4ea04b84488883d338ea13fef3\", \"etp_sid_I\": [\"2805709\", \"2017912\", \"2017911\"], \"e_sha256_s\": \"7065f1f53144cdde8caaf578aafd7110ef8c9142fa1dc0e6869269bdbd03ce31\", \"r_magic_s\": [\"PE32 executable for MS Windows (GUI) Intel 80386 32-bit\"], \"e_name_s\": [\"997D3CF373AFA82224049C4C356851253B9A70B6\"], \"n_dom_s\": \"dc-b2b0e285.getapplicationmy.info\", \"n_dom_type_s\": \"A\", \"n_IPv4_L\": 919197470, \"r_ver_i\": 0}">>}],[{<<110,34,76,102,83,20,248,80>>,{1,63561108748}}],{dict,1,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[[clean|true]],[]}}},undefined}],[]},{riak_kv_index_hashtree,maybe_throttle_build,4,[{file,"src/riak_kv_index_hashtree.erl"},{line,460}]},{eleveldb,fold_loop,4,[{file,"src/eleveldb.erl"},{line,347}]},{eleveldb,do_fold,4,[{file,"src/eleveldb.erl"},{line,334}]},{riak_kv_eleveldb_backend,'-fold_objects/4-fun-0-',4,[{file,"src/riak_kv_eleveldb_backend.erl"},{line,477}]},{riak_kv_worker,handle_work,3,[{file,"src/riak_kv_worker.erl"},{line,46}]},{riak_core_vnode_worker,handle_cast,2,[{file,"src/riak_core_vnode_worker.erl"},{line,70}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,604}]}]}
Offender: [{pid,<0.1439.0>},{name,riak_core_vnode_worker},{mfargs,{riak_core_vnode_worker,start_link,undefined}},{restart_type,temporary},{shutdown,5000},{child_type,worker}]
@slfritchie looks like maybe_throttle_build is being passed a non-binary object to me? Perhaps here https://github.com/basho/riak_kv/blob/develop/src/riak_kv_index_hashtree.erl#L482
Maybe use erlang:external_size in `maybe_throttle_build``` if you can tolerate minor extra overhead for real binaries, or be more picky like https://github.com/basho/riak_kv/blob/develop/src/riak_kv_index_hashtree.erl#L422
@jonmeredith Yes, but then there's the question of why is a non-binary object occasionally given to a function that is supposed(*) to be handling binaries only?
(*) Or so says my bad memory. My head is still in Bitcask Land(™), so I haven't had time to review this bug further.
Well I'll be... I've been reviewing the code and I can not see how a non-binary could be passed to this function. @jonmeredith hash_object is that way because different paths in riak_kv_vnode that call update_hashtree may have the binary or the object available (handoff vs. regular puts, etc). But this function should be used for a backend fold.
@russelldb I noticed you took this item. During the triage with Kelly and Michelle, I believe we agreed to move this out of the milestone as not reproducible. If it pops up again in current code, we can open a new issue.
So can we change the milestone, please?
Moved from RC -> 2.0.1. @wbrown we are definitely interested in a reproducible case. Ping if you think you have one.
Is there anything new to report for this bug report?
still experiencing this issue.
is there any quick fix?
Storing unencoded binary keys and values in Riak prerelease causes issues under load with
riak_kv_index_hashtree:maybe_throttle_build
: