basho / riak

Riak is a decentralized datastore from Basho Technologies.
http://docs.basho.com
Apache License 2.0

Riak Crashes with Out of Memory Error #914

Closed: southerncloudz closed this issue 7 years ago

southerncloudz commented 7 years ago

I am running a single-node Riak 2.1.4 server with Bitcask as the backend for storing files. The Riak node stops working about once a week, apparently when the active anti-entropy process recreates the hash trees. The syslog shows an out-of-memory error, but console.log shows "sst: No such file or directory".

Syslog error:

Apr 26 17:39:37 TLCCBAPRO2 kernel: Out of memory: Kill process 16685 (beam.smp) score 824 or sacrifice child
Apr 26 17:39:37 TLCCBAPRO2 kernel: Killed process 16987 (sh) total-vm:106168kB, anon-rss:116kB, file-rss:0kB
Apr 26 17:39:41 TLCCBAPRO2 kernel: Out of memory: Kill process 16685 (beam.smp) score 824 or sacrifice child
Apr 26 17:39:41 TLCCBAPRO2 kernel: Killed process 30374 (memsup) total-vm:4112kB, anon-rss:80kB, file-rss:0kB
Apr 26 17:39:41 TLCCBAPRO2 kernel: Out of memory: Kill process 16685 (beam.smp) score 824 or sacrifice child
Apr 26 17:39:41 TLCCBAPRO2 kernel: Killed process 14351 (cpu_sup) total-vm:4112kB, anon-rss:68kB, file-rss:0kB
Apr 26 17:39:41 TLCCBAPRO2 kernel: Out of memory: Kill process 16685 (beam.smp) score 824 or sacrifice child
Apr 26 17:39:41 TLCCBAPRO2 kernel: Killed process 30385 (sh) total-vm:106164kB, anon-rss:136kB, file-rss:416kB
Apr 26 17:44:48 TLCCBAPRO2 run_erl[16682]: Erlang closed the connection.
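For anyone triaging a similar syslog, here is a rough Python sketch that pairs the process the OOM killer selected with the processes it actually killed. The function name `summarize_oom` and the dictionary keys are made up for illustration, and the regular expressions are written only against the message format quoted above; other kernel versions may word these lines differently.

```python
import re

# Patterns written against the kernel messages quoted in this issue only.
SELECTED_RE = re.compile(
    r"Out of memory: Kill process (\d+) \(([^)]+)\) score (\d+)"
)
KILLED_RE = re.compile(
    r"Killed process (\d+) \(([^)]+)\) "
    r"total-vm:(\d+)kB, anon-rss:(\d+)kB, file-rss:(\d+)kB"
)


def summarize_oom(text):
    """Return (selected, killed) parsed from raw syslog text.

    selected: list of (pid, name, score) tuples the OOM killer targeted.
    killed:   list of dicts describing the processes that were killed.
    Works whether the entries are one per line or run together.
    """
    selected = [
        (int(m.group(1)), m.group(2), int(m.group(3)))
        for m in SELECTED_RE.finditer(text)
    ]
    killed = [
        {
            "pid": int(m.group(1)),
            "name": m.group(2),
            "total_vm_kb": int(m.group(3)),
            "anon_rss_kb": int(m.group(4)),
            "file_rss_kb": int(m.group(5)),
        }
        for m in KILLED_RE.finditer(text)
    ]
    return selected, killed


# First four kernel lines from the syslog excerpt in this issue.
SAMPLE = (
    "Apr 26 17:39:37 TLCCBAPRO2 kernel: Out of memory: Kill process 16685 (beam.smp) score 824 or sacrifice child\n"
    "Apr 26 17:39:37 TLCCBAPRO2 kernel: Killed process 16987 (sh) total-vm:106168kB, anon-rss:116kB, file-rss:0kB\n"
    "Apr 26 17:39:41 TLCCBAPRO2 kernel: Out of memory: Kill process 16685 (beam.smp) score 824 or sacrifice child\n"
    "Apr 26 17:39:41 TLCCBAPRO2 kernel: Killed process 30374 (memsup) total-vm:4112kB, anon-rss:80kB, file-rss:0kB\n"
)

selected, killed = summarize_oom(SAMPLE)
for pid, name, score in selected:
    print(f"selected pid={pid} name={name} score={score}")
for proc in killed:
    print(f"killed pid={proc['pid']} name={proc['name']} anon_rss={proc['anon_rss_kb']}kB")
```

On the excerpt above, this makes it easier to see that the kernel keeps selecting beam.smp (pid 16685) while the processes reported as killed are small helpers such as sh and memsup.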

Console.log:

2017-04-26 17:37:03.493 [info] <0.625.0>@riak_kv_vnode:maybe_create_hashtrees:227 riak_kv/91343852333181432387730302044767688728495783936: unable to start index_hashtree: {error,{{badmatch,{error,{db_open,"IO error: ./data/anti_entropy/91343852333181432387730302044767688728495783936/sst_0/001954.sst: No such file or directory"}}},[{hashtree,new_segment_store,2,[{file,"src/hashtree.erl"},{line,675}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_hashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,610}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,474}]},{riak_kv_index_hashtree,init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,268}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}}
2017-04-26 17:37:03.515 [error] <0.30178.2881> CRASH REPORT Process <0.30178.2881> with 0 neighbours exited with reason: no match of right hand value {error,{db_open,"IO error: ./data/anti_entropy/936274486415109681974235595958868809467081785344/000037.sst: No such file or directory"}} in hashtree:new_segment_store/2 line 675 in gen_server:init_it/6 line 328
2017-04-26 17:37:03.515 [info] <0.623.0>@riak_kv_vnode:maybe_create_hashtrees:227 riak_kv/45671926166590716193865151022383844364247891968: unable to start index_hashtree: {error,{{badmatch,{error,{db_open,"IO error: ./data/anti_entropy/45671926166590716193865151022383844364247891968/sst_0/002239.sst: No such file or directory"}}},[{hashtree,new_segment_store,2,[{file,"src/hashtree.erl"},{line,675}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_hashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,610}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,474}]},{riak_kv_index_hashtree,init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,268}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}}
2017-04-26 17:37:03.516 [error] <0.30207.2881> CRASH REPORT Process <0.30207.2881> with 0 neighbours exited with reason: no match of right hand value {error,{db_open,"IO error: ./data/anti_entropy/45671926166590716193865151022383844364247891968/sst_0/002239.sst: No such file or directory"}} in hashtree:new_segment_store/2 line 675 in gen_server:init_it/6 line 328
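To see which anti-entropy partitions are affected, the missing .sst paths can be pulled out of console.log and grouped by partition ID. The sketch below is only an illustration: `missing_sst_by_partition` is a hypothetical helper, and the regular expression assumes the exact "IO error: ... No such file or directory" wording shown in the entries above. It reports what the log says and does not diagnose why the files are missing.

```python
import re
from collections import defaultdict

# Assumes the error wording seen in this issue's console.log excerpt.
# \s+ tolerates a line wrap between "or" and "directory".
MISSING_SST_RE = re.compile(
    r'IO error: (\./data/anti_entropy/(\d+)/[^":]*?\.sst): '
    r'No such file or\s+directory'
)


def missing_sst_by_partition(text):
    """Map partition ID -> sorted list of distinct missing .sst paths."""
    found = defaultdict(set)
    for match in MISSING_SST_RE.finditer(text):
        path, partition = match.group(1), match.group(2)
        found[partition].add(path)
    return {partition: sorted(paths) for partition, paths in found.items()}


# Error fragments copied from the console.log entries in this issue;
# the last one repeats to show that duplicates are collapsed.
SAMPLE = (
    '{error,{db_open,"IO error: ./data/anti_entropy/936274486415109681974235595958868809467081785344/000037.sst: No such file or directory"}}\n'
    '{error,{db_open,"IO error: ./data/anti_entropy/45671926166590716193865151022383844364247891968/sst_0/002239.sst: No such file or directory"}}\n'
    '{error,{db_open,"IO error: ./data/anti_entropy/45671926166590716193865151022383844364247891968/sst_0/002239.sst: No such file or directory"}}\n'
)

result = missing_sst_by_partition(SAMPLE)
for partition, paths in result.items():
    print(partition)
    for path in paths:
        print("  ", path)
```

Because the info entry and the crash report for the same hash tree usually name the same file, the set-based grouping keeps each missing file listed once per partition.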

lukebakken commented 7 years ago

Your server needs more memory. In addition, single-node Riak servers should not be used in production. If you wish to discuss this further, please use the riak-users mailing list, and provide more detail.

southerncloudz commented 7 years ago

Hi lukebakken,

We have only 2013 keys, and the system has 16 GB of RAM and a 50 GB hard drive. I don't think the hardware is undersized for 2013 keys. On another machine we tested with 15000+ similar keys and did not see any problems.

P.S. I have also sent an email to the riak-users mailing list.