ArweaveTeam / arweave

The Arweave server and App Developer Toolkit.
https://www.arweave.org
GNU General Public License v2.0
918 stars 202 forks source link

[error] gen_server.erl:949 Generic server ar_chunk_storage terminating. Reason: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,...},{gen_server,...}|...] #385

Open ThomasBlock opened 2 years ago

ThomasBlock commented 2 years ago

Hi Lev, thank you for your continous work here. As i use arweave more and more, i discover more and more errors. i check the old issues, but my errors seem all to be "new". are you interested in all of them? or should we accept the fact that there are problems from time to time..

This crash leaves arweave in a "half open" state - process still running but not responding to http request or doing anything else. But luckily it can be fixed with a restart.

2022-08-29T08:02:42.088748+02:00 [error] gen_server.erl:949 Generic server ar_chunk_storage terminating. Reason: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,...},{gen_server,...}|...]}. Last message: {put,33385181651190,<<253,208,126,154,145,28,235,77,206,48,192,...>>}. State: {state,#{89030393856000 =
> <<"8903"...>>,92283076608000 => <<...>>,...}}. Client ar_data_sync stacktrace: [{gen,do_call,4,[{...}|...]},{gen_server,call,2,[...]},{ar_chunk_storage,put,2,...},{ar_data_sync,write_not_b...
2022-08-29T08:02:42.093317+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_chunk_storage:init/1, pid: <0.1202.0>, registered_name: ar_chunk_storage, error: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,[...]},{gen_server,try_handle_call,4,...},{gen_server,handle_msg,...},{proc_lib,...}]}, ance
stors: [ar_sup|...], message_queue_len: ..., messages: [{'$gen_call',...}], links: [<0.921.0>|...], dictionary: [{...}|...], trap_exit: true, status: running, heap_size: 318187, stack_size: ...
2022-08-29T08:02:42.094929+02:00 [error] supervisor.erl:736 Supervisor: {local,ar_sup}. Context: child_terminated. Reason: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,...},{gen_server,try_handle_call,...}|...]}. Offender: id=ar_chunk_storage,pid=<0.1202.0>.
2022-08-29T08:02:43.020879+02:00 [info] /root/arweave/apps/arweave/src/ar_data_sync.erl:931 event: terminate, reason: {{case_clause, {'EXIT', {{{badmatch,{error,timeout}}, [{ar_chunk_storage,handle_call,3, [{file,"/root/arweave/apps/arweave/src/ar_chunk_storage.erl"}, {line,140}]}, {gen_server,try_handle_call,4, [{fi
le,"gen_server.erl"},{line,721}]}, {gen_server,handle_msg,6,[{file,"gen_server.erl"},...
2022-08-29T08:02:43.327186+02:00 [error] : Ranch listener ar_http_iface_listener, connection process <0.9851.80>, stream 1 had its request process <0.11950.81> exit with reason badarg and stacktrace [{rocksdb,iterator_move,[#Ref<0.1779065360.2965504068.66031>,{seek,<<...>>}],[]},{ar_kv,get_next_by_prefix,4,[{file,...
},{...}]},{ar_data_sync,get_chunk,7,[{...}|...]},{ar_http_iface_middleware,handle_get_chunk,3,[...]},{ar_http_iface_middleware,handle,3,...},{ar_http_iface_middleware,'-execute/2-fun-0-',...}]
2022-08-29T08:02:43.326801+02:00 [error] : Error in process <0.29819.80> on node 'arweave@127.0.0.1' with exit value:, {badarg,[{rocksdb,iterator_move,[#Ref<0.1779065360.2965504068.66031>,{seek,<<0,0,...>>}],[]},{ar_kv,get_next_by_prefix,4,[{file,"/root/arweave/apps/arweave/src/ar_kv.erl"},{line,77}]},{ar_data_sync,g
et_chunk,7,[{file,[...]},{line,...}]},{ar_http_iface_middleware,handle_get_chunk,3,[{file,...},{...}]},{ar_http_iface_middleware,handle,3,[{...}|...]},{ar_http_iface_middleware,'-execute/2-f...
2022-08-29T08:02:43.337157+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 63366047506679
2022-08-29T08:02:43.338014+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 63366018932983
2022-08-29T08:02:43.351236+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 82759838376183
2022-08-29T08:02:43.351890+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 63365988524279
...
2022-08-29T08:02:43.440096+02:00 [error] gen_server.erl:949 Generic server ar_data_sync terminating. Reason: {{case_clause,{'EXIT',{{...},...}}},[{ar_data_sync,store_chunk,3,...},{ar_data_sync,handle_info,...},{gen_server,...},{...}|...]}. Last message: {event,chunk,{packed,33385122406646,{spora_2_5,<<128,65,121,245,
9,179,186,...>>,33385122406646,<<"^W¶Þ$~~Íw]"...>>,262144}}}. State: {sync_data_state,[{<<";B"...>>,96711997694198,...},{<<...>>,...},{...}|...],96711997694198,{#Ref<0.1779065360.2965241922....
2022-08-29T08:02:43.442094+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_data_sync:init/1, pid: <0.1205.0>, registered_name: ar_data_sync, error: {{case_clause,{'EXIT',{{{...},...},{...}}}},[{ar_data_sync,store_chunk,3,[...]},{ar_data_sync,handle_info,2,...},{gen_server,try_dispatch,...},{gen_server,...},{
...}]}, ancestors: [ar_sup|...], message_queue_len: ..., messages: [{...}|...], links: [<0.921.0>|...], dictionary: [], trap_exit: true, status: running, heap_size: 3581853, stack_size: 29, ...
2022-08-29T08:02:43.443371+02:00 [error] supervisor.erl:736 Supervisor: {local,ar_sup}. Context: child_terminated. Reason: {{case_clause,{'EXIT',{{...},...}}},[{ar_data_sync,store_chunk,3,...},{ar_data_sync,handle_info,...}|...]}. Offender: id=ar_data_sync,pid=<0.1205.0>.
...
2022-08-29T08:03:27.360992+02:00 [notice] application_controller.erl:2119 Application: arweave. Exited: shutdown. Type: permanent.
     Loaded: loaded (/etc/systemd/system/arweave.service; disabled; vendor preset: enabled)
     Active: active (running) since Sun 2022-08-28 21:11:55 CEST; 11h ago
    Process: 2135378 ExecStartPre=/sbin/sysctl -w vm.nr_hugepages=1250 (code=exited, status=0/SUCCESS)
   Main PID: 2135379 (bash)
      Tasks: 126 (limit: 629145)
     Memory: 296.1G
        CPU: 6d 20h 44min 48.397s
     CGroup: /system.slice/arweave.service
             ├─2135379 bash /opt/arweave/bin/start config_file /opt/arweave/config.json
             ├─2154675 /opt/arweave/bin/arweave -Bd -MBas aobf -MBlmbcs 512 -Ktrue -A20 -SDio20 -sbwtvery_long -sbwtdcpuvery_long -sbwtdiovery_long -swtvery_>
             ├─2154748 erl_child_setup 1048576
             ├─2154837 sh -s disksup
             ├─2154838 /opt/arweave/lib/os_mon-2.7.1/priv/bin/memsup
             ├─2154839 /opt/arweave/lib/os_mon-2.7.1/priv/bin/cpu_sup
             ├─2154841 inet_gethost 4
             └─2154842 inet_gethost 4
ldmberman commented 2 years ago

Hi, timeouts are often temporary and recoverable, but failed_to_instantiate_rocksdb_iterator requires a restart. Could you give some more details about your setup (which disks and filesystems do you use, how many MB/s does the node download)?

ThomasBlock commented 2 years ago

Thank you for your fast reply! yes it recovers.. but now ten hours there is a similar error: ( LOCK: No locks available )

Aug 29 17:27:50 lauch start[2166672]: 2022-08-29T17:27:50.275401+02:00 [error] gen_server.erl:949 Generic server ar_chunk_storage terminating. Reason: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,[{file,"/root/arweave/apps/arweave/src/ar_chunk_storage.erl"},{line,140}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,721}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,750}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}. Last message: {put,90771323920630,<<236,228,47,223,15,180,216...
Aug 29 17:27:50 lauch start[2166672]: 2022-08-29T17:27:50.278265+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_chunk_storage:init/1, pid: <0.25695.198>, registered_name: ar_chunk_storage, error: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,[{file,"/root/arweave/apps/arweave/src/ar_chunk_storage.erl"},{line,140}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,721}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,750}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancest...
Aug 29 17:27:50 lauch start[2166672]: 2022-08-29T17:27:50.279380+02:00 [error] supervisor.erl:736 Supervisor: {local,ar_sup}. Context: child_terminated. Reason: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,[{file,"/root/arweave/apps/arweave/src/ar_chunk_storage.erl"},{line,140}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,721}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,750}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}. Offender: id=ar_chunk_storage,pid=<0.25695.198>.
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.620953+02:00 [error] : Error in process <0.11227.269> on node 'arweave@127.0.0.1' with exit value:, {badarg,[{rocksdb,iterator_move,[#Ref<0.3972060598.3831889989.231906>,{seek,<<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11,20,148,255,214,122>>}],[]},{ar_kv,get_next_by_prefix,4,[{file,"/root/arweave/apps/arweave/src/ar_kv.erl"},{line,77}]},{ar_data_sync,get_chunk,7,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1016}]},{ar_http_iface_middleware,handle_get_ch...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.621162+02:00 [error] : Ranch listener ar_http_iface_listener, connection process <0.13087.269>, stream 1 had its request process <0.21672.268> exit with reason badarg and stacktrace [{rocksdb,iterator_move,[#Ref<0.3972060598.3831889989.231906>,{seek,<<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11,20,148,255,214,122>>}],[]},{ar_kv,get_next_by_prefix,4,[{file,"/root/arweave/apps/arweave/src/ar_kv.erl"},{line,77}]},{ar_data_sync,get_chunk,7,[{file,"/root/arweave/apps/arweave/sr...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.620886+02:00 [error] gen_server.erl:949 Generic server ar_data_sync terminating. Reason: {{case_clause,{'EXIT',{{{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,[{...}|...]},{gen_server,try_handle_call,4,[...]},{gen_server,handle_msg,6,...},{proc_lib,init_p_do_apply,...}]},{gen_server,call,[ar_chunk_storage,{put,90771305308406,...}]}}}},[{ar_data_sync,store_chunk,3,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,2294}]},{ar_data_sync,handle_info,2,[{file,"...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.628795+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_data_sync:init/1, pid: <0.1209.0>, registered_name: ar_data_sync, error: {{case_clause,{'EXIT',{{{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,[{file,"/root/arweave/apps/arweave/src/ar_chunk_storage.erl"},{line,140}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,721}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,750}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.630469+02:00 [error] supervisor.erl:736 Supervisor: {local,ar_sup}. Context: child_terminated. Reason: {{case_clause,{'EXIT',{{{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,...},{gen_server,try_handle_call,...},{gen_server,...},{...}]},{gen_server,call,[ar_chunk_storage,{...}]}}}},[{ar_data_sync,store_chunk,3,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,2294}]},{ar_data_sync,handle_info,2,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.637107+02:00 [error] supervisor.erl:846 Supervisor: {local,ar_sup}. Context: start_error. Reason: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,232}]},{gen...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.637217+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_data_sync:init/1, pid: <0.9600.269>, registered_name: [], error: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_da...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.642980+02:00 [error] supervisor.erl:846 Supervisor: {local,ar_sup}. Context: start_error. Reason: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,232}]},{gen...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.643069+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_data_sync:init/1, pid: <0.19074.269>, registered_name: [], error: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_d...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.648916+02:00 [error] supervisor.erl:846 Supervisor: {local,ar_sup}. Context: start_error. Reason: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,232}]},{gen...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.649022+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_data_sync:init/1, pid: <0.4430.269>, registered_name: [], error: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_da...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.654200+02:00 [error] supervisor.erl:846 Supervisor: {local,ar_sup}. Context: start_error. Reason: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,232}]},{gen...
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.654566+02:00 [error] supervisor.erl:805 Supervisor: {local,ar_sup}. Context: shutdown. Reason: reached_max_restart_intensity. Offender: id=ar_data_sync,pid={restarting,<0.1209.0>}.
Aug 29 17:27:51 lauch start[2166672]: 2022-08-29T17:27:51.654293+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_data_sync:init/1, pid: <0.5702.269>, registered_name: [], error: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661771585 acquiring thread 140643696051776: /media/nvm6/weave/rocksdb/ar_data_sync_db/LOCK: No locks available"}}},[{ar_data_sync,init_kv,0,[{file,"/root/arweave/apps/arweave/src/ar_data_sync.erl"},{line,1256}]},{ar_data_sync,init,1,[{file,"/root/arweave/apps/arweave/src/ar_da...
Aug 29 17:27:52 lauch start[2166672]: 2022-08-29T17:27:52.765634+02:00 [error] proc_lib.erl:525 crasher: initial call: gun:proc_lib_hack/5, pid: <0.2654.269>, registered_name: [], error: {undef,[{erlang,get_stacktrace,[],[]},{gun,proc_lib_hack,5,[{file,"/root/arweave/_build/default/lib/gun/src/gun.erl"},{line,654}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [gun_sup,<0.750.0>], message_queue_len: 0, messages: [], links: [<0.751.0>], dictionary: [], trap_exit: false, status: running, heap_size: 610, stack_...
Aug 29 17:27:52 lauch start[2166672]: 2022-08-29T17:27:52.767259+02:00 [error] supervisor.erl:751 Supervisor: {local,gun_sup}. Context: child_terminated. Reason: {undef,[{erlang,get_stacktrace,[],[]},{gun,proc_lib_hack,5,[{file,"/root/arweave/_build/default/lib/gun/src/gun.erl"},{line,654}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}. Offender: id=gun,pid=<0.2654.269>.
Aug 29 17:28:32 lauch start[2166672]: 2022-08-29T17:28:32.146339+02:00 [error] supervisor.erl:905 Supervisor: {local,ar_sup}. Context: shutdown_error. Reason: killed. Offender: id=ar_bridge_sup,pid=<0.1162.0>.
Aug 29 17:28:32 lauch start[2166672]: 2022-08-29T17:28:32.146624+02:00 [error] gen_server.erl:949 Generic server ar_bridge terminating. Reason: killed. Last message: {'EXIT',<0.1162.0>,killed}. State: {state,{90988,{{{467,1005406},{221,3,133,83,1984},{<<"{\"new_block\":{\"usd_to_ar_rat"...>>,{block,<<92,13,121,6,...>>,<<214,159,162,...>>,1661771818,1661770320,115792089211204725463925141277326392909849545169910489760771025478983365302860,...},30008951316497}},{{{409,1005406},{125,73,124,48,1984},{<<"{\"new_block\":{\"usd_t"...>>,{block,<<"\\...
Aug 29 17:28:32 lauch start[2166672]: 2022-08-29T17:28:32.269405+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_bridge:init/1, pid: <0.1163.0>, registered_name: ar_bridge, exit: {killed,[{gen_server,decode_msg,9,[{file,"gen_server.erl"},{line,481}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [ar_bridge_sup,ar_sup,<0.800.0>], message_queue_len: 0, messages: [], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 833026, stack_size: 29, reductions: 61419816; neighbours: []
Aug 29 17:28:32 lauch start[2166672]: 2022-08-29T17:28:32.533476+02:00 [error] gen_server.erl:949 Generic server ar_block_propagation_worker8 terminating. Reason: killed. Last message: {'EXIT',<0.1162.0>,killed}. State: {state}.

and berfore there were hundreds of this new error messages today:

2022-08-29T11:48:19.847061+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 89488764477687
2022-08-29T11:48:19.847355+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 22596763372121

This node is my strongest, as i want to download the full weave. Dual EPYC CPU, 600 GB RAM, 3 TB Enterprise NVMe for data folder, ext4, Ubuntu22. only chunk_storage is symlinked to a nfs server ( HDD drives RAID6 as ext4 ). I have 100% blocks and 34 TB weave, 10000 peers.. the download never reaches more than 40MB/s .. so i get 1-2 TB/day .. this is not much considering that the system uses sometimes 20 cores, 300 GB RAM and has a Gigabit Connection. any ideas how to increase this are highly welcome..

on the other hand i guess i should not put too much stress on the HDDs? i often see these messages which i did not see on a ssd chunk_storage:

Aug 11 20:04:28 lauch nice[1086191]: 2022-08-11T20:04:28.454430+02:00 [error] /root/arweave/apps/arweave/src/ar_data_sync.erl:2312 event: failed_to_store_chunk, reason: timeout, absolute_chunk_offset: 49337222602998, data_root: <<"VJezFBIS1L524K4ECf77JY5wpTTqH3pLjCmmbnQKwyw">>
Aug 11 20:05:14 lauch nice[1086191]: 2022-08-11T20:05:14.838504+02:00 [error] /root/arweave/apps/arweave/src/ar_data_sync.erl:2312 event: failed_to_store_chunk, reason: timeout, absolute_chunk_offset: 49338902946038, data_root: <<"-7hgn-CktYE6MVrRXdofPJ0bmYHWSAykyHlV73QkC0U">>

my config:

"max_connections" : 4000,
"sync_jobs":600,
"header_sync_jobs":1,
"packing_rate":100

mining is false, fs.file-max = 4194304, vm.nr_hugepages=1250

ThomasBlock commented 2 years ago

update on this: one more error failed_to_instantiate_rocksdb_iterator / IO error: lock hold by current process but arweave restarted itself and is now working fine again. here some logs

2022-08-30T07:40:05.404606+02:00 [error] gen_server.erl:949 Generic server ar_chunk_storage terminating. Reason: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,...},{gen_server,...}|...]}. Last message: {put,46375135650038,<<214,238,143,17,216,197,211,240,150,138,...>>}. State: {state,#{89030393856000 => 
<<"8903"...>>,92283076608000 => <<...>>,...}}. Client ar_data_sync stacktrace: [{gen,do_call,4,[{...}|...]},{gen_server,call,2,[...]},{ar_chunk_storage,put,2,...},{ar_data_sync,write_not_bla...
2022-08-30T07:40:05.415574+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_chunk_storage:init/1, pid: <0.1208.0>, registered_name: ar_chunk_storage, error: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,[...]},{gen_server,try_handle_call,4,...},{gen_server,handle_msg,...},{proc_lib,...}]}, ance
stors: [ar_sup|...], message_queue_len: ..., messages: [{'$gen_call',...}], links: [<0.924.0>|...], dictionary: [{...}|...], trap_exit: true, status: running, heap_size: 121536, stack_size: ...
2022-08-30T07:40:05.425486+02:00 [error] supervisor.erl:736 Supervisor: {local,ar_sup}. Context: child_terminated. Reason: {{badmatch,{error,timeout}},[{ar_chunk_storage,handle_call,3,...},{gen_server,try_handle_call,...}|...]}. Offender: id=ar_chunk_storage,pid=<0.1208.0>.
2022-08-30T07:40:06.206464+02:00 [info] /root/arweave/apps/arweave/src/ar_data_sync.erl:931 event: terminate, reason: {{case_clause, {'EXIT', {{{badmatch,{error,timeout}}, [{ar_chunk_storage,handle_call,3, [{file,"/root/arweave/apps/arweave/src/ar_chunk_storage.erl"}, {line,140}]}, {gen_server,try_handle_call,4, [{fi
le,"gen_server.erl"},{line,721}]}, {gen_server,handle_msg,6,[{file,"gen_server.erl"},...
2022-08-30T07:40:06.577643+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 91035148263671
2022-08-30T07:40:06.589557+02:00 [warning] /root/arweave/apps/arweave/src/ar_kv.erl:71 event: failed_to_instantiate_rocksdb_iterator, offset: 91034420814071
2022-08-30T07:40:09.929928+02:00 [error] : Ranch listener ar_http_iface_listener, connection process <0.3240.243>, stream 1 had its request process <0.18675.243> exit with reason badarg and stacktrace [{rocksdb,get,[#Ref<0.2575953317.208797767.249678>,#Ref<0.2575953317.208797767.249683>,<<"{àð1"...>>,[]],[]},{ar_data
_sync,get_tx_data,5,[{file,...},{...}]},{ar_storage,read_tx_data_from_kv_storage,1,[{...}|...]},{ar_storage,read_tx2,1,[...]},{ar_http_iface_middleware,handle_get_tx,3,...},{ar_http_iface_mi...
2022-08-30T07:40:09.929800+02:00 [error] : Error in process <0.10035.243> on node 'arweave@127.0.0.1' with exit value:, {badarg,[{rocksdb,get,[#Ref<0.2575953317.208797767.249678>,#Ref<0.2575953317.208797767.249683>,<<"{àð13"...>>,[]],[]},{ar_data_sync,get_tx_data,5,[{file,[...]},{line,...}]},{ar_storage,read_tx_data_
from_kv_storage,1,[{file,...},{...}]},{ar_storage,read_tx2,1,[{...}|...]},{ar_http_iface_middleware,handle_get_tx,3,[...]},{ar_http_iface_middleware,handle,3,...},{ar_http_iface_middleware,'...
2022-08-30T07:40:10.887284+02:00 [error] supervisor.erl:846 Supervisor: {local,ar_sup}. Context: start_error. Reason: {{badmatch,{error,{db_open,...}}},[{ar_data_sync,init_kv,0,...},{ar_data_sync,init,...},{gen_server,...}|...]}. Offender: id=ar_data_sync,pid=<0.1211.0>.
2022-08-30T07:40:10.887490+02:00 [error] proc_lib.erl:525 crasher: initial call: ar_data_sync:init/1, pid: <0.24763.242>, registered_name: [], error: {{badmatch,{error,{db_open,"IO error: lock hold by current process, acquire time 1661823973 acquiring thread 139912577357376: /media/nvm6/weave/rocksdb/ar_data_sync_dat
a_root_i"...}}},[...]}, ancestors: [ar_sup|...], message_queue_len: ..., messages: [{...}|...], links: [<0.924.0>], dictionary: [], trap_exit: true, status: running, heap_size: 28690, stack_...