resilient data folder management / repacking chunk_storage

ThomasBlock commented 1 year ago

@ldmberman you gave useful feedback on ticket #384 and #385 , but did not further reply in the last 90 days.. on discord my questions are also overlooked. so maybe you can close that and i will rephrase the question:

What is the best way to have a resilient node with many blocks and many TB of the weave? This leads to two questions: a) how to download from the internet with maximum speed b) how to recover a node when it is broken ( e.g. when chunk_storage and rocksdb folder are out of sync, lots of error messages etc )

This is my summary for point a) - anything worth adding to the list?

strong computer ( CPU, RAM, Disk-Speed, Internet )
increase linux open file limit, ext4 large_dir
open port 1984 and static IP, well chosen peers
max_connections 10000
sync_jobs 1000
header_sync_jobs 100
disk_pool_jobs 100
disable packing ( low cpu usage - but you can not mine with the chunks )
requests_per_minute_limit 5000000 ( optimum upload speed )

For b) one could use Backups or Snapshots of the filesystem. ( shutdown Arweave, backup, restart ) But maybe even more effective is just to have at two nodes on separate machines. So i call this "cloning". As one machine should also be mining, i am looking for the most effective way to clone while also packing.

So at the moment i am using this additional parameters on the second machine:

linux: activate huge pages
enable randomx_jit / randomx_large_pages / randomx_hardware_aes
enable packing
packing_rate 240
randomx_bulk_hashing_iterations 40

My additional questions: c) As i have a Ryzen 9 5900X with 24 cores, there will be a warning at boot if i set packing_rate > 240 . ( "The number of cores on your machine (24) is not sufficient for packing 250 chunks per second." ) But both computers are only under 30% load. Can i increase this value?

d) I have 10 Gbit Ethernet between the nodes - but the cloning only takes place with constant 40 MByte/s ( packing_rate 240 should imply 60MByte/s i guess? ). So what might i missing, which other config value might limit the snyc speed?

e) why are two cloning nodes packing anyway? if i spotted that correctly in the sourececode, chunks in 2.5.3. are packed with a random key.. is there a way that the second node just keeps the packed original chunk, without repacking?

f) if i have only one node and rocksdb is completely broken, but i have 100 TB of packed data in chunk_storage.. can i somehow recover that data and import it to another data folder ? re-downloading the block headers can be done in 2 days.. but packing 100 TB is slow..

ldmberman commented 1 year ago

Hi @ThomasBlock,

There are a lot of improvements in the upcoming release (soon we'll publish the up-to-date version with fork switches disabled for now that you can already use on mainnet). Specifically, the storage will be split across modules of a configurable size. Every module syncs its own configured data range and has its own folder with its RocksDB and chunk_storage files. If something happens to a module and you cannot repair it, you can simply remove it from the config or point the node to a new module where it will re-sync the corresponding range to.

But both computers are only under 30% load. Can i increase this value?

Yes, you may ignore this warning, but the low load is probably not due to the specified rate being low.

I have 10 Gbit Ethernet between the nodes - but the cloning only takes place with constant 40 MByte/s

There is a default limit of 50 MB/s for chunks served to a single IP address that you might be hitting here. You can make two nodes sync faster from each other by increasing this limit in their configurations.

The config may look like

{"requests_per_minute_limit_by_ip":{"X.X.X.X":{"chunk":24000}}}

24000 (chunks/s) is double the default amount. Point the node to it via config_file config.json.

why are two cloning nodes packing anyway? if i spotted that correctly in the sourececode, chunks in 2.5.3. are packed with a random key.. is there a way that the second node just keeps the packed original chunk, without repacking?

Every chunk has its own packing key, but packed chunks are reusable across nodes. Your packing activity might originate from new chunks posted to your node. I would just wait and see how it works in the new version for you here.

if i have only one node and rocksdb is completely broken, but i have 100 TB of packed data in chunk_storage..

There is no good way to restore it unfortunately, although it is possible in theory. Also, note that after the hard fork a new packing is required anyway.

ThomasBlock commented 1 year ago

Hi @ldmberman . Thank you for the quick and extensive feedback. Great to hear about the new Version. I am also running the testnet - the module system seems very useful and is working so far.

Ah okay i added requests_per_minute_limit_by_ip, which should now be 200MB/s ( checked also syntax here https://github.com/ArweaveTeam/arweave/blob/6c5448df3923fa4012427fb07c31a38beb6108de/apps/arweave/test/ar_config_tests_config_fixture.json ).

Config file is read without error. But i am still only syncing with 40MB/s .. so can i check somehow the actual value limits? or do you have any other ideas?

"requests_per_minute_limit":1000000,
"requests_per_minute_limit_by_ip":{
"192.168.17.9":{"chunk":48000,"data_sync_record": 200000,"recent_hash_list_diff": 200000,"default":200000},
"192.168.178.109":{"chunk":48000,"data_sync_record": 200000,"recent_hash_list_diff": 200000,"default":200000}
},

left: receiver ( no internet ). right: sender ( cpu spikes = download from internet ) stats

receiver: speed

ldmberman commented 1 year ago

The same limit is also configured on the serving side so you are likely hitting it. Ideally we should implement some form of explicit control of where to sync the chunks from.

ThomasBlock commented 1 year ago

Yes i was only speaking of the serving side. but also increased the client side. So here again my setup:

my "receiver" is a machine with no IP gateway, so no internet access. ip: 192.168.17.9
i have one sender in the same lan subnet: 192.168.17.1
the receiver has only one peer: 192.168.17.1
the receiver 192.168.17.9 can boot very fast from 192.168.17.1 and download with 40 MB/s
no other nodes are involved

each machine has the limit increased for the other machine:

config of 192.168.17.1 :
"requests_per_minute_limit":1000000,
"requests_per_minute_limit_by_ip":{
"192.168.17.9":{"chunk":48000,"data_sync_record": 200000,"recent_hash_list_diff": 200000,"default":200000},
},

config of 192.168.17.9 :
"peers": ["192.168.17.1:1984"],
"requests_per_minute_limit":1000000,
"requests_per_minute_limit_by_ip":{
"192.168.17.1":{"chunk":48000,"data_sync_record": 200000,"recent_hash_list_diff": 200000,"default":200000},
},

so there is still either a upload limit on 192.168.17.1 or a download limit on 192.168.17.9

any ideas @ldmberman ?

ThomasBlock commented 1 year ago

@ldmberman could you please look into this?

i did again a test, now with three nodes. findings:

Node C only downloads from ONE other node, as you said - either Node A or Node B.
each Download is max 40MB/s, regardless of requests_per_minute_limit_by_ip
it is possible that Node B AND node Node C download from Node A. So we have still 40MB/s download limit - but Node A can do 80 MB/s upload

so either requests_per_minute_limit_by_ip is ignored ( how could i check that? ) or there might be another barrier?

40MB/s = 320 MBit/s in picture

ArweaveTeam / arweave

resilient data folder management / repacking chunk_storage #390