kaspanet / rusty-kaspa

Kaspa full-node and related libraries in the Rust programming language. This is a Beta version at the final testing phases.

Investigate and document hardware setups that optimize disk write and lifespan #441

Open coderofstuff opened 1 month ago

coderofstuff commented 1 month ago

Higher BPS will require significantly more writes to storage media. We should document ways to optimize node setups so as to maximize disk lifespan and performance when we move to higher BPS.

callid0n commented 1 month ago

I've been running a comparison for 4 days 23.5 hours and am still seeing some nice disk write savings. These were two separate VMs on a single Windows 11 client Hyper-V host. Both VMs were running Windows 11 Pro, with 56 GB RAM and 6 virtual CPUs from a 12th Gen Intel(R) Core(TM) i5-12600K. I was using PrimoCache on both, with a 36 GB L1 RAM cache to minimize wear on the underlying SSD, which is a Samsung SSD 980 PRO 2TB.

Here's the information from the node running latest:

Pruning times:

2024-03-20 20:38:21.829-05:00 [INFO ] Starting Header and Block pruning...
2024-03-20 22:06:12.328-05:00 [INFO ] Header and Block pruning completed: traversed: 491080, pruned 451994

2024-03-21 08:29:34.966-05:00 [INFO ] Starting Header and Block pruning...
2024-03-21 09:56:16.567-05:00 [INFO ] Header and Block pruning completed: traversed: 486725, pruned 447496

2024-03-21 20:24:23.978-05:00 [INFO ] Starting Header and Block pruning...
2024-03-21 21:55:42.432-05:00 [INFO ] Header and Block pruning completed: traversed: 476662, pruned 437246

2024-03-22 08:32:00.561-05:00 [INFO ] Starting Header and Block pruning...
2024-03-22 10:03:31.269-05:00 [INFO ] Header and Block pruning completed: traversed: 474666, pruned 435721

2024-03-22 20:38:30.297-05:00 [INFO ] Starting Header and Block pruning...
2024-03-22 22:12:27.052-05:00 [INFO ] Header and Block pruning completed: traversed: 473133, pruned 434034

2024-03-23 08:33:38.980-05:00 [INFO ] Starting Header and Block pruning...
2024-03-23 10:08:09.169-05:00 [INFO ] Header and Block pruning completed: traversed: 478005, pruned 438769

2024-03-23 20:22:44.132-05:00 [INFO ] Starting Header and Block pruning...
2024-03-23 22:03:44.303-05:00 [INFO ] Header and Block pruning completed: traversed: 498908, pruned 459764

2024-03-24 08:14:37.256-05:00 [INFO ] Starting Header and Block pruning...
2024-03-24 09:49:12.724-05:00 [INFO ] Header and Block pruning completed: traversed: 474671, pruned 435798

2024-03-24 20:07:41.664-05:00 [INFO ] Starting Header and Block pruning...
2024-03-24 21:41:01.881-05:00 [INFO ] Header and Block pruning completed: traversed: 472693, pruned 433506

2024-03-25 07:58:44.050-05:00 [INFO ] Starting Header and Block pruning...
2024-03-25 09:36:10.700-05:00 [INFO ] Header and Block pruning completed: traversed: 473140, pruned 434227

Performance metrics:

2024-03-25 15:25:53.078-05:00 [TRACE] [perf-monitor] process metrics: RAM: 9406656512 (9.41GB), VIRT: 19268296704 (19.27GB), FD: 4423, cores: 6, total cpu usage: 3.4575
2024-03-25 15:26:03.083-05:00 [TRACE] [perf-monitor] disk io metrics: read: 17984155275801 (18TB), write: 7090310564916 (7TB), read rate: 11818591.963 (12MB/s), write rate: 2565557.277 (3MB/s)

And here's the data from the optimization branch: https://github.com/biryukovmaxim/rusty-kaspa/tree/rocksdb-optimizations

Pruning times:

2024-03-20 20:50:50.188-05:00 [INFO ] Starting Header and Block pruning...
2024-03-20 22:55:17.354-05:00 [INFO ] Header and Block pruning completed: traversed: 491080, pruned 451994

2024-03-21 08:31:58.413-05:00 [INFO ] Starting Header and Block pruning...
2024-03-21 10:41:09.741-05:00 [INFO ] Header and Block pruning completed: traversed: 486725, pruned 447496

2024-03-21 20:26:27.425-05:00 [INFO ] Starting Header and Block pruning...
2024-03-21 22:33:46.521-05:00 [INFO ] Header and Block pruning completed: traversed: 476662, pruned 437246

2024-03-22 08:33:24.643-05:00 [INFO ] Starting Header and Block pruning...
2024-03-22 11:16:26.402-05:00 [INFO ] Header and Block pruning completed: traversed: 474666, pruned 435721

2024-03-22 20:40:42.267-05:00 [INFO ] Starting Header and Block pruning...
2024-03-22 22:52:51.090-05:00 [INFO ] Header and Block pruning completed: traversed: 473133, pruned 434034

2024-03-23 08:36:17.572-05:00 [INFO ] Starting Header and Block pruning...
2024-03-23 10:47:14.354-05:00 [INFO ] Header and Block pruning completed: traversed: 478005, pruned 438769

2024-03-23 20:25:07.041-05:00 [INFO ] Starting Header and Block pruning...
2024-03-23 22:44:36.901-05:00 [INFO ] Header and Block pruning completed: traversed: 498908, pruned 459764

2024-03-24 08:15:37.933-05:00 [INFO ] Starting Header and Block pruning...
2024-03-24 10:28:23.155-05:00 [INFO ] Header and Block pruning completed: traversed: 474671, pruned 435798

2024-03-24 20:09:47.030-05:00 [INFO ] Starting Header and Block pruning...
2024-03-24 22:20:14.319-05:00 [INFO ] Header and Block pruning completed: traversed: 472693, pruned 433506

2024-03-25 07:59:28.877-05:00 [INFO ] Starting Header and Block pruning...
2024-03-25 10:15:44.295-05:00 [INFO ] Header and Block pruning completed: traversed: 473140, pruned 434227
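The pruning log lines above can be turned into run durations to compare the two nodes. The following is a minimal sketch, not part of rusty-kaspa: the helper names `millis_of_day` and `pruning_millis` are hypothetical, and the parser assumes the fixed `YYYY-MM-DD HH:MM:SS.mmm±HH:MM` layout seen in these logs, with start and end on the same calendar day.

```rust
/// Parse the time-of-day part of a log timestamp such as
/// "2024-03-20 20:38:21.829-05:00" into milliseconds since midnight.
/// Assumes the fixed layout "YYYY-MM-DD HH:MM:SS.mmm±HH:MM".
fn millis_of_day(ts: &str) -> Option<u64> {
    let time = ts.get(11..23)?; // "HH:MM:SS.mmm"
    let h: u64 = time.get(0..2)?.parse().ok()?;
    let m: u64 = time.get(3..5)?.parse().ok()?;
    let s: u64 = time.get(6..8)?.parse().ok()?;
    let ms: u64 = time.get(9..12)?.parse().ok()?;
    Some(((h * 60 + m) * 60 + s) * 1000 + ms)
}

/// Duration of one pruning run in milliseconds. Only valid when both
/// timestamps share the same calendar day and UTC offset; returns None
/// if parsing fails or the end precedes the start.
fn pruning_millis(start: &str, end: &str) -> Option<u64> {
    millis_of_day(end)?.checked_sub(millis_of_day(start)?)
}

fn main() {
    // First pruning run (2024-03-20) from each node's log.
    let latest = pruning_millis(
        "2024-03-20 20:38:21.829-05:00",
        "2024-03-20 22:06:12.328-05:00",
    )
    .expect("well-formed timestamps");
    let optimized = pruning_millis(
        "2024-03-20 20:50:50.188-05:00",
        "2024-03-20 22:55:17.354-05:00",
    )
    .expect("well-formed timestamps");

    println!("latest:    {:.1} min", latest as f64 / 60_000.0);
    println!("optimized: {:.1} min", optimized as f64 / 60_000.0);
}
```

For that first run, this works out to roughly 88 minutes on latest versus roughly 124 minutes on the optimization branch, for the same number of traversed and pruned blocks. The same calculation can be applied to the other nine pairs.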

Performance metrics:

2024-03-25 15:25:45.586-05:00 [TRACE] [perf-monitor] process metrics: RAM: 9713639424 (9.71GB), VIRT: 24021032960 (24.02GB), FD: 3282, cores: 6, total cpu usage: 3.5519
2024-03-25 15:25:45.586-05:00 [TRACE] [perf-monitor] disk io metrics: read: 18964146280014 (19TB), write: 3772184849200 (4TB), read rate: 12274983.905 (12MB/s), write rate: 1175046.086 (1MB/s)
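To put a number on the write savings, the cumulative byte counters from the two perf-monitor lines above can be compared directly. This is a rough sketch that assumes both counters cover the same observation window (the two nodes were started within minutes of each other, judging by the logs); `percent_change` is a hypothetical helper, not something from the codebase.

```rust
/// Percentage change from `baseline` to `candidate`
/// (negative means the candidate is lower than the baseline).
fn percent_change(baseline: f64, candidate: f64) -> f64 {
    (candidate - baseline) / baseline * 100.0
}

fn main() {
    // Cumulative byte counters copied from the perf-monitor "disk io metrics" lines.
    let (latest_read, latest_write) = (17_984_155_275_801_u64, 7_090_310_564_916_u64);
    let (opt_read, opt_write) = (18_964_146_280_014_u64, 3_772_184_849_200_u64);

    println!(
        "bytes written: {:+.1}%",
        percent_change(latest_write as f64, opt_write as f64)
    );
    println!(
        "bytes read:    {:+.1}%",
        percent_change(latest_read as f64, opt_read as f64)
    );
}
```

On these figures the optimization branch wrote about 47% fewer bytes while reading about 5% more, which is the trade-off to weigh against its longer pruning times.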

callid0n commented 1 month ago

I'm fully synced (running an archival node) on TN11 using the new 256KB RocksDB block size. I'm going to let it run for a while to see if it falls out of sync or anything strange happens.

I currently have five 7200 RPM HDDs configured in a Windows parity Storage Space (essentially RAID 5), created with the PowerShell below.

New-VirtualDisk -StoragePoolFriendlyName PoolName -FriendlyName vDiskNameHere -ProvisioningType Fixed -ResiliencySettingName Parity -UseMaximumSize -NumberOfColumns 5 -Interleave 256KB |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -DriveLetter k -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "KasData" -AllocationUnitSize 1024KB -UseLargeFRS -Confirm:$false

So that's a 256KB interleave and a 1024KB NTFS allocation unit size. I'm using PrimoCache to create a 5GB RAM write cache and a 50GB SSD read cache.

And here is the code ChatGPT gave me to alter the file "\database\src\db\conn_builder.rs" to get the 256KB RocksDB block size.
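One plausible reason for pairing these two sizes, which the comment itself does not spell out, is full-stripe alignment. This sketch assumes a single-parity layout in which one column per stripe holds parity and the remaining columns hold data; `full_stripe_data_bytes` is a hypothetical helper used only for illustration.

```rust
/// Bytes of user data in one full stripe of a single-parity layout:
/// one column per stripe holds parity, the rest hold data.
fn full_stripe_data_bytes(columns: u64, interleave_bytes: u64) -> u64 {
    columns.saturating_sub(1) * interleave_bytes
}

fn main() {
    const KIB: u64 = 1024;

    // Values from the New-VirtualDisk / Format-Volume command above.
    let stripe = full_stripe_data_bytes(5, 256 * KIB);
    let allocation_unit = 1024 * KIB;

    println!(
        "full stripe: {} KiB, allocation unit: {} KiB",
        stripe / KIB,
        allocation_unit / KIB
    );
    // When the allocation unit is a whole multiple of the stripe's data size,
    // a full-cluster write can cover whole stripes rather than partial ones.
    println!("aligned: {}", stripe != 0 && allocation_unit % stripe == 0);
}
```

With 5 columns and a 256KB interleave, the data portion of a stripe is 4 × 256KB = 1024KB, which equals the NTFS allocation unit size chosen here.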


macro_rules! default_opts {
    ($self: expr) => {{
        let mut opts = rocksdb::Options::default();
        if $self.parallelism > 1 {
            opts.increase_parallelism($self.parallelism as i32);
        }

        opts.optimize_level_style_compaction($self.mem_budget);
        let guard = kaspa_utils::fd_budget::acquire_guard($self.files_limit)?;

        // Use 256 KiB data blocks in RocksDB's block-based table format
        // (the RocksDB default block size is 4 KiB).
        let mut block_opts = rocksdb::BlockBasedOptions::default();
        block_opts.set_block_size(256 * 1024);
        opts.set_block_based_table_factory(&block_opts);

        opts.set_max_open_files($self.files_limit);
        opts.create_if_missing($self.create_if_missing);
        Ok((opts, guard))
    }};
}

https://discord.com/channels/599153230659846165/755890250643144788/1223301320769798297

callid0n commented 1 month ago

This config fell out of sync once the amount of data being served from the spinning disks grew too large. I'm still hunting for a spinning-disk archival config that works. https://github.com/kaspanet/rusty-kaspa/issues/441#issuecomment-2027512540