Chia-Network / bladebit

A high-performance k32-only, Chia (XCH) plotter supporting in-RAM and disk-based plotting
Apache License 2.0

Crashes #359

Closed: ukd1 closed this issue 11 months ago

ukd1 commented 11 months ago

I get crashes both with a downloaded release and with a build I compiled myself from commit c669876eff68a29c679e3ff501a3ab656adf30b9.

./bladebit -t 24 -f xxx -c xxx diskplot --cache 4G -t1 /chia/fast/ /chia/plots1/ 
....
....
....
Completed table 6 in 930.30 seconds with 1622601350 entries.
Table 6 I/O wait time: 428.22 seconds.
 Table 6 I/O Metrics:
  Average read throughput 102.79 MiB ( 107.78 MB ) or 0.10 GiB ( 0.11 GB ).
  Total size read: 85099.50 MiB ( 89233.29 MB ) or 83.10 GiB ( 89.23 GB ).
  Total read commands: 196608.
  Average write throughput 760.78 MiB ( 797.73 MB ) or 0.74 GiB ( 0.80 GB ).
  Total size written: 73551.12 MiB ( 77123.94 MB ) or 71.83 GiB ( 77.12 GB ).
  Total write commands: 42313.

Table 7
 Sorting      : Completed in 18.05 seconds.
 Distribution : Completed in 15.51 seconds.
 Matching     : Completed in 19.03 seconds.
 Fx           : Completed in 13.49 seconds.
Completed table 7 in 529.87 seconds with 612971082 entries.
Table 7 I/O wait time: 373.43 seconds.
 Table 7 I/O Metrics:
  Average read throughput 102.48 MiB ( 107.46 MB ) or 0.10 GiB ( 0.11 GB ).
  Total size read: 49284.75 MiB ( 51678.81 MB ) or 48.13 GiB ( 51.68 GB ).
  Total read commands: 196608.
  Average write throughput 784.97 MiB ( 823.10 MB ) or 0.77 GiB ( 0.82 GB ).
  Total size written: 34367.62 MiB ( 36037.07 MB ) or 33.56 GiB ( 36.04 GB ).
  Total write commands: 25602.

Sorting F7 & Writing C Tables
Completed F7 tables in 296.29 seconds.
F7/C Tables I/O wait time: 287.01 seconds.
Finished Phase 1 in 5285.65 seconds ( 88.1 minutes ).
Running Phase 2
Finished marking table 6 in 6.81 seconds.
Table 6 I/O wait time: 0.00 seconds.
*** Crashed! ***
./bladebit(_Z12CrashHandleri+0xaa)[0x56028c9af13a]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f127d161520]
./bladebit(_ZNSt17_Function_handlerIFvP9AnonMTJobEZN14DiskPlotPhase216MarkTableBucketsIL7TableId5ELj256ELb1EEEv20DiskPairAndMapReaderIXT0_EXT1_EEP4PairPm8BitFieldSB_EUlS1_E_E9_M_invokeERKSt9_Any_dataOS1_+0x95)[0x56028ca3eef5]
./bladebit(_ZN11MTJobRunnerI9AnonMTJobLj256EE13RunJobWrapperEPS0_+0x41)[0x56028c9ad991]
./bladebit(_ZN10ThreadPool17FixedThreadRunnerEPv+0x52)[0x56028c99f862]
./bladebit(_ZN6Thread17ThreadStarterUnixEPS_+0x76)[0x56028c9b0176]
/lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7f127d1b3b43]
/lib/x86_64-linux-gnu/libc.so.6(+0x126a00)[0x7f127d245a00]
Dumping crash to crash.log
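
For anyone reading the backtrace above, here is a minimal sketch of how its lines can be split into module, symbol, offset, and address. It assumes the glibc `module(symbol+0xOFF)[0xADDR]` layout seen in this log. The names `Frame`, `parse_frame`, and `demangle` are hypothetical helpers for illustration, not part of bladebit. Demangling is delegated to `c++filt` from binutils when it is installed; otherwise the symbol is returned unchanged.

```python
import re
import shutil
import subprocess
from typing import NamedTuple, Optional

# Matches glibc-style backtrace lines: module(symbol+0xOFF)[0xADDR].
# The symbol part may be empty, as in the libc frames of the log.
FRAME_RE = re.compile(
    r"^(?P<module>[^(]+)"
    r"\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    module: str
    symbol: Optional[str]  # None when the frame carries only an offset
    offset: int
    addr: int


def parse_frame(line: str) -> Optional[Frame]:
    """Return a Frame for a backtrace line, or None for any other log line."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return Frame(
        module=m["module"],
        symbol=m["symbol"] or None,
        offset=int(m["offset"], 16),
        addr=int(m["addr"], 16),
    )


def demangle(symbol: str) -> str:
    """Demangle a C++ symbol with c++filt if available, else return it as-is."""
    tool = shutil.which("c++filt")
    if tool is None:
        return symbol
    result = subprocess.run([tool, symbol], capture_output=True, text=True, check=False)
    return result.stdout.strip() or symbol
```

With `c++filt` installed, `demangle("_Z12CrashHandleri")` should give `CrashHandler(int)`. The frames inside `./bladebit` carry symbol names, while the libc frames only carry offsets, so `parse_frame` returns `symbol=None` for those.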

ukd1 commented 11 months ago

I have 64 GB of RAM and 573 GB of free NVMe space.

It seems to crash every time, e.g.:

Table 6
 Sorting      : Completed in 99.45 seconds.
 Distribution : Completed in 48.67 seconds.
 Matching     : Completed in 41.57 seconds.
 Fx           : Completed in 44.69 seconds.
Completed table 6 in 1203.04 seconds with 1624990720 entries.
Table 6 I/O wait time: 377.90 seconds.
 Table 6 I/O Metrics:
  Average read throughput 80.94 MiB ( 84.87 MB ) or 0.08 GiB ( 0.08 GB ).
  Total size read: 85144.62 MiB ( 89280.61 MB ) or 83.15 GiB ( 89.28 GB ).
  Total read commands: 196608.
  Average write throughput 628.45 MiB ( 658.98 MB ) or 0.61 GiB ( 0.66 GB ).
  Total size written: 73602.25 MiB ( 77177.55 MB ) or 71.88 GiB ( 77.18 GB ).
  Total write commands: 42317.

Table 7
 Sorting      : Completed in 69.55 seconds.
 Distribution : Completed in 15.60 seconds.
 Matching     : Completed in 26.62 seconds.
 Fx           : Completed in 16.14 seconds.
Completed table 7 in 644.87 seconds with 614812621 entries.
Table 7 I/O wait time: 318.32 seconds.
 Table 7 I/O Metrics:
  Average read throughput 88.64 MiB ( 92.94 MB ) or 0.09 GiB ( 0.09 GB ).
  Total size read: 49310.25 MiB ( 51705.54 MB ) or 48.15 GiB ( 51.71 GB ).
  Total read commands: 196608.
  Average write throughput 621.78 MiB ( 651.98 MB ) or 0.61 GiB ( 0.65 GB ).
  Total size written: 34395.25 MiB ( 36066.03 MB ) or 33.59 GiB ( 36.07 GB ).
  Total write commands: 25602.

Sorting F7 & Writing C Tables
*** Crashed! ***
./bladebit(_Z12CrashHandleri+0xaa)[0x55bd97a8291a]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f44ae9a5520]
./bladebit(+0x68a33)[0x55bd97a70a33]
./bladebit(_ZN11MTJobRunnerIN11TableWriter5C3JobELj256EE13RunJobWrapperEPS1_+0x87)[0x55bd97a90067]
./bladebit(_ZN10ThreadPool17FixedThreadRunnerEPv+0x52)[0x55bd97bf86f2]
./bladebit(_ZN6Thread17ThreadStarterUnixEPS_+0x80)[0x55bd97a839d0]
/lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7f44ae9f7b43]
/lib/x86_64-linux-gnu/libc.so.6(+0x126a00)[0x7f44aea89a00]
Dumping crash to crash.log
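
The logs report both a per-table completion time and a per-table I/O wait time, so the share of each table's time spent waiting on the temp drive can be estimated directly. The sketch below is a rough illustration under one loud assumption: that the I/O wait is counted inside the table's completion time, which the log does not state. `io_wait_share` is a hypothetical helper name, not a bladebit function.

```python
import re
from typing import Dict

COMPLETED_RE = re.compile(r"Completed table (\d+) in ([\d.]+) seconds")
IO_WAIT_RE = re.compile(r"Table (\d+) I/O wait time: ([\d.]+) seconds")


def io_wait_share(log: str) -> Dict[int, float]:
    """Map table number -> I/O wait time divided by table completion time.

    Assumes the wait time is included in the completion time (not confirmed
    by the log). Only the first occurrence per table is kept, because
    Phase 2 prints "Table N I/O wait time" again under the same label.
    """
    totals: Dict[int, float] = {}
    waits: Dict[int, float] = {}
    for table, seconds in COMPLETED_RE.findall(log):
        totals.setdefault(int(table), float(seconds))
    for table, seconds in IO_WAIT_RE.findall(log):
        waits.setdefault(int(table), float(seconds))
    return {
        table: waits[table] / total
        for table, total in totals.items()
        if table in waits and total > 0
    }
```

Fed the log above, this gives roughly 0.31 for table 6 (377.90 / 1203.04) and 0.49 for table 7 (318.32 / 644.87). Under the stated assumption, that means a third to a half of each table's time went to waiting on I/O.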

ukd1 commented 11 months ago

FYI, I switched the filesystem for /chia/fast from ZFS to XFS and it hasn't crashed since. I left /chia/plots1 on ZFS.

./bladebit -t 24 -f xxx -c xxx diskplot --cache 4G -t1 /chia/fast/ /chia/plots1/ 

Also, this is on:

jmhands commented 11 months ago

OK, thanks, closing. ZFS is not recommended for ephemeral storage use cases like plotting; XFS is a much better choice.
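
Since the fix here came down to which filesystem backs the temp directory, it can help to check that before starting a plot. The following is a minimal, Linux-only sketch that looks up the filesystem type for a path from text in the `/proc/mounts` format (device, mount point, type, options, ...). `fs_type_for` is a hypothetical helper, not something bladebit provides, and it does not handle mount points containing escaped spaces (`\040`).

```python
import os
from typing import Optional


def fs_type_for(path: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is expected in /proc/mounts format. The longest matching
    mount point wins; on a tie the later entry wins, since a later mount
    shadows an earlier one at the same mount point.
    """
    path = os.path.abspath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare whole path components so /chia/fast does not match /chia/faster.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) >= best_len:
                best_len, best_type = len(mount_point), fs_type
    return best_type
```

On a Linux host, passing the contents of `/proc/mounts` together with the temp path from the command in this thread (`/chia/fast/`) would report the filesystem type in use, for example `zfs` or `xfs`.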