Chia-Network / chia-blockchain

Chia blockchain python implementation (full node, farmer, harvester, timelord, and wallet)
Apache License 2.0
10.82k stars 2.03k forks source link

[Bug] GRResult error occuring a couple times a day farming a few PB of C2 compressed plots using an Nvidia P4 GPU - Bladebit #15404

Closed chain-enterprises closed 3 months ago

chain-enterprises commented 1 year ago

What happened?

When the system (ProLiant DL360 Gen9, dual E5-2620 v4, 32 gigs ram, Nvidia P4, 75k C2 plots) hits a high IO load on the same block device as the Chia Full Node DB, shortly after the debug.log in chia will show GRResult not ok. The number of plots, lookup times, all seems fine - but the harvester stops finding proofs until the harvester is restarted. Happens 1-2 times in a 24 hour period on Alpha 4 through Alpha 4.3

Whenever error occurs, block validation time and lookup time consistently increase leading up to the error being thrown.

Reproducible with Nvidia Unix GPU Driver versions 530.30.03, 530.41.03, and 535.43.02

Version

2.0.0b3.dev56

What platform are you using?

Ubuntu 22.04 Linux Kernel 5.15.0-73-generic ProLiant DL360 Gen9, dual E5-2620 v4, 32 gigs ram, Nvidia P4, 75k C2 plots

What ui mode are you using?

CLI

Relevant log output

023-05-29T20:45:32.552 full_node chia.full_node.mempool_manager: WARNING  pre_validate_spendbundle took 2.0414 seconds for xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2023-05-29T20:45:42.620 full_node chia.full_node.mempool_manager: WARNING  add_spendbundle xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx took 10.06 seconds. Cost: 2924758101 (26.589% of max block cost)
2023-05-29T20:45:56.840 full_node chia.full_node.full_node: WARNING  Block validation time: 2.82 seconds, pre_validation time: 2.81 seconds, cost: None header_hash: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx height: 3732042
2023-05-29T20:46:57.239 full_node chia.full_node.full_node: WARNING  Block validation time: 3.34 seconds, pre_validation time: 0.42 seconds, cost: 3165259860, percent full: 28.775% header_hash: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx height: 3732044
2023-05-29T20:49:26.913 full_node chia.full_node.full_node: WARNING  Block validation time: 2.40 seconds, pre_validation time: 0.49 seconds, cost: 2041855544, percent full: 18.562% header_hash: 8d0ce076a3270a0c8c9c8d1f0e73c9b5b884618ee34020d2a4f3ffafa459cfd0 height: 3732055
2023-05-29T20:51:06.259 full_node full_node_server        : WARNING  Banning 89.58.33.71 for 10 seconds
2023-05-29T20:51:06.260 full_node full_node_server        : WARNING  Invalid handshake with peer. Maybe the peer is running old software.
2023-05-29T20:51:27.986 harvester chia.harvester.harvester: ERROR    Exception fetching full proof for /media/chia/hdd23/plot-k32-c02-2023-04-23-someplot.plot. GRResult is not GRResult_OK.
2023-05-29T20:51:28.025 harvester chia.harvester.harvester: ERROR    File: /media/chia/hdd23/someplot.plot Plot ID: someplotID, challenge: 7b5b6f11ec2a86a7298cb55b7db8a016a775efea221104b37905366b49f2e2bd, plot_info: PlotInfo(prover=<chiapos.DiskProver object at 0x7f3544998f30>, pool_public_key=None, pool_contract_puzzle_hash=<bytes32: contractHash>, plot_public_key=<G1Element PlotPubKey>, file_size=92374601728, time_modified=1682261996.8218756)
2023-05-29T20:51:57.482 full_node chia.full_node.full_node: WARNING  Block validation time: 10.23 seconds, pre_validation time: 0.29 seconds, cost: 959315244, percent full: 8.721% header_hash: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx height: 3732059
2023-05-29T20:55:24.640 full_node chia.full_node.full_node: WARNING  Block validation time: 3.18 seconds, pre_validation time: 0.26 seconds, cost: 2282149756, percent full: 20.747% header_hash: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx height: 3732067
2023-05-29T20:56:01.825 wallet wallet_server              : WARNING  Banning 95.54.100.118 for 10 seconds
2023-05-29T20:56:01.827 wallet wallet_server              : ERROR    Exception Invalid version: '1.6.2-sweet', exception Stack: Traceback (most recent call last):
  File "chia/server/server.py", line 483, in start_client
  File "chia/server/ws_connection.py", line 222, in perform_handshake
  File "packaging/version.py", line 198, in __init__
packaging.version.InvalidVersion: Invalid version: '1.6.2-sweet'
Jahorse commented 3 months ago

What new plot format? I don't see anything about it in this repo. Some better communication would be appreciated. Why would a new format fix the problem?