fx26980 / subspace-farmer

Support GPU plotting, Achieve an order-of-magnitude performance improvement, Maximizing your plotting potential.

farmer stops plotting without notice #3

Open tgarm opened 5 months ago

tgarm commented 5 months ago

When I run the GPU farmer, it plots for a couple of hours and then stops plotting. The output looks like this:

2024-04-19 09:56:10.766  INFO {farm_index=5}: subspace_farmer::single_disk_farm::plotting: Plotting sector (75.06% complete) sector_index=5244
2024-04-19 09:56:31.250  INFO {farm_index=1}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.64% complete) sector_index=5191
2024-04-19 09:57:05.213  INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.85% complete) sector_index=5206
2024-04-19 09:58:44.113  INFO {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.12% complete) sector_index=5154
2024-04-19 09:59:16.234  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.44% complete) sector_index=5248
2024-04-19 09:59:28.522  INFO {farm_index=4}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.75% complete) sector_index=5270
2024-04-19 09:59:36.255  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.51% complete) sector_index=5253
2024-04-19 09:59:51.538  INFO {farm_index=6}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.60% complete) sector_index=5188
2024-04-19 09:59:57.710  INFO {farm_index=5}: subspace_farmer::single_disk_farm::plotting: Plotting sector (75.08% complete) sector_index=5245
2024-04-19 10:00:26.783  INFO {farm_index=1}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.66% complete) sector_index=5192
2024-04-19 10:00:47.917  INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.87% complete) sector_index=5207
2024-04-19 10:02:22.723  INFO {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.14% complete) sector_index=5155
2024-04-19 10:02:58.650  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.45% complete) sector_index=5249
2024-04-19 10:03:09.045  INFO {farm_index=4}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.76% complete) sector_index=5271
2024-04-19 10:03:14.441  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.52% complete) sector_index=5254
2024-04-19 10:03:24.938  INFO {farm_index=6}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.61% complete) sector_index=5189
2024-04-19 10:03:28.115  INFO {farm_index=5}: subspace_farmer::single_disk_farm::plotting: Plotting sector (75.09% complete) sector_index=5246
2024-04-19 10:03:40.343  INFO {farm_index=1}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.67% complete) sector_index=5193
2024-04-19 10:03:47.090  INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.88% complete) sector_index=5208
2024-04-19 10:04:11.932  INFO {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.15% complete) sector_index=5156
2024-04-19 11:05:48.540  INFO {farm_index=6}: subspace_farmer::reward_signing: Successfully signed reward hash 0x

Plotting continued until 10:04 and then stopped without any notice. Is there any way to debug this?

I'm using an Nvidia RTX 3070 8GB, which should be supported.

fx26980 commented 5 months ago

Can you check CPU, disk, and network usage, and whether any other unusual operations are running?

tgarm commented 5 months ago

CPU usage is pretty low. Disk read throughput is pretty high (over 1 GB/s). Perhaps it's caused by a disk error. I checked dmesg:

[ 8134.116477] nvme 0000:62:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004c address=0x8e770700 flags=0x0020]
[ 8991.410679] nvme 0000:61:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004b address=0x9d68e900 flags=0x0020]
[ 9097.497416] nvme 0000:42:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0x847bbf00 flags=0x0020]
[13089.365029] nvme 0000:61:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004b address=0xb2bdca00 flags=0x0020]
[15650.999147] nvme 0000:62:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004c address=0x5e4dca00 flags=0x0020]

There are entries like these. But plotting keeps going when I use the official farmer.
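To see whether the faults cluster on particular drives, the dmesg lines can be tallied per PCI address. This is only a rough sketch: the function name `count_page_faults` and the regular expression are illustrative assumptions based on the line format pasted above, and a count by itself does not show that the faults cause the plotting pause.

```python
import re
from collections import Counter

# Assumed line format, taken from the dmesg excerpt above:
# "[ 8134.116477] nvme 0000:62:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT ...]"
FAULT_RE = re.compile(r"nvme (\S+): AMD-Vi: Event logged \[IO_PAGE_FAULT")


def count_page_faults(dmesg_lines):
    """Count IO_PAGE_FAULT events per NVMe PCI address (hypothetical helper)."""
    counts = Counter()
    for line in dmesg_lines:
        match = FAULT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


DMESG_SAMPLE = [
    "[ 8134.116477] nvme 0000:62:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004c address=0x8e770700 flags=0x0020]",
    "[ 8991.410679] nvme 0000:61:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004b address=0x9d68e900 flags=0x0020]",
    "[ 9097.497416] nvme 0000:42:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0x847bbf00 flags=0x0020]",
    "[13089.365029] nvme 0000:61:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004b address=0xb2bdca00 flags=0x0020]",
    "[15650.999147] nvme 0000:62:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x004c address=0x5e4dca00 flags=0x0020]",
]

if __name__ == "__main__":
    for device, count in count_page_faults(DMESG_SAMPLE).most_common():
        print(device, count)
```

In the excerpt the faults are spread over three different NVMe devices, so comparing these counts between a run of this farmer and a run of the official farmer may be more informative than the raw numbers.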

tgarm commented 4 months ago

I tested it for a longer time. It seems plotting is not stopped, just paused:

2024-04-19 22:55:24.072  INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.14% complete) sector_index=5256
2024-04-19 22:57:06.501  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.73% complete) sector_index=5298
2024-04-19 22:57:50.751  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (74.79% complete) sector_index=5302
2024-04-19 22:58:09.167  INFO {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.42% complete) sector_index=5204
2024-04-19 22:58:15.147  INFO {farm_index=5}: subspace_farmer::single_disk_farm::plotting: Plotting sector (75.31% complete) sector_index=5294
2024-04-19 22:58:18.447  INFO {farm_index=1}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.93% complete) sector_index=5241
2024-04-19 22:58:21.898  INFO {farm_index=6}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.88% complete) sector_index=5237
2024-04-19 22:58:35.504  INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.16% complete) sector_index=5257
2024-04-19 22:58:35.603  INFO {farm_index=4}: subspace_farmer::single_disk_farm::plotting: Plotting sector (74.02% complete) sector_index=5319
2024-04-19 22:58:54.752  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.74% complete) sector_index=5299
2024-04-19 23:01:28.398  INFO {farm_index=1}: subspace_farmer::reward_signing: Successfully signed reward hash 0xa3c6b269bbabb956747462629e314fbf385d60950d255e64161991fe2aaa4b68
2024-04-19 23:42:18.757  INFO {farm_index=1}: subspace_farmer::reward_signing: Successfully signed reward hash 0xe5ee185750e9d1c7d31b02e5f0f2c5333b88c896f8c2526dea299d6e1e599678
2024-04-19 23:48:18.816  INFO {farm_index=5}: subspace_farmer::reward_signing: Successfully signed reward hash 0x9dd824554238203c0bfca538518aa298ab3793dc5f27e167f8181eb60c8aaa3e
2024-04-20 00:16:56.747  INFO {farm_index=2}: subspace_farmer::reward_signing: Successfully signed reward hash 0x347ae9c0bad44c6cb9f6d8db899d5e90b0272a3e2d8228e94b5e3cc0edcce1e8
2024-04-20 01:53:55.606  INFO {farm_index=3}: subspace_farmer::reward_signing: Successfully signed reward hash 0x7db27ee6c55375c0b4e6cf8d360ed146bfa50b1c1da14e892515bbe8086ca403
2024-04-20 02:25:04.533  INFO {farm_index=6}: subspace_farmer::reward_signing: Successfully signed reward hash 0xd6b347d33b370ef86ee938a3d0fbd9e78beffbd2ccae709625cb25b7d4c48e98
2024-04-20 04:23:40.780  INFO {farm_index=7}: subspace_farmer::reward_signing: Successfully signed reward hash 0x187c61acb1cca5c4e16bb361e70b5d89f8d46e640dd871c1dc7645d15d708ebc
2024-04-20 06:31:01.368  INFO {farm_index=4}: subspace_farmer::reward_signing: Successfully signed reward hash 0x39cf2da7e496c4b4c346f5a0b58004e1e2569663876cb04cee8f22ea4691fc0c
2024-04-20 07:13:30.224  INFO {farm_index=1}: subspace_farmer::reward_signing: Successfully signed reward hash 0x887418da99a300ffb03b08dfc965808d38d13ec5a8bc5de474135218e6203a37
2024-04-20 08:09:21.616  INFO {farm_index=2}: subspace_farmer::reward_signing: Successfully signed reward hash 0x41c9415f1247e7f0b0e23ac16c6fef2fb5b05ba8e433bda7558d84ad66a59948
2024-04-20 08:13:30.973  INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.17% complete) sector_index=5258
2024-04-20 08:13:31.065  INFO {farm_index=5}: subspace_farmer::single_disk_farm::plotting: Plotting sector (75.32% complete) sector_index=5295
2024-04-20 08:13:31.141  INFO {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.43% complete) sector_index=5205
2024-04-20 08:13:31.390  INFO {farm_index=4}: subspace_farmer::single_disk_farm::plotting: Plotting sector (74.03% complete) sector_index=5320
2024-04-20 08:13:31.662  INFO {farm_index=1}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.95% complete) sector_index=5242
2024-04-20 08:13:34.058  INFO {farm_index=6}: subspace_farmer::single_disk_farm::plotting: Plotting sector (72.89% complete) sector_index=5238
2024-04-20 08:13:34.125  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.75% complete) sector_index=5300
2024-04-20 08:13:34.391  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (74.81% complete) sector_index=5303
2024-04-20 08:16:56.223  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (74.82% complete) sector_index=5304

We can see that plotting paused for about 9 hours, from 22:58:54 to 08:13:30, while reward signing continued.
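To find pauses like this in a long log without reading it by eye, the timestamps of consecutive "Plotting sector" lines can be compared. The following is a minimal sketch, assuming the log line format shown above; the function name `find_plotting_gaps` and the 30-minute threshold are hypothetical choices, not part of the farmer.

```python
import re
from datetime import datetime, timedelta

# Timestamp at the start of a farmer log line, e.g. "2024-04-19 22:58:54.752".
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s")


def find_plotting_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (last_before, first_after, gap) for every pause between
    consecutive plotting lines that is longer than `threshold`."""
    gaps = []
    previous = None
    for line in lines:
        # Only plotting progress lines count; reward signing lines are skipped.
        if "Plotting sector" not in line:
            continue
        match = TS_RE.match(line)
        if match is None:
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


# Three lines copied from the log above.
LOG_SAMPLE = [
    "2024-04-19 22:58:54.752  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.74% complete) sector_index=5299",
    "2024-04-20 08:09:21.616  INFO {farm_index=2}: subspace_farmer::reward_signing: Successfully signed reward hash 0x41c9415f1247e7f0b0e23ac16c6fef2fb5b05ba8e433bda7558d84ad66a59948",
    "2024-04-20 08:13:30.973  INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (73.17% complete) sector_index=5258",
]

if __name__ == "__main__":
    for start, end, gap in find_plotting_gaps(LOG_SAMPLE):
        print(f"no plotting between {start} and {end} ({gap})")
```

The gap is measured across all farms together, because in the log above every farm stops and resumes at the same time. Grouping by `farm_index` first would be the variation to try if only some farms stall.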