Chia-Network / bladebit

A high-performance k32-only, Chia (XCH) plotter supporting in-RAM and disk-based plotting
Apache License 2.0

Cuda plotter issue - Tesla P4 #292

Closed BlackLabelActual closed 1 year ago

BlackLabelActual commented 1 year ago

I'm using the prebuilt tar provided. Perhaps that's where the issue is, but for some reason I cannot build from git and still run CUDA.

Generating F1
Finished F1 in 5.38 seconds.
Table 2 completed in 21.08 seconds with 4294931377 entries.
Table 3 completed in 30.39 seconds with 4294853382 entries.
Table 4 completed in 34.75 seconds with 4294723661 entries.
Table 5 completed in 34.71 seconds with 4294585294 entries.
Table 6 completed in 33.90 seconds with 4294240281 entries.
Table 7 completed in 32.22 seconds with 4293576938 entries.
Finalizing Table 7
Finalized Table 7 in 15.19 seconds.
Completed Phase 1 in 207.63 seconds
Marked Table 6 in 17.54 seconds.
Marked Table 5 in 15.00 seconds.
Marked Table 4 in 14.27 seconds.
Marked Table 3 in 14.01 seconds.
Marked Table 2 in 13.90 seconds.
Completed Phase 2 in 74.73 seconds
Compressing Table 1 and 2...
 Step 1 completed step in 9.00 seconds.

*** Panic!!! *** Fatal Error:  
Failed to write to plot with error 27:
./bladebit_cuda(+0xcf8cb)[0x5624630388cb]
./bladebit_cuda(+0xcf0af)[0x5624630380af]
./bladebit_cuda(+0xbdb5e)[0x562463026b5e]
./bladebit_cuda(+0xbe510)[0x562463027510]
./bladebit_cuda(+0xd062d)[0x56246303962d]
/lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7fbffc694b43]
/lib/x86_64-linux-gnu/libc.so.6(+0x126a00)[0x7fbffc726a00]
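
For readers trying to interpret the "error 27" in the panic message above: if that number is a standard errno value (an assumption; the log does not say so explicitly), it can be decoded with Python's standard library. This is only a small lookup sketch, not part of bladebit.

```python
import errno
import os

# Value copied from the "Failed to write to plot with error 27" line.
# Treating it as a POSIX errno is an assumption, not something the log confirms.
code = 27

# errno.errorcode maps numeric errno values to their symbolic names.
name = errno.errorcode.get(code, "unknown")

# os.strerror returns the platform's human-readable message for the code.
message = os.strerror(code)

print(f"errno {code}: {name} ({message})")
```

On Linux, errno 27 is EFBIG. That would only be a hint about the failing write, not a diagnosis of the underlying cause.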

CharlieTemplar commented 1 year ago

Going by the syslog, I think this issue might have caused, or been caused by, an unhandled ECC memory error. These P4s have 8 GB of VRAM, but "usable FB size is reduced due to ECC". I hope this can be fixed; this is a great little card.
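
To see whether ECC is enabled and how much framebuffer memory the driver reports, one option is to query `nvidia-smi` from a script. The sketch below assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` fields `name`, `ecc.mode.current`, and `memory.total`. The helper names `parse_gpu_line` and `query_gpus` are hypothetical, made up for this illustration.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in the order the parser expects them.
QUERY = "name,ecc.mode.current,memory.total"


def parse_gpu_line(line):
    """Split one CSV row of nvidia-smi output into a small dict."""
    name, ecc_mode, memory_total = [field.strip() for field in line.split(",")]
    return {"name": name, "ecc": ecc_mode, "memory_total": memory_total}


def query_gpus():
    """Return one dict per GPU, or an empty list if the query cannot run."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver tools installed on this machine
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []  # nvidia-smi is present but failed, e.g. no GPU visible
    return [parse_gpu_line(row) for row in result.stdout.splitlines() if row.strip()]


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

This only reports the current state. It does not tell you whether ECC is related to the crash above.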

harold-b commented 1 year ago

Potentially a duplicate of #278. Leaving this open for now so we can validate whether it is related to the UploadArray bug.

CharlieTemplar commented 1 year ago

The P4 is looking great with cudaplot right now: 6 minutes of plot time, plus some delay at the beginning and end of each plot, so total time is still around 10 minutes. I just have to work out how to offload the plots to a farmer. Thanks for fixing.

harold-b commented 1 year ago

Closing, as it appears the fix was successful, as confirmed above.