Chia-Network / bladebit

A high-performance k32-only, Chia (XCH) plotter supporting in-RAM and disk-based plotting
Apache License 2.0

cudaErrorMemoryAllocation : out of memory in 16G mode #454

Open rbd3453 opened 5 months ago

rbd3453 commented 5 months ago

I figured I'd play around with the 16G plotting mode, but I can't get it to start. I'm probably getting something basic wrong, because the 16G mode doesn't seem to be triggered at all (the log reports "Total Host RAM required: 213.25 GiB", for example). Thanks for any tips!

./bladebit_cuda -z 1 -f 99... -c xc... cudaplot --disk-16 -t1 /media/user/2250596350593EAB/temp

Bladebit Chia Plotter
Version      : 3.1.0-beta1
Git Commit   : 7e7d52831fe54192a0f5b52ef9d6e60a80010e93
Compiled With: gcc 9.4.0

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : 99...
 Pool contract address : xc...
 Compression Level     : 1
 Benchmark mode        : disabled

[Bladebit CUDA Plotter]
Selected cuda device 0 : NVIDIA GeForce RTX 3070 Ti
 CUDA Compute Capability   : 8.6
 SM count                  : 48
 Max blocks per SM         : 16
 Max threads per SM        : 1536
 Async Engine Count        : 2
 L2 cache size             : 4.00 MB
 L2 persist cache max size : 3.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 7.77 GB
  Free                     : 6.91 GB

Allocating buffers (this may take a few seconds)...
Host Temp @ 81 GiB
Host Tables B @ 0 GiB
Host Tables A @ 132 GiB
Kernel RAM required       : 87241596928  bytes ( 83200.07  MiB or 81.25  GiB )
Intermediate RAM required : 73728        bytes ( 0.07      MiB or 0.00   GiB )
Host RAM required         : 141733920768 bytes ( 135168.00 MiB or 132.00 GiB )
Total Host RAM required   : 228975517696 bytes ( 218368.07 MiB or 213.25 GiB )
GPU RAM required          : 6140243968   bytes ( 5855.79   MiB or 5.72   GiB )
Allocating buffers
CUDA error: 2 (0x2 ) cudaErrorMemoryAllocation : out of memory

*** Panic!!! *** Fatal Error:  
CUDA error cudaErrorMemoryAllocation : out of memory.
./bladebit_cuda(_ZN7SysHost14DumpStackTraceEv+0x5b)[0x55df81ddcacb]
./bladebit_cuda(_Z9PanicExitv+0xf)[0x55df81f6985f]
./bladebit_cuda(+0x7f864)[0x55df81d84864]
./bladebit_cuda(main+0x911)[0x55df81d7f651]
/lib/x86_64-linux-gnu/libc.so.6(+0x28150)[0x7f6ced828150]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x89)[0x7f6ced828209]
./bladebit_cuda(_start+0x2e)[0x55df81d80fbe]

haorldbchi commented 5 months ago

Your command line looks fine, but you may be running an old build: the version string says 3.1.0-beta1. Try the binaries from the releases page of this repo, or from the latest CI run: https://github.com/Chia-Network/bladebit/actions/runs/7146753124
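
As a quick sanity check on the numbers in the log above, here is a minimal Python sketch that redoes the memory arithmetic. The byte counts are copied verbatim from the log; the variable names and the `to_gib` helper are made up for illustration and are not part of bladebit. The reading that the "Kernel RAM" figure already includes the "Intermediate RAM" figure is inferred from the numbers, not confirmed anywhere in this thread.

```python
# Byte counts copied from the "Allocating buffers" section of the log.
# Names below are illustrative only; they are not bladebit identifiers.
GIB = 1024 ** 3
MIB = 1024 ** 2

kernel_ram = 87_241_596_928        # "Kernel RAM required"
intermediate_ram = 73_728          # "Intermediate RAM required"
host_ram = 141_733_920_768         # "Host RAM required"
total_host_ram = 228_975_517_696   # "Total Host RAM required"
gpu_ram = 6_140_243_968            # "GPU RAM required"


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / GIB


# The reported total is exactly kernel + host.
assert kernel_ram + host_ram == total_host_ram

# Inference from the figures: the kernel number is 83200 MiB plus the
# intermediate bytes, so the intermediate buffer looks already counted.
assert kernel_ram - intermediate_ram == 83_200 * MIB

for label, value in [
    ("Kernel RAM", kernel_ram),
    ("Host RAM", host_ram),
    ("Total host RAM", total_host_ram),
    ("GPU RAM", gpu_ram),
]:
    print(f"{label:<15}: {to_gib(value):7.2f} GiB")
```

The total comes out at about 213.25 GiB of host RAM, far more than 16 GiB, which is consistent with the observation that this build is not engaging the 16G mode. The GPU figure (about 5.72 GiB) is below the 6.91 GB the log reports as free.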