Chia-Network / bladebit

A high-performance k32-only, Chia (XCH) plotter supporting in-RAM and disk-based plotting
Apache License 2.0

128GB mode Failed to Write Slice Error win11 only occurs on SSDs #443

Open BrandtH22 opened 7 months ago

BrandtH22 commented 7 months ago

This issue was reported by Delerium in discord: https://discord.com/channels/1034523881404370984/1102690350218354920/1179030557271793715

Hi guys, I'm using Bladebit 3.1 on Win 11 with 128GB RAM and an 8GB Nvidia RTX 2080, and regardless of settings I keep getting "Failed to write slice on F://p1unsortedx-p11pairs-3lp-p3-lmap.tmp error 0". The temp drive is a local 1TB SSD. What am I doing wrong?

Command used: bladebit_cuda -f xxxx -c xxxx -n 1 --compress 2 cudaplot --disk-128 -t1 F:/ F:/

Notes:

Full CLI of failed run:

Bladebit Chia Plotter
Version      : 3.1.0
Git Commit   : e9836f8bd963321457bc86eb5d61344bfb76dcf0
Compiled With: msvc 19.29.30152

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : f
 Pool contract address : f
 Compression Level     : 2
 Benchmark mode        : disabled

[Bladebit CUDA Plotter]
 Host RAM            : 127 GiB
 Plot checks         : disabled

Selected cuda device 0 : NVIDIA GeForce RTX 2080
 CUDA Compute Capability   : 7.5
 SM count                  : 46
 Max blocks per SM         : 16
 Max threads per SM        : 1024
 Async Engine Count        : 2
 L2 cache size             : 4.00 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 8.00 GB
  Free                     : 6.96 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 92412135120  bytes ( 88131.08  MiB or 86.07  GiB )
Intermediate RAM required : 4385218560   bytes ( 4182.07   MiB or 4.08   GiB )
Host RAM required         : 28420603904  bytes ( 27104.00  MiB or 26.47  GiB )
Total Host RAM required   : 120832739024 bytes ( 115235.08 MiB or 112.53 GiB )
GPU RAM required          : 6167756800   bytes ( 5882.03   MiB or 5.74   GiB )
Allocating buffers...
Done.

Generating plot 1 / 1: fbbb9cf468011ec5123479b0742f2dea31874c57a4f72d074a17b6b4ddc1be5d
Plot temporary file: F:/plotdone/plot-k32-c02-2023-11-28-20-36-fbbb9cf468011ec5123479b0742f2dea31874c57a4f72d074a17b6b4ddc1be5d.plot.tmp

Generating F1
Finished F1 in 12.17 seconds.
Table 2 completed in 37.99 seconds with 4294967296 entries.

Fatal Error:
Failed to write slice on 'F://p1unsortedx-p1lpairs-p3lp-p3-lmap.tmp' with error 0.

Full CLI of completed run using an HDD (attached): chialog.txt
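
As a quick sanity check on the memory figures in the failed-run log above, the following minimal Python sketch recomputes the totals from the byte counts that bladebit printed. The variable and function names are illustrative only and do not come from bladebit. It only re-derives numbers already present in the log, under the assumption that "Total Host RAM required" is the sum of the "Kernel RAM" and "Host RAM" lines.

```python
# Byte counts copied from the "Allocating buffers" section of the log.
KERNEL_RAM_BYTES = 92_412_135_120
HOST_RAM_BYTES = 28_420_603_904
TOTAL_HOST_RAM_BYTES = 120_832_739_024

GIB = 1024 ** 3  # bytes per GiB


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB, rounded to two decimals like the log."""
    return round(n_bytes / GIB, 2)


# Assumption being checked: total = kernel + host.
computed_total = KERNEL_RAM_BYTES + HOST_RAM_BYTES

print("kernel GiB :", to_gib(KERNEL_RAM_BYTES))
print("host GiB   :", to_gib(HOST_RAM_BYTES))
print("total GiB  :", to_gib(computed_total))
print("matches log:", computed_total == TOTAL_HOST_RAM_BYTES)
```

The sum matches the logged total, which is about 112.53 GiB against the 127 GiB of host RAM that bladebit detected. That alone does not rule out memory pressure from other processes, but the buffer allocation step did report "Done." before the failure.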

GetStreamlined commented 7 months ago

I'm actively following this, @harold-b, so if you need me to test alternative settings or experimental releases to find the root cause, please just reach out. (I am the original reporter of the issue, Delerium on Discord.)

harold-b commented 7 months ago

Thank you, @GetStreamlined. Do you get the same issue with the SSD if you use --no-direct-io?

It's a global option, so it should come somewhere before cudaplot.

GetStreamlined commented 7 months ago

@harold-b Sadly, same issue. Command used:

bladebit_cuda -f redacted -c redacted -n 1 --compress 5 --no-direct-io cudaplot --disk-128 -t1 F:/ F:/

Output:

Bladebit Chia Plotter
Version      : 3.1.0
Git Commit   : e9836f8bd963321457bc86eb5d61344bfb76dcf0
Compiled With: msvc 19.29.30152

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : redacted
 Pool contract address : redacted
 Compression Level     : 5
 Benchmark mode        : disabled

[Bladebit CUDA Plotter]
 Host RAM            : 127 GiB
 Plot checks         : disabled

Selected cuda device 0 : NVIDIA GeForce RTX 2080
 CUDA Compute Capability   : 7.5
 SM count                  : 46
 Max blocks per SM         : 16
 Max threads per SM        : 1024
 Async Engine Count        : 2
 L2 cache size             : 4.00 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 8.00 GB
  Free                     : 6.96 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 92412135120  bytes ( 88131.08  MiB or 86.07  GiB )
Intermediate RAM required : 4385218560   bytes ( 4182.07   MiB or 4.08   GiB )
Host RAM required         : 28420603904  bytes ( 27104.00  MiB or 26.47  GiB )
Total Host RAM required   : 120832739024 bytes ( 115235.08 MiB or 112.53 GiB )
GPU RAM required          : 6167756800   bytes ( 5882.03   MiB or 5.74   GiB )
Allocating buffers...
Done.

Generating plot 1 / 1: bf98e067348b10a1c3e431deea13573e25606eb1a3a5404ac45cfcf004c1b101
Plot temporary file: F:/plot-k32-c05-2023-11-30-23-10-bf98e067348b10a1c3e431deea13573e25606eb1a3a5404ac45cfcf004c1b101.plot.tmp

Generating F1
Finished F1 in 12.39 seconds.
Table 2 completed in 36.91 seconds with 4294967296 entries.

Fatal Error:
Failed to write slice on 'F://p1unsortedx-p1lpairs-p3lp-p3-lmap.tmp' with error 0.

GetStreamlined commented 7 months ago

Also conducted an iotest:

C:\Chia\Chia_Plotting\Plotting>bladebit_cuda iotest F:/
Size   : 4096.00 MiB
Cache  : 0.00 MiB
Threads: 1
Passes : 1
Performing test with file F:/
Allocating buffer...

Writing...
Wrote 4096.00 MiB in 2.03 seconds @ 2016.74 MiB/s (1.97 GiB/s) or 2115 MB/s (2.11 GB/s).

Reading...
Read 4096.00 MiB in 1.49 seconds @ 2758.25 MiB/s (2.69 GiB/s) or 2892 MB/s (2.89 GB/s)
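
The iotest output mixes binary (MiB, GiB) and decimal (MB, GB) units. A minimal Python sketch of the conversions, using only the rates printed above, is shown here. The function names are illustrative and are not part of bladebit.

```python
MIB = 1024 ** 2  # bytes per MiB


def mib_s_to_mb_s(rate_mib_s: float) -> float:
    """Convert MiB/s (binary) to MB/s (decimal, 10^6 bytes)."""
    return rate_mib_s * MIB / 1_000_000


def mib_s_to_gib_s(rate_mib_s: float) -> float:
    """Convert MiB/s to GiB/s."""
    return rate_mib_s / 1024


# Rates copied from the iotest output.
write_mib_s = 2016.74
read_mib_s = 2758.25

for label, rate in (("write", write_mib_s), ("read", read_mib_s)):
    print(label,
          round(mib_s_to_gib_s(rate), 2), "GiB/s,",
          round(mib_s_to_mb_s(rate)), "MB/s")
```

The converted values agree with the figures bladebit printed, so the SSD sustains roughly 2 GB/s sequential writes in this test. The failure is therefore unlikely to be a simple throughput problem.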

GetStreamlined commented 7 months ago

I also switched the video card to an NVIDIA GeForce GTX 1660 Ti to do more troubleshooting. Sadly, same output.

Bladebit Chia Plotter
Version      : 3.1.0
Git Commit   : e9836f8bd963321457bc86eb5d61344bfb76dcf0
Compiled With: msvc 19.29.30152

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : xxxx
 Pool contract address : xxxx
 Compression Level     : 5
 Benchmark mode        : disabled

[Bladebit CUDA Plotter]
 Host RAM            : 127 GiB
 Plot checks         : disabled

Selected cuda device 0 : NVIDIA GeForce GTX 1660 Ti
 CUDA Compute Capability   : 7.5
 SM count                  : 24
 Max blocks per SM         : 16
 Max threads per SM        : 1024
 Async Engine Count        : 2
 L2 cache size             : 1.50 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 6.00 GB
  Free                     : 5.02 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 92412135120  bytes ( 88131.08  MiB or 86.07  GiB )
Intermediate RAM required : 4385218560   bytes ( 4182.07   MiB or 4.08   GiB )
Host RAM required         : 28420603904  bytes ( 27104.00  MiB or 26.47  GiB )
Total Host RAM required   : 120832739024 bytes ( 115235.08 MiB or 112.53 GiB )
GPU RAM required          : 6167756800   bytes ( 5882.03   MiB or 5.74   GiB )
Allocating buffers...
Done.

Generating plot 1 / 1: e49016c42914b4a4f527bdd2abaf6817e7f344acce768e4cd0a09e257c4c3ae0
Plot temporary file: F:/plot-k32-c05-2023-12-01-16-59-e49016c42914b4a4f527bdd2abaf6817e7f344acce768e4cd0a09e257c4c3ae0.plot.tmp

Generating F1
Finished F1 in 13.62 seconds.
Table 2 completed in 79.00 seconds with 4294938662 entries.

Fatal Error:
Failed to write slice on 'F://p1unsortedx-p1lpairs-p3lp-p3-lmap.tmp' with error 0.

teamwest93 commented 7 months ago

You run Terminal as Admin?

GetStreamlined commented 7 months ago

> You run Terminal as Admin?

I did, yes, and also tried without.

GetStreamlined commented 7 months ago

Additionally, I tried in PowerShell (with and without administrator). Same issue.

teamwest93 commented 7 months ago

What about beta1 or rc1 versions?

GetStreamlined commented 7 months ago

> What about beta1 or rc1 versions?

Sadly, they give a slightly different error (Failed to open plot file with error: 3).

GetStreamlined commented 6 months ago

@harold-b, is there any update on this issue? I'm keen to get plotting, as I don't want to go to Gigahorse.

harold-b commented 6 months ago

I wonder if this is related to block size. Would you mind running diskplot on those target SSDs to see what block size bladebit reports? You don't have to make a plot; it should just report the block size for the temp directories.

GetStreamlined commented 6 months ago

@harold-b

Here is the result of the diskplot using the SSD:

[Bladebit Disk Plotter]
 Heap size      : 3.37 GiB ( 3452.88 MiB )
 Cache size     : 0.00 GiB ( 0.00 MiB )
 Bucket count   : 256
 Alternating I/O: false
 F1  threads    : 16
 FP  threads    : 16
 C   threads    : 16
 P2  threads    : 16
 P3  threads    : 16
 I/O threads    : 1
 Temp1 block sz : 16384
 Temp2 block sz : 16384
 Temp1 path     : F:/
 Temp2 path     : F:/
 I/O metrices enabled.
 Allocating memory

If I use the HDD instead, it's different:

Temp1 block sz : 4096

harold-b commented 6 months ago

Thanks for the info! So it does look like it is block-size related. As a workaround for the time being, you can try resetting the SSDs with a 4K block size while this is resolved.
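
To illustrate why a 16384-byte block size could behave differently from a 4096-byte one, here is a minimal Python sketch of block alignment. It is not bladebit's code and not a confirmed diagnosis of this bug. It only shows the general point that a write size which is a whole number of 4096-byte blocks is not necessarily a whole number of 16384-byte blocks. The function names and the example size are hypothetical.

```python
def is_aligned(size: int, block_size: int) -> bool:
    """True if size is a whole multiple of block_size."""
    return size % block_size == 0


def round_up_to_block(size: int, block_size: int) -> int:
    """Round size up to the next multiple of block_size."""
    return -(-size // block_size) * block_size


# Block sizes reported by diskplot in this thread.
HDD_BLOCK = 4096
SSD_BLOCK = 16384

# Hypothetical write size: five 4096-byte blocks.
write_size = 5 * HDD_BLOCK

print(is_aligned(write_size, HDD_BLOCK))         # aligned for the HDD
print(is_aligned(write_size, SSD_BLOCK))         # not aligned for the SSD
print(round_up_to_block(write_size, SSD_BLOCK))  # padded size for the SSD
```

Code that sizes or pads its buffers for one block size may therefore issue writes that are misaligned for a larger one, which is one plausible way a block-size dependence like the one observed here could arise.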

GetStreamlined commented 6 months ago

> Thanks for the info! So it does look like it is block-size related. As a workaround for the time being, you can try resetting the SSDs with a 4K block size while this is resolved.

From my research on Samsung Pro EVO SSDs, you cannot change the block size, so it looks like I'm stuck waiting for a resolution :(

GetStreamlined commented 6 months ago

Hi @harold-b - hope you had a lovely Xmas and New Year. Do you have a rough timescale of when this will be resolved please?

James

haorldbchi commented 6 months ago

I've started work on bladebit stuff this week. I don't have a timeframe, but hopefully this one won't take long since we know exactly where the issue lies. I certainly haven't forgotten about you.

GetStreamlined commented 6 months ago

@harold-b Fantastic! If you need me to test a beta release, let me know :)

sonosergio commented 3 months ago

I have the same problem ...

GetStreamlined commented 3 months ago

> I have the same problem ...

I gave up waiting, so I tried Gigahorse. No problem there.

piotr-nowicki commented 2 months ago

Same here. Probably it's better to switch to something else.