Open nh2 opened 1 month ago
Thank you Niklas for the details. I agree. I'll make the changes.
On Sep 30, 2024, 20:45, "Niklas Hambüchen" @.***> wrote:
The library uses `unsafe` FFI for the update function: https://github.com/k0001/hs-blake3/blob/12884c59367a288e4952e7806fa08dee640a8653/blake3/lib/BLAKE3/IO.hs#L317-L319

This is rather wrong/unfortunate, because it can make the whole Haskell process freeze and halt all other Haskell threads when it is given a large array (e.g. a few GB) to run on:
The GHC manual explains:

> GHC, since version 8.4, guarantees that garbage collection will never occur during an `unsafe` call.

This means that if you give the (even pure!) `hash` function a multi-GB `ByteString` and it runs for e.g. 2 seconds, and a GC occurs meanwhile, all other Haskell threads will stop doing anything, including e.g. updating progress bars or answering unrelated requests in a web server. It means: do a big hash, and your multi-threaded program fully pauses.
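For reference, the difference is only the call-safety annotation on the foreign import declaration. A self-contained sketch, using libc's `memset` in place of blake3's `c_update` so it links without the blake3 sources (the Haskell-side names here are made up):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Data.Word (Word8)
import Foreign.C.Types (CInt, CSize)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff)

-- `unsafe`: minimal call overhead, but the RTS cannot start a garbage
-- collection while this call runs -- acceptable only if it returns quickly.
foreign import ccall unsafe "string.h memset"
  c_memset_unsafe :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)

-- `safe`: some extra per-call overhead, but GC (and the other Haskell
-- threads) can proceed while the C code runs.
foreign import ccall safe "string.h memset"
  c_memset_safe :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)

main :: IO ()
main = allocaBytes 8 $ \p -> do
  _ <- c_memset_unsafe p 0 8
  _ <- c_memset_safe p 0xFF 8
  b <- peekByteOff p 0 :: IO Word8
  print b
```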
(Yes, `blake3` is really fast, so the string must be big for 2 seconds of hash time, but this same fundamental issue applies to all "pure" `unsafe` FFI calls and I just found it for this package first, so I'm reporting it here; and even if it's only 20 ms, that already destroys any web server's response time.)

Production downtime story
I found this issue by accidentally exaggerating the problem: I gave data to `hash` that I obtained via `mmap`, which made it extra slow because it involved disk I/O. (Of course it is not a problem of this library that I cheated kernel-side I/O into pure code with `mmap`, but the problem still stands with normal pure large arrays.) Hashing a 100 GB file this way "disabled" my web server, causing a half-hour production downtime/outage.
What this library should probably do

- `unsafe` is the correct solution for very short strings.
- If an array given to `hash`/`update` is longer than some large N bytes, it probably should automatically make multiple smaller calls to `c_update` in the high-level API (`hash` and `update`). `hash` already has a convenient interface to allow this, consuming `[bin]`.
- If `hash` isn't changed, and `c_update` is kept `unsafe`, then `hash` + `update` + `c_update` should at least be given Haddocks that point out this pitfall.
- In any case, `c_update` should document this.

Takeaway independent of this library
`unsafe` isn't only forbidden for blocking I/O -- it should also be forbidden for long-running CPU work, because otherwise it will halt all other Haskell threads during the GC sync.
- This is already described in the GHC users guide: Foreign imports and multi-threading
- Turns out I posted about it 2 years ago already: https://old.reddit.com/r/haskell/comments/xlm4qv/haskell_ffi_call_safety_and_garbage_collection/ipko8lg/
Only use `unsafe` for pure CPU-bound FFI computations that take a couple of nanoseconds.
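The chunking suggested in the list above could be sketched like this, independent of this library's actual API (`chunkSize` and the `update` callback are placeholders, not hs-blake3 names):

```haskell
import qualified Data.ByteString as BS

-- Placeholder chunk size; see the discussion below about picking N.
chunkSize :: Int
chunkSize = 16 * 1024

-- Feed a large ByteString to a per-chunk update action in bounded pieces,
-- so each (possibly `unsafe`) FFI call runs only briefly and the RTS gets
-- a chance to garbage-collect between calls.
updateChunked :: (BS.ByteString -> IO ()) -> BS.ByteString -> IO ()
updateChunked update = go
  where
    go bs
      | BS.null bs = pure ()
      | otherwise  = do
          let (chunk, rest) = BS.splitAt chunkSize bs
          update chunk
          go rest
```

Since `BS.splitAt` is O(1) on strict ByteStrings, the chunking itself copies nothing; only the per-call FFI overhead is repeated.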
-- Reply to this email directly or view it on GitHub: https://github.com/k0001/hs-blake3/issues/5
Thanks!
I guess the question of the day is, what size should be given to `c_update`?
I assume that to get SIMD (e.g. AVX) benefits, we can't make it too small. 4 KiB? 32 KiB?
Maybe needs a benchmark (or re-use one from blake3 upstream).
> what size should be given to `c_update`?
The BLAKE3 paper shows it is already maximally effective for single-threaded operation at 16 KiB (on Intel Cascade Lake-SP); see Figure 3 of https://github.com/BLAKE3-team/BLAKE3-specs/blob/ea51a3ac997288bf690ee82ac9cfc8b3e0e60f2a/blake3.pdf
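As a back-of-the-envelope check that 16 KiB chunks keep each individual `unsafe` call short (the throughput figure below is an illustrative assumption, not a number from the paper):

```haskell
-- Duration of one c_update call on a chunk, in microseconds, given an
-- assumed single-threaded hashing throughput in bytes per second.
callMicros :: Double -> Double -> Double
callMicros chunkBytes bytesPerSec = chunkBytes / bytesPerSec * 1e6

-- Even at a conservative 1 GB/s, a 16 KiB chunk costs only ~16 µs per
-- call -- microseconds, not the multi-second pauses described above.
```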
I started an initiative for GHC to facilitate finding long-running `unsafe` calls:

GHC #25333: Add RTS stats and alerts for long-running unsafe foreign function (FFI) calls