Open raehik opened 1 year ago
That's a great idea! I'll be happy to merge it. Do include the notes about your contribution in the PR as well. Also please provide some test coverage.
Awesome :) It will be a fairly big changeset that touches lots of the poking code. I might need some help confirming that certain bits are safe/sensible. I'll start a PR later.
After a discussion with merijn on #haskell IRC, I want to confirm the performance changes I saw, since I might have been using an older version of ptr-poker and I use a potentially faster bytestring serializer (which could be used here too, if safe). Apparently it's a little surprising that unboxing Ptrs and IO made such a difference.
I need more benchmarks to figure out what's going on. By replacing withForeignPtr with unsafeWithForeignPtr for serializing bytestrings, all the https://github.com/haskell-perf/strict-bytestring-builders benchmarks improve tremendously. In context:
poke :: ByteString -> Ptr Word8 -> IO (Ptr Word8)
poke (BS fptr length) ptr =
  {-# SCC "poke" #-}
  unsafeWithForeignPtr fptr $ \ bytesPtr ->
    memcpy ptr bytesPtr length $>
      plusPtr ptr length
fumieval's mason uses it here (implementation copied from GHC for compat) in the same way, exclusively for bytestring serialization.
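For anyone following along, here is a minimal sketch of what such a compat shim plus the bytestring poke could look like. It is not mason's or ptr-poker's actual code. The names unsafeWithForeignPtrCompat, pokeByteString and roundTrip are made up for illustration, and the shim body follows what I understand base's own unsafeWithForeignPtr to do (run the action on the raw pointer, then touch the ForeignPtr). It is only sound when the callback cannot diverge, which holds for a plain memory copy.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.ByteString.Internal (toForeignPtr, unsafeCreate)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, touchForeignPtr)
import Foreign.ForeignPtr.Unsafe (unsafeForeignPtrToPtr)
import Foreign.Marshal.Utils (copyBytes)
import Foreign.Ptr (Ptr, plusPtr)

-- Hypothetical compat shim (name is an assumption): run the action on
-- the raw pointer, then touch the ForeignPtr so it stays alive until
-- the action has finished. Unsound if the action can loop forever.
unsafeWithForeignPtrCompat :: ForeignPtr a -> (Ptr a -> IO b) -> IO b
unsafeWithForeignPtrCompat fptr action = do
  r <- action (unsafeForeignPtrToPtr fptr)
  touchForeignPtr fptr
  return r

-- Copy the bytes of a ByteString to dst and return the pointer just
-- past the written bytes. toForeignPtr is used so the sketch does not
-- depend on a particular internal constructor of ByteString.
pokeByteString :: ByteString -> Ptr Word8 -> IO (Ptr Word8)
pokeByteString bs dst =
  let (fptr, off, len) = toForeignPtr bs
  in unsafeWithForeignPtrCompat fptr $ \src -> do
       copyBytes dst (src `plusPtr` off) len
       return (dst `plusPtr` len)

-- Small driver: serialize a ByteString into a fresh buffer of the
-- same length.
roundTrip :: ByteString -> ByteString
roundTrip bs =
  unsafeCreate (B.length bs) $ \p -> pokeByteString bs p >> return ()
```

Swapping unsafeWithForeignPtrCompat for withForeignPtr in pokeByteString gives the safe variant, which makes it easy to benchmark the two side by side.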
However, swapping to withForeignPtr in my library and running a generics-based benchmark changes absolutely nothing. So I'm currently unsure where the performance improvement is being introduced.
Benchmarking the serialization of ~5 kB of bytestrings with my lib vs. the latest commit on this repo still shows my lib as faster. The only things that should be happening there are bytestring serializing (identical) and generics, where the code is identical, so the difference would appear to come from the semigroup instance.
So it would appear this is worthwhile. It's a shame the benchmarks are so all over the place.
Edit: Wait, no! This time unsafeWithForeignPtr did give a massive improvement! OK, so it's dependent on GHC's mood (maybe it can optimize better when unboxed). Now the improvement is only 10% with unboxed. Sorry for spam. I'll make a PR for unsafeWithForeignPtr and go from there.
Partial followup to https://github.com/haskell-perf/strict-bytestring-builders/pull/6 .
ptr-poker uses the following type for low-level pokes:
I believe unboxing would improve performance:
Ptrs are data (boxed), so this should remove some indirection. We have to unbox IO because it's not levity-polymorphic.
This representation gave me consistently better performance both over in https://github.com/haskell-perf/strict-bytestring-builders and in a less synthetic benchmark here, serializing lots of Word8s and ByteStrings. Main code is currently here. (Some tests in that repo also assert basic soundness.)
What do you think? If you found it appealing I would gladly merge my code in here. Otherwise I'd like to publish my lib on Hackage with attribution for the idea.
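To make the unboxing idea concrete, here is a rough sketch of the two shapes side by side. Treat it as an illustration under assumptions rather than the code from either library. BoxedPoke only mirrors the Ptr Word8 -> IO (Ptr Word8) signature quoted earlier in this thread. UnboxedPoke, fromBoxed, word8 and runPoke are hypothetical names. The unboxed variant passes the raw Addr# and the State# token directly, which is what IO wraps internally.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import qualified Data.ByteString as B
import Data.ByteString.Internal (unsafeCreate)
import Data.Word (Word8)
import Foreign.Storable (poke)
import GHC.Exts (Addr#, RealWorld, State#)
import GHC.IO (IO (..))
import GHC.Ptr (Ptr (..), plusPtr)

-- Boxed shape: takes a destination pointer, returns the advanced one.
newtype BoxedPoke = BoxedPoke (Ptr Word8 -> IO (Ptr Word8))

-- Hypothetical unboxed shape: the Ptr box and the IO newtype are both
-- replaced by their underlying primitive representations.
newtype UnboxedPoke = UnboxedPoke
  (Addr# -> State# RealWorld -> (# State# RealWorld, Addr# #))

-- Sequencing threads the advanced address and the state token through.
instance Semigroup UnboxedPoke where
  UnboxedPoke f <> UnboxedPoke g = UnboxedPoke (\addr s0 ->
    case f addr s0 of
      (# s1, addr' #) -> g addr' s1)

instance Monoid UnboxedPoke where
  mempty = UnboxedPoke (\addr s -> (# s, addr #))

-- Lift a boxed poke by unwrapping IO and Ptr at the boundary.
fromBoxed :: BoxedPoke -> UnboxedPoke
fromBoxed (BoxedPoke f) = UnboxedPoke (\addr s0 ->
  case f (Ptr addr) of
    IO io -> case io s0 of
      (# s1, Ptr addr' #) -> (# s1, addr' #))

-- Write a single byte and advance by one.
word8 :: Word8 -> UnboxedPoke
word8 w = fromBoxed (BoxedPoke (\p -> poke p w >> return (p `plusPtr` 1)))

-- Run a poke into a fresh buffer. The caller must pass the exact
-- number of bytes the poke writes; nothing here checks it.
runPoke :: Int -> UnboxedPoke -> B.ByteString
runPoke len (UnboxedPoke f) =
  unsafeCreate len (\(Ptr addr) -> IO (\s0 ->
    case f addr s0 of
      (# s1, _ #) -> (# s1, () #)))
```

For example, runPoke 3 (word8 1 <> word8 2 <> word8 3) builds a three-byte ByteString. Whether the unboxed shape is actually faster depends on what GHC's optimizer already does with the boxed one, so it needs measuring rather than assuming.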