Bodigrim / linear-builder

Strict Text and ByteString builder, which hides mutable buffer behind linear types and takes amortized linear time.
https://hackage.haskell.org/package/text-builder-linear
BSD 3-Clause "New" or "Revised" License
88 stars 4 forks source link

Add bytestring -> hex ASCII builder #23

Closed raehik closed 1 month ago

raehik commented 1 month ago

i.e. appendHex{Lower,Upper}ByteString :: ByteString -> Buffer %1 -> Buffer. This is useful for printing such data as hash digests. (I want it for pretty-printing raw bytes in a JSON.) There is already some relevant code for printing bounded integrals using their hex representation.

I have already written this, for the most part. But it's fairly long (I chunk IO accesses for performance), and has inconvenient code (requires unaligned reads/writes, uses index# :: ... primops I don't well understand, needs a more general ST-based withForeignPtr :: ForeignPtr a -> (Ptr a -> ST s b) -> ST s b). Is it the sort of thing you would want in this core library?

raehik commented 1 month ago

Work-in-progress code here: https://github.com/raehik/binrep/pull/11/files#diff-d7b5833644a1ec4b07de79c5e2e2f9416745d85e44c82493e8c3d229fe7271edR25

My printer adds spaces between bytes, which complicates chunking. A non-spaced version would be shorter and trivial to write.

raehik commented 1 month ago

Assuming this would theoretically be accepted, I think the unaligned primops are the main blocker. I'm using my compat layer that shims the unaligned primops to older GHCs without doing anything (so pre-GHC 9.10, it's only sound on platforms which permit unaligned accesses). If we access byte-by-byte we don't need them, but then this would be adding little value.

Bodigrim commented 1 month ago

Is it the sort of thing you would want in this core library?

Sorry, I have limited bandwidth at the moment. I'm not sure; if we add base16 to the core library, one could reasonably ask for base32 / base62 / base64 / base85, at which point it will get very unwieldy. I'm certainly open to expose / design / write more helpers to make life easier for third-party libraries though.

raehik commented 1 month ago

if we add base16 to the core library, one could reasonably ask for base32 / base62 / base64 / base85, at which point it will get very unwieldy

That's true. Just this would be a fairly large addition. I'll close this.

One thing that would be very useful: I want to know if my ST withForeignPtr is sound, and sound to use with ByteStrings:

withForeignPtr :: forall a b s. ForeignPtr a -> (Ptr a -> ST s b) -> ST s b
withForeignPtr fo@(ForeignPtr _ r) f = ST $ \s ->
  case f (unsafeForeignPtrToPtr fo) of
    ST action# -> keepAlive# r s action#

Might you know, from a cursory glance? It appears required for efficiently consuming bytestrings using the existing helpers (appendExact in my case). If it is sound, it should probably be in base.

Bodigrim commented 1 month ago

I'm afraid I don't have a good intuition about unsafeForeignPtrToPtr and keepAlive#. Is https://www.haskell.org/ghc/blog/20210607-the-keepAlive-story.html of any help?

@bgamari @clyring could you possible opine?

bgamari commented 1 month ago

@raehik, your implementation looks good to me. In general keepAlive# is sound (that is, keeps its first argument alive until the end of the continuation given as its last argument). It is the older touch# primop that was one needs to be extremely careful with; I would avoid the latter unless absolutely necessary.

raehik commented 1 month ago

Thank you @bgamari ! I will make a PR on bytestring for withForeignPtrST (edit: or perhaps it's a GHC change)