Bodigrim / linear-builder

Strict Text and ByteString builder, which hides mutable buffer behind linear types and takes amortized linear time.
https://hackage.haskell.org/package/text-builder-linear
BSD 3-Clause "New" or "Revised" License

Add an API for the ByteString builder #22

Open wismill opened 4 months ago

wismill commented 4 months ago

I just realized that this package uses a ByteString builder to implement (|>%) and (%<|).

I did play a bit with them in #20, but felt uncomfortable with unsafePerformIO, etc. I then tried Data.ByteString.Lazy.foldlChunks, which obviously would not lead to great perf, given that we need to allocate intermediate chunks and then copy them.

Should we provide an API for the ByteString builder? It would unlock quite a lot of features.

Bodigrim commented 4 months ago

What kind of API are we talking about?

wismill commented 4 months ago

The simplest. Something like:

prependByteStringBuilder :: Int -> BB.Builder -> Buffer %1 -> Buffer
appendByteStringBuilder :: Buffer %1 -> Int -> BB.Builder -> Buffer

where the Int is the max size. Other possibilities:


Reading the Double code more carefully, I realized we still allocate a buffer with BBI.newBuffer. If I understood correctly, there is no way to avoid this safely because of the GC, unless our Buffer uses a pinned array.

Bodigrim commented 4 months ago

I'm not sure that tying ourselves that closely to the internal details of BB.Builder is a good idea. They can change, and they do not quite spell out how to use them safely. Unless you are micro-optimizing and cutting constant overhead (which is the case for the Double builder), you can just run the BB.Builder to get a ByteString and append it to our builder with fromAddr. For builders doing any significant amount of work the overhead would be negligible.

wismill commented 4 months ago

You mean something like:

appendBSBuilder :: BB.Builder -> Buffer %1 -> Buffer
appendBSBuilder builder = unsafeDupablePerformIO
  (B.unsafeUseAsCString
    (BL.toStrict (BB.toLazyByteString builder))
    (\(Ptr addr) -> pure (|># addr)))

appendBSBuilder' :: Buffer %1 -> BB.Builder -> Buffer
appendBSBuilder' buf builder = foldlIntoBuffer
  (\b bs -> unsafeDupablePerformIO (B.unsafeUseAsCString bs (\(Ptr addr) -> pure (|># addr))) b) 
  buf
  (BL.toChunks (BB.toLazyByteString builder))

This does not look trivial. If we do not provide an API, it would be a good idea to document the best method to implement it.

Bodigrim commented 4 months ago

Yeah, something like this. Documenting it, or maybe even providing helpers similar to the ones you just wrote, would be nice.
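
For comparison with the helpers sketched above, here is a minimal variant that avoids raw addresses and unsafeDupablePerformIO entirely by going through Text. It is only a sketch under stated assumptions: appendBSBuilderViaText is a hypothetical name, not part of the package; the builder's output is assumed to be valid UTF-8 (decodeUtf8 throws otherwise); and it pays for one extra copy when decoding. It may still be a reasonable baseline to document, since as far as the package docs indicate (|>#) expects a NUL-terminated UTF-8 string, which the raw bytes of a ByteString are not guaranteed to be.

```haskell
{-# LANGUAGE LinearTypes #-}

import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy as BL
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Builder.Linear.Buffer (Buffer, runBuffer, (|>))

-- Hypothetical helper, not provided by the package.
-- Runs the ByteString builder to a strict ByteString, decodes it as UTF-8
-- (throwing on invalid input), and appends the resulting Text to the Buffer.
appendBSBuilderViaText :: Buffer %1 -> BB.Builder -> Buffer
appendBSBuilderViaText buf builder =
  buf |> TE.decodeUtf8 (BL.toStrict (BB.toLazyByteString builder))

-- Usage: append a Text prefix, then the output of a ByteString builder.
example :: Text
example =
  runBuffer (\b -> appendBSBuilderViaText (b |> T.pack "n = ") (BB.intDec 42))
```

The buffer is consumed exactly once and the builder is an ordinary unrestricted argument, so no unsafe primitives are needed; the cost is the intermediate ByteString and Text allocations, which should be negligible for builders doing any significant amount of work.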