chadaustin / buffer-builder

Haskell library for efficiently building up buffers
BSD 3-Clause "New" or "Revised" License

Would it be possible to view current buffer usage? #8

Closed iand675 closed 9 years ago

iand675 commented 9 years ago

I've got some code where the buffered value needs to exceed 1MB before being flushed over the network, and I'm currently having to do my own bookkeeping of the size. I'm pretty sure that the builder tracks its utilization under the hood, so would it be realistic to make that data publicly accessible?

chadaustin commented 9 years ago

Hi @iand675, good suggestion. I uploaded buffer-builder 0.2.3.0 to Hackage. It adds the ability to return a non-unit value from a BufferBuilder and to read the current length with "currentLength".

Does that help?

For the details, see https://github.com/chadaustin/buffer-builder/commit/c93802dea07b2d7d50bc3b08d30db9db17bf4312

chadaustin commented 9 years ago

Closing, re-open if this doesn't solve your problem.

iand675 commented 9 years ago

Hi Chad, thanks for being so responsive! I'm not sure this is as useful for my use case as it would be if the current length were queryable from outside the monadic context: something more like currentLength :: Builder a -> Int. For context, here's the snippet of code that I'm working with:

objectConsumer :: MonadIO m => Env -> Text -> Text -> (CreateMultiPartUpload -> CreateMultiPartUpload) -> Consumer ByteString m ()
objectConsumer env bucket key f = do
  resp <- liftIO $ runResourceT $ Network.AWS.send env $
            f $ createMultiPartUpload bucket key
  case resp of
    Left err -> liftIO $ print err
    Right r -> go r (0, return ())
  where
    bufferSize = 4194304 -- 4 megabytes
    go :: CreateMultiPartUploadResponse -> (Int, Builder ()) -> Consumer ByteString m ()
    go mpu (size, buff) = do
      chunk <- Pipes.await
      let newState@(size', buff') = (size + BS.length chunk, buff >> appendBS chunk)
      if size' >= bufferSize
        then do
          liftIO $ runResourceT $ Network.AWS.send env $
            __buildUpload $ runBufferBuilderWithOptions (Options bufferSize False) buff'
          go mpu (0, return ())
        else go mpu newState

So the gist is that I'm receiving chunks of the 4MB object that I'm building up via a streaming abstraction, and I can track the state in a tuple. I don't want to have to run the builder just to get the size, and I can't short-circuit from within the builder since it's not really involved in the control flow of the consumer.
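
The tuple bookkeeping described above can be packaged into a small wrapper so that the size always travels with the builder. The following is only a sketch under stated assumptions: it uses `Data.ByteString.Builder` from the bytestring package as a stand-in rather than buffer-builder itself, and `Sized`, `chunk`, `sizeOf`, `build`, and `step` are hypothetical names introduced here for illustration.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL

-- | A builder paired with the number of bytes appended so far.
-- Hypothetical helper type; not part of buffer-builder.
data Sized = Sized !Int B.Builder

instance Semigroup Sized where
  Sized n a <> Sized m b = Sized (n + m) (a <> b)

instance Monoid Sized where
  mempty = Sized 0 mempty

-- | Wrap a strict chunk, recording its length (BS.length is O(1)).
chunk :: BS.ByteString -> Sized
chunk bs = Sized (BS.length bs) (B.byteString bs)

-- | Bytes appended so far. This does not run the builder.
sizeOf :: Sized -> Int
sizeOf (Sized n _) = n

-- | Materialize the accumulated bytes as a strict ByteString.
build :: Sized -> BS.ByteString
build (Sized _ b) = BL.toStrict (B.toLazyByteString b)

-- | Append one chunk. Returns Left with the built payload once the
-- threshold is reached (the caller flushes it and restarts from mempty),
-- or Right with the updated accumulator otherwise.
step :: Int -> Sized -> BS.ByteString -> Either BS.ByteString Sized
step threshold acc bs
  | sizeOf acc' >= threshold = Left (build acc')
  | otherwise                = Right acc'
  where
    acc' = acc <> chunk bs
```

In a consumer loop like the one above, `step bufferSize acc chunk` would replace the manual tuple update: a `Left` payload is sent and the loop restarts from `mempty`, while a `Right` accumulator is carried into the next iteration.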

iand675 commented 9 years ago

@chadaustin Incidentally, I don't appear to be able to reopen the issue.

chadaustin commented 9 years ago

Hmmm. So, a BufferBuilder value has no state. It is just a recipe for building a buffer from various things. There is no inherently-tracked length until the buffer is actually built with runBufferBuilder.

Sounds like what you want is similar to #7, where @jberryman requested the ability to calculate the length of a BufferBuilder without actually running it. This would be fairly inexpensive, but not as cheap as maintaining your own size as you construct the BufferBuilder object. But I can add the functionality and you can benchmark to see if it's a win if you want. :)
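
To illustrate the "recipe" idea and why counting is cheaper than building but not free, here is a toy model. It is an assumption-laden sketch, not buffer-builder's actual representation (the real BufferBuilder is a monad, not a list): `Step`, `Recipe`, `run`, and `countLength` are hypothetical names. The same recipe is interpreted two ways, once writing bytes and once only summing lengths.

```haskell
import qualified Data.ByteString as BS
import Data.List (foldl')
import Data.Word (Word8)

-- | One instruction in a toy recipe. Hypothetical model only.
data Step
  = AppendBytes BS.ByteString
  | AppendByte Word8

-- | A recipe is just a list of instructions; it holds no output buffer
-- and no running length.
newtype Recipe = Recipe [Step]

instance Semigroup Recipe where
  Recipe a <> Recipe b = Recipe (a ++ b)

instance Monoid Recipe where
  mempty = Recipe []

-- | Interpretation 1: actually produce the bytes.
run :: Recipe -> BS.ByteString
run (Recipe steps) = BS.concat (map render steps)
  where
    render (AppendBytes bs) = bs
    render (AppendByte w)   = BS.singleton w

-- | Interpretation 2: walk every instruction and add up lengths
-- without copying any data. Cost grows with the number of steps,
-- whereas a size tracked alongside the recipe is available immediately.
countLength :: Recipe -> Int
countLength (Recipe steps) = foldl' (\n s -> n + len s) 0 steps
  where
    len (AppendBytes bs) = BS.length bs
    len (AppendByte _)   = 1
```

In this model `countLength r` always equals `BS.length (run r)`, but it still has to visit each step, which is the trade-off against keeping your own counter as chunks arrive.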

iand675 commented 9 years ago

Ah, never mind then. Thanks for the help :)

chadaustin commented 9 years ago

FYI, I implemented the ability to count the length of a BufferBuilder without actually writing bytes. May not help your use case, but it might be worth testing. :) https://github.com/chadaustin/buffer-builder/commit/c62bac3622b81fa2f8d05b356f24b0fcf439e341