snoyberg / conduit

A streaming data library
903 stars 196 forks source link

Principled Treatment of sinkHandle etc. #412

Open m00ngoose opened 5 years ago

m00ngoose commented 5 years ago

It seems the variety of I/O functions exposed is a little ad-hoc.

Consider sinkHandle, it has an IO variant, a cautious variant, a Builder variant, a Flush variant, but only some combinations thereof. For instance, whilst experimenting with Conduit, I personally wanted

import qualified Data.ByteString.Builder as BSB
import qualified Data.ByteString.Builder.Extra as BSB

sinkFileBuilderFlush :: MonadResource m => FilePath -> ConduitT (Flush BSB.Builder) o m ()
sinkFileBuilderFlush fp = bracketP (openBinaryFile fp WriteMode) hClose sinkHandleBuilderFlush

sinkHandleBuilderFlush :: MonadIO m => Handle -> ConduitM (Flush BSB.Builder) o m ()
sinkHandleBuilderFlush h =
    awaitForever $ \mbs -> liftIO $
    case mbs of
      Chunk bs -> BSB.hPutBuilder h bs
      Flush -> BSB.hPutBuilder h BSB.flush

I'm not saying all possible combinations make sense, but given the symmetry between ByteString/Builder it's odd to have some without others? Perhaps candidates for conduit-extra Data.Conduit.Binary? sinkFileBuilder is another obvious one (given existence of withSinkFileBuilder).

snoyberg commented 5 years ago

Are you asking for specific helper functions to be added?

On Sun, Mar 24, 2019, 8:16 AM awhtayler notifications@github.com wrote:

It seems the variety of I/O functions exposed is a little ad-hoc.

Consider sinkHandle, it has an IO variant, a cautious variant, a Builder variant, a Flush variant, but only some combinations thereof. For instance, whilst experimenting with Conduit, I personally wanted

import qualified Data.ByteString.Builder as BSB import qualified Data.ByteString.Builder.Extra as BSB

sinkFileBuilderFlush :: MonadResource m => FilePath -> ConduitT (Flush BSB.Builder) o m () sinkFileBuilderFlush fp = bracketP (openBinaryFile fp WriteMode) hClose sinkHandleBuilderFlush

sinkHandleBuilderFlush :: MonadIO m => Handle -> ConduitM (Flush BSB.Builder) o m () sinkHandleBuilderFlush h = awaitForever $ \mbs -> liftIO $ case mbs of Chunk bs -> BSB.hPutBuilder h bs Flush -> BSB.hPutBuilder h BSB.flush

I'm not saying all possible combinations make sense, but given the symmetry between ByteString/Builder it's odd to have some without others? Perhaps candidates for conduit-extra Data.Conduit.Binary? sinkFileBuilder is another obvious one (given existence of withSinkFileBuilder).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/snoyberg/conduit/issues/412, or mute the thread https://github.com/notifications/unsubscribe-auth/AADBB4PiscKLDTEEOWjvyFdQ7dY17Evkks5vZxhAgaJpZM4cFVKe .

m00ngoose commented 5 years ago

Here's my thinking re. source:

Source IO can vary as follows:

I think having a function for every possible combination would leave the interface rather difficult to read. One common solution to this kind of problem is to have the common cases, then a function with several arguments which can be specialized across each dimension of possible behaviour.

The most flexible function currently provided is Data.Conduit.Binary sourceHandleRangeWithBuffer, which does generalize many of the other functions. However, in particular it doesn't give you safe/unsafe variance, and doesn't offer to accept an IO Handle instead.

1) Safety - I think it's better to create unsafe variance distinctly rather than giving an extra Bool parameter to indicate desired safe/unsafe behaviour. 2) IO-ness - sourceIOHandle and sourceFile actually differ from sourceHandle in that they also close the Handle opened, thus eg. sourceHandle h = sourceIOHandle (return h) would not work. Adding an extra Bool parameter indicating whether than handle should be closed is a possibility, but then documentation should be clear to avoid user errors in leaving handles unclosed. An alternative would be to return the created handle too. A third possibility is to create separate Handle vs IO Handle variants. 3) with variance - I'm personally not too fussed by these because I prefer the non-`with-prefixed versions.

My personal preference would be to add two functions, sourceIOHandleRangeWithBuffer[Unsafe] :: IO Handle -> Bool -> Maybe Int -> Maybe Int -> Int -> ConduitT i ByteString m (). If desired, all the other source_ functions could be implemented using these two functions, but more crucially, it exposes a wider range of functionality to users. For example, it would be easy to write sourceIOHandleUnsafe h = sourceIOHandleRangeWithBufferUnsafe h True (Just 0) Nothing defaultChunkSize, which would otherwise require looking at sourceIOHandle and sourceHandleUnsafe and combining them by hand.

snoyberg commented 5 years ago

That's a lot of changes, and I'm not convinced all of them are warranted. For example:

I'm not too terribly excited about trying to provide every variation, or even a maximally flexible base function, from the library. We address the most common cases, and throwing together a more manual implementation for specific corner cases isn't typically too difficult.