snoyberg / tar-conduit

Conduit-based tar extraction mechanism
MIT License

Simpler APIs #2

Closed: bartavelle closed this 7 years ago

bartavelle commented 7 years ago

I think there should be a simpler (but more wasteful) API by default before release: something like the following, which gives a stream of full files:

tarfiles :: Foo => Conduit ByteString m (Header, ByteString)

What do you think of that?

bartavelle commented 7 years ago

Perhaps even something like:

tarFiles :: Foo => Conduit ByteString m (FileType, FilePath, ByteString)

Of course, the ShortByteString-to-String conversion could be troublesome...
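
For concreteness, such an API could be sketched on top of withEntries, something like the following (an illustration only: it assumes withEntries feeds the callback each entry's Header together with that entry's payload stream, and that accessors named headerFileType and headerFilePath exist):

import           Control.Monad.Catch (MonadThrow)
import           Data.ByteString     (ByteString)
import           Data.Conduit        (Conduit, yield)
import qualified Data.Conduit.List   as CL
import           Data.Conduit.Tar    (FileType, headerFilePath, headerFileType, withEntries)

-- One tuple per entry; note that CL.consume buffers the whole payload
-- of each entry in memory before yielding it.
tarFiles :: MonadThrow m => Conduit ByteString m (FileType, FilePath, ByteString)
tarFiles = withEntries $ \hdr -> do
    chunks <- CL.consume
    yield (headerFileType hdr, headerFilePath hdr, mconcat chunks)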

snoyberg commented 7 years ago

I'm not in favor of encouraging a non-streaming API; it makes it too easy to mislead people into writing broken code.

bartavelle commented 7 years ago

How is that non-streaming? I probably wrote broken code then! I thought something like this would do the trick:

withEntries (\hdr -> CL.consume >>= yield . (hdr,) . mconcat)

It's still streaming, only with bigger chunks, or am I mistaken? My use case is tarballs containing lots of small files, so this makes a lot of sense to me.
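
As a rough usage sketch for that case (hypothetical wiring: sourceFile comes from conduit-extra, and whether a separate tar-chunk decoding stage is needed depends on withEntries's actual signature):

{-# LANGUAGE TupleSections #-}
import           Control.Monad.Trans.Resource (runResourceT)
import           Data.ByteString              (ByteString)
import           Data.Conduit                 (($$), (=$=), yield)
import qualified Data.Conduit.Binary          as CB
import qualified Data.Conduit.List            as CL
import           Data.Conduit.Tar             (Header, withEntries)

-- Reads a tarball of many small files into memory, one pair per entry.
readSmallFiles :: FilePath -> IO [(Header, ByteString)]
readSmallFiles fp = runResourceT $
    CB.sourceFile fp
        $$ withEntries (\hdr -> CL.consume >>= yield . (hdr,) . mconcat)
        =$= CL.consume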

snoyberg commented 7 years ago

It's not streaming in the sense that it will use unbounded memory on large files: you'd be susceptible to an OOM attack if you received a tarball containing a large file.

bartavelle commented 7 years ago

Yes, that is certain. The processes that do this have appropriate bounds on the amount of memory they are allowed to consume.

What about just documenting how to do this, with the appropriate warnings? I really believe this is a common use case. Perhaps a function that truncates large files, with a user-supplied size limit, could be helpful?
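
Such a truncating function might look like this sketch, capping each entry's payload with Data.Conduit.Binary.isolate (it assumes withEntries skips whatever payload bytes the callback leaves unconsumed):

import           Control.Monad.Catch (MonadThrow)
import           Data.ByteString     (ByteString)
import           Data.Conduit        (Conduit, (=$=), yield)
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List   as CL
import           Data.Conduit.Tar    (Header, withEntries)

-- Like the whole-file variant, but buffers at most `limit` bytes per entry.
tarFilesLimited :: MonadThrow m => Int -> Conduit ByteString m (Header, ByteString)
tarFilesLimited limit = withEntries $ \hdr -> do
    chunks <- CB.isolate limit =$= CL.consume  -- truncates oversized payloads
    yield (hdr, mconcat chunks)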

snoyberg commented 7 years ago

I'd rather document doing things the right way. This really is encouraging a bad practice.
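
For comparison, a constant-memory version of the same loop, sketched here just to report per-entry sizes: fold over each entry's chunks as they arrive instead of accumulating them (same withEntries assumption as above).

import           Control.Monad.Catch (MonadThrow)
import           Data.ByteString     (ByteString)
import qualified Data.ByteString     as S
import           Data.Conduit        (Conduit, yield)
import qualified Data.Conduit.List   as CL
import           Data.Conduit.Tar    (Header, withEntries)

-- Never holds more than one chunk of an entry at a time.
entrySizes :: MonadThrow m => Conduit ByteString m (Header, Int)
entrySizes = withEntries $ \hdr -> do
    len <- CL.fold (\acc bs -> acc + S.length bs) 0
    yield (hdr, len)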

bartavelle commented 7 years ago

Alright!