fpco / streaming-commons

Common lower-level functions needed by various streaming data libraries
MIT License

gzip decompression stops early without error #37

Closed dimbleby closed 7 years ago

dimbleby commented 7 years ago

Originally raised against pipes-zlib as k0001/pipes-zlib#16, but the following program displays the same fault without pipes being involved:

#!/usr/bin/env stack
{- stack
   --resolver lts-7.23
   runghc
   --package streaming-commons
   --package turtle
 -}

{-# LANGUAGE OverloadedStrings #-}

import qualified Codec.Compression.GZip as GZip
import Control.Exception (throwIO)
import qualified Data.ByteString.Lazy as L
import Control.Monad (foldM)
import Data.Streaming.Zlib
import Data.Text (unpack)

import Turtle

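-- Inflate a gzip stream chunk by chunk; WindowBits 31 selects gzip framing.
decompress :: L.ByteString -> IO L.ByteString
decompress gzipped = do
    inf <- initInflate $ WindowBits 31
    -- Feed each input chunk, accumulating output chunks in a difference list.
    ungzipped <- foldM (go' inf) id $ L.toChunks gzipped
    -- Collect whatever is still held in the inflater's output buffer.
    final <- finishInflate inf
    return $ L.fromChunks $ ungzipped [final]
  where
    go' inf front bs = feedInflate inf bs >>= go front
    -- Run the Popper until it reports PRDone, appending each chunk it yields.
    go front x = do
        y <- x
        case y of
            PRDone -> return front
            PRNext z -> go (front . (:) z) x
            PRError e -> throwIO e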

main :: IO ()
main = do
    fn <- options "GZip decompression test script" (argPath "file" "file to decompress")

    -- lazy IO solution, works!
    -- fmap GZip.decompress (L.readFile . unpack . format fp $ fn) >>= L.putStr

    -- Streaming solution, stops early without error!
    gzipped <- L.readFile . unpack . format fp $ fn
    decompress gzipped >>= L.putStr

Here my decompress is based on decompressRaw, and the test input file that reproduces the problem is the one linked from the original issue, k0001/pipes-zlib#16.

dimbleby commented 7 years ago

Ah, I wonder if this is basically the same as #28, and snoyberg/conduit#254
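If that is the cause, one possible workaround can be sketched as below. This rests on an assumption the thread does not spell out: that #28 and snoyberg/conduit#254 concern gzip input made of several concatenated members, where the inflater finishes the first member and ignores the bytes after it. The name decompressMulti is hypothetical; isCompleteInflate and getUnusedInflate are the Data.Streaming.Zlib functions for checking end-of-stream and recovering unconsumed input. Treat this as a sketch under those assumptions, not as the library's recommended fix.

```haskell
import           Control.Exception    (throwIO)
import qualified Data.ByteString      as S
import qualified Data.ByteString.Lazy as L
import           Data.Streaming.Zlib

-- Hypothetical helper: inflate every gzip member in the input, not only
-- the first. Assumes the input is a plain sequence of gzip members.
decompressMulti :: L.ByteString -> IO L.ByteString
decompressMulti gzipped = do
    inf <- newInf
    L.fromChunks <$> loop inf (L.toChunks gzipped)
  where
    newInf = initInflate (WindowBits 31)

    -- Pull every output chunk the Popper is currently willing to yield.
    drain popper = do
        res <- popper
        case res of
            PRDone    -> return []
            PRNext bs -> (bs :) <$> drain popper
            PRError e -> throwIO e

    -- No input left: collect what remains in the output buffer.
    loop inf [] = do
        final <- finishInflate inf
        return [final]
    loop inf (c : cs)
        | S.null c = loop inf cs
        | otherwise = do
            out  <- feedInflate inf c >>= drain
            done <- isCompleteInflate inf
            if not done
                then (out ++) <$> loop inf cs
                else do
                    -- This member has ended: flush its output, then hand the
                    -- unconsumed tail of the chunk to a fresh inflater.
                    final <- finishInflate inf
                    rest  <- getUnusedInflate inf
                    inf'  <- newInf
                    (\xs -> out ++ final : xs) <$> loop inf' (rest : cs)
```

In the script above, replacing the call to decompress with decompressMulti would be the way to try it. Trailing bytes that are not a valid gzip member would surface as a ZlibException rather than being dropped silently.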

snoyberg commented 7 years ago

I don't know how you created the original file, but that sounds reasonable.

On Tue, May 23, 2017 at 1:49 AM, David Hotham wrote:

Ah, I wonder if this is basically the same as #28 (https://github.com/fpco/streaming-commons/issues/28) and snoyberg/conduit#254 (https://github.com/snoyberg/conduit/issues/254)


dimbleby commented 7 years ago

Yep, I reckon that's what's going on. Sorry for the noise...

snoyberg commented 7 years ago

No worries

On Wed, May 24, 2017, 1:48 PM David Hotham wrote:

Closed #37 (https://github.com/fpco/streaming-commons/issues/37).
