ozataman / csv-conduit

Flexible, fast and constant-space CSV library for Haskell using conduits

writeHeaders in a stream with a custom type #34

Open dmvianna opened 6 years ago

dmvianna commented 6 years ago

I have a custom type and a parser for it:

data Abstracts = Abstracts
  { _id       :: !Integer
  , _abstract :: !AuAddress
  } deriving (Show, Eq)

instance FromField AuAddress where
  parseField s =
    case parseByteString (step auAddress)
         mempty (foldCase s) of
      Success x -> pure x
      Failure _ -> mzero  -- the parse error details are discarded

instance FromNamedRecord Abstracts where
  parseNamedRecord m =
    Abstracts <$>
    m .: "AUSTRALIAN_APPL_NO" <*>
    m .: "ABSTRACT_TEXT"

I can parse it fine with csv-conduit:

module Main where

import           Data.Conduit
import           Data.Conduit.Binary
import           Data.CSV.Conduit
import           Data.CSV.Conduit.Conversion

import           Addresses
import           Instances

csvset :: Char -> CSVSettings
csvset c = CSVSettings {csvSep = c, csvQuoteChar = Just '"'}

file :: FilePath
file = "./data/pat_abstracts.csv"

-- Identity stage: passes every row through unchanged (placeholder for real processing).
process :: Monad m => Conduit (Named Abstracts) m (Named Abstracts)
process = awaitForever yield

main :: IO ()
main = runResourceT $
  sourceFile file .|
  intoCSV (csvset ',') .|
  process .|
  fromCSV (csvset ',') $$
  sinkFile "./data/pat_output.csv"

However, the type of writeHeaders does not fit in this context:

main :: IO ()
main = runResourceT $
  sourceFile file .|
  intoCSV (csvset ',') .|
  process .|
  (writeHeaders (csvset ',') >> fromCSV (csvset ',')) $$
  sinkFile "./data/pat_output.csv"

GHC rejects this with a type error:

[-Wdeferred-type-errors]
    * Couldn't match type `containers-0.5.10.2:Data.Map.Internal.Map
                             r0 r0'
                     with `Named Abstracts'
      Expected type: ConduitM
                       (Named Abstracts)
                       Data.ByteString.Internal.ByteString
                       (resourcet-1.1.10:Control.Monad.Trans.Resource.Internal.ResourceT
                          IO)
                       ()
        Actual type: ConduitM
                       (MapRow r0)
                       Data.ByteString.Internal.ByteString
                       (resourcet-1.1.10:Control.Monad.Trans.Resource.Internal.ResourceT
                          IO)
                       ()
    * In the second argument of `(.|)', namely
        `(writeHeaders (csvset ',') >> fromCSV (csvset ','))'
      In the second argument of `(.|)', namely
        `process .| (writeHeaders (csvset ',') >> fromCSV (csvset ','))'
      In the second argument of `(.|)', namely
        `intoCSV (csvset ',')
           .| process .| (writeHeaders (csvset ',') >> fromCSV (csvset ','))'

I have tried, unsuccessfully, to figure out how to do this with a custom type. The headers are just plain text, so they should not go through my parser, but they need to be written to the output before any of the parsed rows.
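
One direction that seems to fit the types in the error above, as a minimal sketch rather than a confirmed answer: since writeHeaders wants a MapRow, each Named value could be turned into a MapRow ByteString with toNamedRecord before rendering. The names Demo, toMapRow and renderWithHeaders are hypothetical, Demo merely stands in for Abstracts, and the sketch assumes a ToNamedRecord instance is available for the record type.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.ByteString             (ByteString)
import qualified Data.ByteString.Char8       as B8
import           Data.Conduit
import qualified Data.Conduit.List           as CL
import           Data.CSV.Conduit
import           Data.CSV.Conduit.Conversion

-- Hypothetical stand-in for the real record type (Abstracts in the issue).
data Demo = Demo
  { demoId   :: !Int
  , demoText :: !ByteString
  }

-- Assumption: the real type has (or can be given) a ToNamedRecord instance.
instance ToNamedRecord Demo where
  toNamedRecord (Demo i t) =
    namedRecord ["AUSTRALIAN_APPL_NO" .= i, "ABSTRACT_TEXT" .= t]

-- Unwrap each Named value into the Map-based row that writeHeaders expects.
toMapRow :: (Monad m, ToNamedRecord a)
         => ConduitM (Named a) (MapRow ByteString) m ()
toMapRow = awaitForever $ \(Named a) -> yield (toNamedRecord a)

-- Render rows to CSV text, letting writeHeaders handle the first row
-- and fromCSV handle the remaining ones.
renderWithHeaders :: (Monad m, ToNamedRecord a)
                  => CSVSettings
                  -> ConduitM (Named a) ByteString m ()
renderWithHeaders set = toMapRow .| (writeHeaders set >> fromCSV set)

main :: IO ()
main = do
  chunks <- runConduit $
    CL.sourceList [Named (Demo 1 "first"), Named (Demo 2 "second")]
      .| renderWithHeaders defCSVSettings
      .| CL.consume
  B8.putStr (B8.concat chunks)
```

In the pipeline from the issue, renderWithHeaders (csvset ',') would take the place of the (writeHeaders ... >> fromCSV ...) stage. Because MapRow is a Data.Map, the columns would presumably come out in key order rather than in record-field order.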

The complete code for my project is in my conduit-patents repo.