haskell-hvr / cassava

A CSV parsing and encoding library optimized for ease of use and high performance
http://hackage.haskell.org/package/cassava
BSD 3-Clause "New" or "Revised" License

Accumulating error messages for failed fields when decoding named records #231

Open cgeorgii opened 5 months ago

cgeorgii commented 5 months ago

I currently have code that accumulates errors across rows, but reports only one error (for instance a missing field) per row. Is it somehow possible to list all missing fields when decoding a named record?

Below is the current code for reference, followed by a sample of how it behaves:

parseFromFile :: forall a. (FromNamedRecord a) => FilePath -> Char -> IO [Either String a]
parseFromFile filepath delimiter = withFile filepath ReadMode $ \csvFile -> do
  let options = defaultDecodeOptions {decDelimiter = fromIntegral (ord delimiter)}

      loopHeader :: HeaderParser (Parser a) -> IO [Either String a]
      loopHeader (FailH _ errMsg) = pure [Left errMsg]
      loopHeader (PartialH contP) = feed contP >>= loopHeader
      loopHeader (DoneH _header parser) = loop [] parser

      loop :: [Either String a] -> Parser a -> IO [Either String a]
      loop !acc (Fail _ errMsg) = pure $ acc ++ [Left errMsg]
      loop !acc (Many result contP) = feed contP >>= loop (acc ++ result)
      loop !acc (Done result) = pure (acc ++ result)

      feed :: (ByteString -> f p) -> IO (f p)
      feed cont = do
        isEof <- hIsEOF csvFile
        if isEof
          then pure $ cont empty
          else cont <$> hGetSome csvFile 4096

  loopHeader (Csv.Inc.decodeByNameWith options)

Running it against the following sample CSV yields the message below:

# persons.csv
name,age,yearsOfExperience
John Doe,,
Smith,20,4

# result
[Left "in named field \"age\": expected Int, got \"\" (not enough input)",Right (Person {name = "Smith", age = 20, yearsOfExperience = 4})]

Ideally, the message would report both the missing age field and the missing yearsOfExperience field. Is there a way to achieve this?
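
One possible direction, sketched here as an assumption and not as anything the library documents for this purpose: run each field parser on its own with runParser inside parseNamedRecord, collect the failures in a small error-accumulating applicative, and fail once with the combined message. The Validation type, the field helper, and decodePeople are hypothetical names introduced for this sketch, and the Person definition is guessed from the sample output above. Only FromNamedRecord, FromField, NamedRecord, runParser, (.:), and decodeByName come from Data.Csv.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.ByteString (ByteString)
import qualified Data.ByteString.Lazy as BL
import Data.Csv
  ( FromField, FromNamedRecord (..), NamedRecord
  , decodeByName, runParser, (.:) )
import qualified Data.List as List
import Data.Vector (Vector)

-- Assumed shape of the record, reconstructed from the sample output.
data Person = Person
  { name :: String
  , age :: Int
  , yearsOfExperience :: Int
  } deriving (Show)

-- Hypothetical helper type: like Either, but (<*>) keeps the errors
-- from both sides instead of stopping at the first one.
newtype Validation a = Validation (Either [String] a)

instance Functor Validation where
  fmap f (Validation e) = Validation (fmap f e)

instance Applicative Validation where
  pure = Validation . Right
  Validation (Left e1) <*> Validation (Left e2) = Validation (Left (e1 ++ e2))
  Validation (Left e1) <*> Validation (Right _) = Validation (Left e1)
  Validation (Right _) <*> Validation (Left e2) = Validation (Left e2)
  Validation (Right f) <*> Validation (Right x) = Validation (Right (f x))

-- Run a single field parser in isolation and capture its error message.
field :: FromField a => NamedRecord -> ByteString -> Validation a
field r key = Validation (either (Left . pure) Right (runParser (r .: key)))

instance FromNamedRecord Person where
  parseNamedRecord r =
    case Person <$> field r "name"
                <*> field r "age"
                <*> field r "yearsOfExperience" of
      -- Report every failed field of the row in one message.
      Validation (Left errs) -> fail (List.intercalate "; " errs)
      Validation (Right person) -> pure person

-- Non-incremental decoding, used here only to keep the example short.
decodePeople :: BL.ByteString -> Either String (Vector Person)
decodePeople = fmap snd . decodeByName
```

Because the accumulation lives entirely in the FromNamedRecord instance, the incremental parseFromFile loop above should not need to change: each row would still produce a single Left, but its message would list every field that failed, joined by "; ". A column missing from the header should be collected the same way, since (.:) fails for it like any other field parser.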

andreasabel commented 5 months ago

(If you are hoping to have directed this question to the maintainer of this package, please see #218.)

cgeorgii commented 5 months ago

So, essentially, cassava is unmaintained? Thank you for the answer anyway!