ozataman / csv-conduit

Flexible, fast and constant-space CSV library for Haskell using conduits

Less aggressive quoting #35

Open mightybyte opened 6 years ago

mightybyte commented 6 years ago

Currently csv-conduit writes empty fields as the quoted empty string "". Postgres rejects this with the following error when the target column has type double precision:

ERROR:  invalid input syntax for type double precision: ""

So while the current behavior is correct according to the spec, it is less broadly supported in practice. Also, if you're using csv-conduit to transform large files, the current behavior quotes every single field. That adds two bytes per field, making the resulting files noticeably larger than they need to be.

This PR only quotes fields if they contain the quote character, which is correct behavior according to the spec.

MichaelXavier commented 6 years ago

I don't know if I can justify changing the behavior for all users here. I think I'd prefer adding a flag to CSVSettings, something like data OutputQuoting = AlwaysQuote | QuoteWhenNeeded, and keeping AlwaysQuote as the default.
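
For illustration only, here is a minimal sketch of how such a flag could drive field rendering. It is not csv-conduit's actual code: it uses plain String and only base, and QuoteSettings, defQuoteSettings, needsQuoting, quoteField, renderField and renderRow are hypothetical names standing in for whatever the library would really use. Only OutputQuoting and its two constructors come from the suggestion above. In the QuoteWhenNeeded case the sketch assumes a field is quoted when it contains the separator, the quote character or a line break, which are the cases RFC 4180 lists.

```haskell
import Data.List (intercalate)

-- Flag as proposed in the comment above.
data OutputQuoting = AlwaysQuote | QuoteWhenNeeded
  deriving (Eq, Show)

-- Hypothetical stand-in for the relevant fields of CSVSettings.
data QuoteSettings = QuoteSettings
  { qsSeparator     :: Char
  , qsQuoteChar     :: Char
  , qsOutputQuoting :: OutputQuoting
  }

-- The default keeps the existing behavior: quote every field.
defQuoteSettings :: QuoteSettings
defQuoteSettings = QuoteSettings ',' '"' AlwaysQuote

-- Assumption: a field needs quotes if it contains the separator,
-- the quote character, or a line break.
needsQuoting :: QuoteSettings -> String -> Bool
needsQuoting s = any (`elem` [qsSeparator s, qsQuoteChar s, '\n', '\r'])

-- Wrap a field in quotes, doubling any embedded quote characters.
quoteField :: QuoteSettings -> String -> String
quoteField s f = [q] ++ concatMap escape f ++ [q]
  where
    q = qsQuoteChar s
    escape c
      | c == q    = [q, q]
      | otherwise = [c]

-- Render one field according to the chosen quoting mode.
renderField :: QuoteSettings -> String -> String
renderField s f = case qsOutputQuoting s of
  AlwaysQuote -> quoteField s f
  QuoteWhenNeeded
    | needsQuoting s f -> quoteField s f
    | otherwise        -> f

-- Render a row by joining the rendered fields with the separator.
renderRow :: QuoteSettings -> [String] -> String
renderRow s = intercalate [qsSeparator s] . map (renderField s)
```

With defQuoteSettings, an empty field is still written as "", so existing users see no change. With defQuoteSettings { qsOutputQuoting = QuoteWhenNeeded }, an empty field is written as nothing at all, which Postgres COPY in CSV format reads as NULL by default rather than failing to parse it as a number.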

mightybyte commented 6 years ago

Ahh yes, good idea. I'll try to get to it when I have some free time.