JoshClose / CsvHelper

Library to help reading and writing CSV files
http://joshclose.github.io/CsvHelper/

Add TreatUnquotedEmptyFieldAsNull property in IReaderConfiguration #2190

Open fan130 opened 10 months ago

fan130 commented 10 months ago

Is your feature request related to a problem? Please describe.
I have CSV files in which only string values are quoted, and an unquoted empty field represents a null value. For example, in the line 1,"",2 the second field should be parsed as an empty string because it is quoted. In the line 1,,2 the second field should be parsed as null. Currently CsvHelper cannot distinguish between a quoted empty field and an unquoted empty field.

Describe the solution you'd like
Add a property TreatUnquotedEmptyFieldAsNull to IReaderConfiguration. When it is true, GetField on CsvParser should return null for an unquoted empty field. The default value can be false. I've looked at the source code, and the change should be small: add a check in GetField so that if field.Length == 0, field.QuoteCount == 0, and config.TreatUnquotedEmptyFieldAsNull is true, GetField simply returns null.
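
To make the requested behaviour concrete, here is a minimal sketch of the check as a standalone Python tokenizer. It is not CsvHelper's C# parser and does not use its API: the names parse_line and treat_unquoted_empty_as_null are hypothetical, and the local quote_count only plays the role that field.QuoteCount plays in the description above. It handles a single line with no embedded newlines.

```python
from typing import List, Optional


def parse_line(
    line: str,
    treat_unquoted_empty_as_null: bool = False,  # hypothetical flag mirroring the proposed property
    delimiter: str = ",",
    quote: str = '"',
) -> List[Optional[str]]:
    """Split one CSV line into fields, remembering whether each field was quoted."""
    fields: List[Optional[str]] = []
    buf: List[str] = []
    quote_count = 0      # opening/closing quotes seen in the current field
    in_quotes = False

    def finish_field() -> None:
        nonlocal buf, quote_count
        value = "".join(buf)
        # The proposed check: empty content AND no quotes AND the flag enabled.
        if treat_unquoted_empty_as_null and value == "" and quote_count == 0:
            fields.append(None)
        else:
            fields.append(value)
        buf = []
        quote_count = 0

    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < len(line) and line[i + 1] == quote:
                    buf.append(quote)  # doubled quote is an escaped literal quote
                    i += 1
                else:
                    in_quotes = False
                    quote_count += 1
            else:
                buf.append(ch)
        elif ch == quote:
            in_quotes = True
            quote_count += 1
        elif ch == delimiter:
            finish_field()
        else:
            buf.append(ch)
        i += 1

    finish_field()  # the last field has no trailing delimiter
    return fields


print(parse_line('1,"",2', treat_unquoted_empty_as_null=True))
print(parse_line('1,,2', treat_unquoted_empty_as_null=True))
```

With the flag enabled, the quoted empty field stays an empty string while the unquoted one becomes None. With the flag left at its default of False, both lines produce an empty string, which matches the current behaviour described in this issue.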

Describe alternatives you've considered
There is a related issue: https://github.com/JoshClose/CsvHelper/issues/252. It seems to have been closed by adding support for explicitly setting a null value. Unfortunately that solution does not work for me: I do not generate the CSV files, so I cannot control how null is represented in them. I know there is no official contract for null values in CSV, but I think using an unquoted empty field to represent null is a common practice. Spark also supports this, so I hope CsvHelper can support it as well. Thanks.