jehugaleahsa / FlatFiles

Reads and writes CSV, fixed-length and other flat file formats with a focus on schema definition, configuration and speed.
The Unlicense

Need for quoted strings in all cases #21

Closed JoaCHIP closed 7 years ago

JoaCHIP commented 7 years ago

I'm writing an exporter for a company who wants .csv files with FieldSeparator = "|" (pipe character) and Quote = "^". They want the quote character around all fields in all cases, even empty strings. Only null values should not have ^^ on each side. I realize this is a slightly odd format to ask for, but I think it still falls within the concept of a .csv file.
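To make the requested format concrete, here is a minimal sketch of the output described above. It is not FlatFiles code: `PipeCaretFormat`, `FormatField` and `FormatRecord` are hypothetical names, and doubling an embedded `^` to escape it is an assumption the request does not spell out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of FlatFiles: it only illustrates the requested output.
public static class PipeCaretFormat
{
    private const string Separator = "|";
    private const char Quote = '^';

    // null -> nothing at all; any other string (including "") -> wrapped in quotes.
    public static string FormatField(string value)
    {
        if (value == null)
        {
            return string.Empty;
        }
        // Assumption: an embedded quote character is escaped by doubling it.
        string escaped = value.Replace("^", "^^");
        return Quote + escaped + Quote;
    }

    // Joins the formatted fields with the pipe separator.
    public static string FormatRecord(IEnumerable<string> fields)
    {
        return string.Join(Separator, fields.Select(FormatField));
    }
}
```

With this sketch, the record `{ "abc", "", null }` comes out as `^abc^|^^|`: the empty string is still quoted, while the null field is left completely empty.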

But because of the logic in the SeparatedValueRecordWriter.cs / needsEscaped() call, the Quote character is only inserted for non-empty strings in certain cases, so I guess a new switch would be needed. Something like AlwaysQuoteStrings = true.

I wonder if FlatFiles could be / should be expanded to be able to do this? (I would have tried the change myself, but for some odd reason this solution causes VS 2017 to crash on load every time I try.)

jehugaleahsa commented 7 years ago

There are actually two changes needed for this: always quoting and distinguishing between null and empty strings.

Changing the code to always quote was pretty easy. I am still thinking about how to handle the second case. Suppose someone uses the default quoting, so they aren't "always quoting". In that case, null and an empty string are the same thing. In that sense, distinguishing between null and empty strings is really specific to "always quoting". It's like a configuration on top of a configuration, if you follow me.
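The "configuration on top of a configuration" point can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `QuoteMode`, `NullVersusEmpty` and `Write` are hypothetical names, and the rule for when a value "needs" quoting (it contains the separator, the quote character, or a newline) is a simplification.

```csharp
using System;

// Hypothetical stand-in for a quoting setting; not the FlatFiles type.
public enum QuoteMode { WhenNeeded, Always }

public static class NullVersusEmpty
{
    // Writes one field using "|" as the separator and "^" as the quote character.
    public static string Write(string value, QuoteMode mode)
    {
        // null is never quoted in either mode.
        if (value == null)
        {
            return "";
        }
        // Assumption: quoting is "needed" only for separator, quote, or newline.
        bool needsQuote = mode == QuoteMode.Always
            || value.Contains("|")
            || value.Contains("^")
            || value.Contains("\n");
        if (!needsQuote)
        {
            return value;
        }
        return "^" + value.Replace("^", "^^") + "^";
    }
}
```

Under `QuoteMode.WhenNeeded`, both `null` and `""` produce an empty field, so nothing in the output tells them apart. Only under `QuoteMode.Always` do they diverge (`""` becomes `^^`, `null` stays empty), which is why the null/empty distinction only has meaning once always-quoting is switched on.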

Will let you know what I come up with.


JoaCHIP commented 7 years ago

Good point. And thanks.

cbone commented 7 years ago

I have dug super deeply into the code, but could you do something like this? Forgive me if I'm not accounting for something that would be revealed if I dug in a bit more.

    private bool needsEscaped(string value, bool escapeEmptyStrings)
    {
        // Never escape null; escape empty strings only when requested.
        if (value == null)
        {
            return false;
        }
        if (value == String.Empty)
        {
            return escapeEmptyStrings;
        }
        .....
        .....
    }

Something along those lines... Does this make sense and/or solve the issue?

jehugaleahsa commented 7 years ago

Yeah, that's where I'm looking to make the change. Good to know my code's not horribly cryptic. :-)


cbone commented 7 years ago

The code is very good. Thanks for sharing it :+1:

Ha, and I meant I "HAVEN'T" dug super deeply into the code.

jehugaleahsa commented 7 years ago

I ended up needing to upgrade to .NET Core 2.0, which shouldn't have any impact on you. It took way longer than I expected... my unit tests refused to be found in VS2017. After overthinking it for probably way too long, I decided to simply distinguish between null and an empty string, as you had suggested in your example code above. You should find the new QuoteBehavior property on the SeparatedValueOptions class. You can set it to AlwaysQuote and you should see the behavior you're after. Let me know otherwise.
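For the pipe-and-caret format from the original request, the options might be configured roughly like this. Treat it as a sketch: `QuoteBehavior` and `AlwaysQuote` come from the comment above, while the `Separator` and `Quote` property names, their types, and the `QuoteBehavior.AlwaysQuote` enum spelling are assumptions that should be checked against the installed FlatFiles version.

```csharp
using FlatFiles;

// Assumption: Separator is a string and Quote is a char in the installed version.
// QuoteBehavior / AlwaysQuote are the names given in the comment above.
var options = new SeparatedValueOptions
{
    Separator = "|",
    Quote = '^',
    QuoteBehavior = QuoteBehavior.AlwaysQuote
};
```

The options object would then be passed to whichever separated-value writer the exporter already uses.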