davidsantiago / clojure-csv

A library for reading and writing CSV files from Clojure

write-csv quoting of string containing quotes #12

Closed scottbale closed 12 years ago

scottbale commented 12 years ago

Is it expected that writing a value such as

(write-csv [[ "foo bar"]]) 

yields

"foo bar\n"

but writing a value containing a quoted portion such as

(write-csv [[ "\"foo\" bar"]])

yields

"\"\"\"foo\"\" bar\"\n"

? Having that whole string quoted with an extra set of double quotes is causing us to have an issue.

davidsantiago commented 12 years ago

Hi Scott, I believe this is correct. Your value with a quoted portion starts with the five characters ", f, o, o, ", but the " character has special meaning in the CSV format (it delimits quoted fields). The CSV spec says that a field containing the " character has to be quoted. In CSV, that is done by surrounding the entire field in quotes and then replacing every " inside the field with "". The doubled double quote signifies "just a single double quote, not the end of the quoted field." So, looking at what write-csv yields, it adds a " to the beginning and end of the field, and replaces each of the two " characters in the string itself with two " characters. All are accounted for.
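To make the rule concrete, here is a minimal sketch of it in plain Clojure. The helpers quote-field and unquote-field are hypothetical names made up for illustration; they are not part of clojure-csv and are not how write-csv is implemented. They only mirror the rule described above.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper (NOT part of clojure-csv): applies the quoting
;; rule described above to a single field.
(defn quote-field
  [field]
  (if (re-find #"[\",\n\r]" field)
    ;; Field contains a quote, comma, or line break: wrap the whole
    ;; field in quotes and double every quote inside it.
    (str "\"" (str/replace field "\"" "\"\"") "\"")
    ;; Nothing special in the field: write it unchanged.
    field))

;; Hypothetical inverse: strips the surrounding quotes and collapses
;; each doubled quote back to a single one.
(defn unquote-field
  [field]
  (if (and (>= (count field) 2)
           (str/starts-with? field "\"")
           (str/ends-with? field "\""))
    (-> field
        (subs 1 (dec (count field)))
        (str/replace "\"\"" "\""))
    field))

(quote-field "foo bar")
;; => "foo bar"

(quote-field "\"foo\" bar")
;; => "\"\"\"foo\"\" bar\""

(unquote-field (quote-field "\"foo\" bar"))
;; => "\"foo\" bar"
```

The last call shows that the original value comes back unchanged once the field is decoded, so the extra quotes exist only in the encoded CSV text.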

I realize this explanation might be confusing; not sure exactly what terminology to use for the different uses of quotes. What problem are you having because of this? I can try to suggest a solution.

David

scottbale commented 12 years ago

Thanks for the quick reply. The short answer is that our output has to conform to a grammar, in this case the Turtle grammar (http://www.w3.org/TeamSubmission/turtle/#sec-grammar), in which literals can be of the form

"foo"^^<http://example.org/my/datatype>

or

"""10"""^^xsd:decimal

scottbale commented 12 years ago

My apologies - just had a pow wow with my teammates and we realized everything's fine. As you said, it's correct CSV encoding, and we can successfully round-trip it. This issue can be closed. Thanks.