dashbitco / nimble_csv

A simple and fast CSV parsing and dumping library for Elixir
https://hexdocs.pm/nimble_csv

Escaping double quotes within text. #39

Closed: Ajwah closed this issue 5 years ago

Ajwah commented 5 years ago
NimbleCSV.define(MyParser, separator: "|", escape: "\"")
MyParser.parse_string "name|age\njohn|27\nsay: \"Hello\"|32"

Results in:

** (NimbleCSV.ParseError) unexpected escape character " in "say: \"Hello\"|32"
    deps/nimble_csv/lib/nimble_csv.ex:348: MyParser.separator/5
    deps/nimble_csv/lib/nimble_csv.ex:281: anonymous fn/4 in MyParser.parse_enumerable/2

Whereas the expected result would be: [["john", "27"], ["say: \"Hello\"", "32"]]

josevalim commented 5 years ago

CSV requires the whole column to be escaped. You can't just quote in the middle of the column. So you should either disable " as the escape or properly escape it according to the CSV rules, like this:

MyParser.parse_string ~s(name|age\njohn|27\n"say: ""Hello"""|32)
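
To illustrate the escaping rule described above, here is a minimal sketch in plain Elixir of how a field could be quoted before it is written out. The module `CsvEscapeSketch` and its functions `escape_field` and `dump_row` are hypothetical names made up for this example and are not part of NimbleCSV; the sketch assumes `|` as the separator and `"` as the escape, as in the report.

```elixir
defmodule CsvEscapeSketch do
  # Hypothetical helper for illustration only; not NimbleCSV's implementation.

  # Wraps the whole field in the escape character when the field contains the
  # separator, the escape character, or a line break. Escape characters inside
  # the field are doubled, as the CSV rules require.
  def escape_field(field, separator \\ "|", escape \\ "\"") do
    if String.contains?(field, [separator, escape, "\n", "\r"]) do
      escape <> String.replace(field, escape, escape <> escape) <> escape
    else
      field
    end
  end

  # Escapes every field of a row and joins the fields with the separator.
  def dump_row(fields, separator \\ "|") do
    Enum.map_join(fields, separator, &escape_field(&1, separator))
  end
end

IO.puts(CsvEscapeSketch.dump_row(["say: \"Hello\"", "32"]))
# prints: "say: ""Hello"""|32
```

The printed row matches the properly escaped line in the snippet above, so a parser that uses `"` as its escape should accept it.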

Ajwah commented 5 years ago

I tried not setting the escape character as you suggested:

NimbleCSV.define(MyParser, separator: "|")
MyParser.parse_string "name|age\njohn|27\nsay: \"Hello\"|32"

This still results in the same error. If I set the escape to a nonsensical string instead, it works just fine:

NimbleCSV.define(MyParser, separator: "|", escape: ">>>")
MyParser.parse_string "name|age\njohn|27\nsay: \"Hello\"|32"
[["john", "27"], ["say: \"Hello\"", "32"]]

josevalim commented 5 years ago

Yes, because the default escape character is the double quote. You can set it to the null byte too: "\0".