dashbitco / nimble_csv

A simple and fast CSV parsing and dumping library for Elixir
https://hexdocs.pm/nimble_csv

Problem with commas when getting a CSV from some APIs #9

Closed mbenatti closed 7 years ago

mbenatti commented 7 years ago

The API used to load the CSV: https://github.com/GrandCru/GoogleSheets

Problem: When a field has a comma inside, like "description, is, this!", the Google Sheets API loads the field as "\"description,is,this!\"". Since the standard CSV escape is \", this is a problem when loading and parsing with the Elixir Sheets library, because the parser truncates the field at the comma inside it.

How to test: load a Google spreadsheet whose fields contain commas, using the google_sheets Elixir library together with nimble_csv.

This could be fixed by changing the escape character when downloading the CSV from the Google API, but I can't figure out how to change it from a comma to another character in the API.

Link to download from api is like this: https://docs.google.com/spreadsheets/d//export?gid=0&format=csv

josevalim commented 7 years ago

@mbenatti can you please provide a CSV file or a short example that reproduces this? I am aware you mentioned a couple of steps to get a possible file from the Google API, but I would prefer to focus on a fixture file rather than trying to reproduce a bug through the API.

FWIW, I have introduced this test to the test suite and it passes:

assert CSV.parse_string("""
    name,year
    "doe, john",1986
    "jane, mary",1985
    """) == [["doe, john", "1986"], ["jane, mary", "1985"]]

If you have a similar fixture, I would love to take a look at it.

mbenatti commented 7 years ago

@josevalim Yes, I made a repository for this: https://github.com/mbenatti/csv_sheets_bug

You just need to look at the file lib/csv_sheets_bug.ex

To run it, start "iex -S mix" and then call: CSVSheetsBug.find_a_bug

josevalim commented 7 years ago

@mbenatti you need to change this line:

https://github.com/mbenatti/csv_sheets_bug/blob/master/lib/csv_sheets_bug.ex#L19

to be:

CSV.define(MyParser, separator: "\t", escape: "\\")

Since you want a backslash for escaping and not a quote. Note that you want to define the parser at the top of the file and not inside a function.
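For anyone reading later, here is a minimal sketch of what "define the parser at the top of the file" could look like. It is not the code from the linked repository: the `SheetRows` module and its `parse/1` function are hypothetical names, the `Mix.install` line only exists so the snippet runs as a standalone script, and the call is written with the full `NimbleCSV.define/2` name instead of an alias.

```elixir
# Standalone-script setup (assumption: in a Mix project you would instead
# list :nimble_csv in the deps of mix.exs).
Mix.install([{:nimble_csv, "~> 1.2"}])

# Define the parser once, at the top level of the file, so the MyParser
# module is created a single time rather than on every function call.
NimbleCSV.define(MyParser, separator: "\t", escape: "\\")

defmodule SheetRows do
  # Hypothetical helper: parses a tab-separated body into a list of rows.
  # parse_string/1 skips the first (header) row by default.
  def parse(body) when is_binary(body) do
    MyParser.parse_string(body)
  end
end

rows = SheetRows.parse("name\tdescription\n1\tdescription, is, this!\n")
IO.inspect(rows)
```

Because the separator here is a tab, a comma inside a field is just ordinary field content and no quoting is needed for it.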