michalmonday / CSV-Parser-for-Arduino

It turns a CSV string into an associative array (like a dict in Python)
MIT License

Quote marks inside string breaking columns #26

Closed. millab closed this issue 7 months ago

millab commented 7 months ago

Hi! I'm just starting out with this library and I'm running into an issue when parsing the following line with cp.readSDfile():

2,,9.5,OFF,,,"After ""Bring out your dead""",,,,,

I would expect it to output as:

2 | | 9.5 | OFF | | | After "Bring out your dead" | | | | |

Instead it is parsing as:

2 | | 9.5 | OFF | | | After | Bring out your dead | | | |

This then breaks the remainder of the operation on later rows, offsetting columns.
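For reference, the expected behaviour follows the usual CSV convention (the one Excel uses on export): inside a quoted field, a doubled quote mark stands for one literal quote mark. Below is a minimal sketch of that rule in plain C++. `split_csv_line` is a hypothetical helper written for illustration only; it is not part of CSV_Parser and says nothing about how the library is implemented.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (NOT part of CSV_Parser): splits one CSV line into
// fields. Inside a quoted field, a doubled quote ("") is read as a single
// literal quote character; a lone quote closes the quoted section.
std::vector<std::string> split_csv_line(const std::string& line,
                                        char delim = ',',
                                        char quote = '"') {
    std::vector<std::string> fields;
    std::string field;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == quote) {
                if (i + 1 < line.size() && line[i + 1] == quote) {
                    field += quote;     // escaped quote: keep one, skip the other
                    ++i;
                } else {
                    in_quotes = false;  // lone quote: end of quoted section
                }
            } else {
                field += c;             // delimiters inside quotes are plain text
            }
        } else if (c == quote) {
            in_quotes = true;           // start of quoted section
        } else if (c == delim) {
            fields.push_back(field);    // delimiter outside quotes ends the field
            field.clear();
        } else {
            field += c;
        }
    }
    fields.push_back(field);            // last field has no trailing delimiter
    return fields;
}
```

Calling `split_csv_line` on the line above yields 12 fields, with the seventh holding the text After "Bring out your dead" as a single value.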

For context, this file is saved from Excel, and is expected to be edited by end users in Excel before being exported to CSV and loaded on the ESP32. It is reasonable to assume a user will run into this issue when writing spoken quotes in a cell.

The row in Excel shows as:

[Image: Screenshot 2024-02-03 012015]

michalmonday commented 7 months ago

Hello, would it be possible to post here the exact file you used? And possibly the code?

I tried the following example and it worked as expected:

#include <CSV_Parser.h>

void setup() {
  Serial.begin(115200);
  delay(5000);

  // String literals are const, so point to them with const char *.
  const char * csv_str = "to_ignore,my_strings,my_numbers\n"
                         ",hello,5\n"
                         ",world,10\n"
                         ",\"After \"\"Bring out your dead\"\"\",15\n";

  CSV_Parser cp(csv_str, /*format*/ "-sL");

  cp.print();
}

void loop() {
}

It printed:

CSV_Parser content:
rows_count = 3, cols_count = 3
   Header:
       -  | my_strings | my_numbers
   Types:
      - | char* | int32_t
   Values:
      - | hello | 5
      - | world | 10
      - | After "Bring out your dead" | 15
Memory occupied by values themselves = 46
sizeof(CSV_Parser) = 44

I also tried the same example without enclosing the string in quotes:

  const char * csv_str = "to_ignore,my_strings,my_numbers\n"
                         ",hello,5\n"
                         ",world,10\n"
                         ",After \"Bring out your dead\",15\n";

and the output was the same, as expected.

millab commented 7 months ago

Thanks for taking a look at this! I'm using the reading_from_sd_card example sketch with this file: spamalotv5.csv

michalmonday commented 7 months ago

What format string did you use? I'm just trying to use the same scenario to recreate the issue.

millab commented 7 months ago

> What format string did you use? I'm just trying to use the same scenario to recreate the issue.

I just set them all to strings to test:

CSV_Parser cp(/*col formats*/ "ssssssssssss", /*has_header*/ true, /*delimiter*/ ',');

michalmonday commented 7 months ago

Sorry about that, there is a bug when the text is supplied character by character. I will try to fix it.

michalmonday commented 7 months ago

I just created the 1.3.0 release, where this issue should be fixed. It may take some time before it's available through the Arduino Library Manager, but you could copy CSV_Parser.cpp directly from this repository and put it in the "CSV_Parser" folder inside your Arduino libraries folder.

Btw, this part was added to CSV_Parser.cpp to address the issue:

    if (!whole_csv_supplied && !strpbrk(next_quote + 1, delim_chars)) {
      s = next_quote + 1;
      continue;
    }
    ending_quote_found = true;

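Before that change, the code assumed the ending quote had been found even if it wasn't followed by a delimiter, "\n" or "\r". With this check, when the CSV is supplied in pieces, a quote is only treated as the ending quote once a delimiter or line ending follows it.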
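To picture why incremental input matters, here is a simplified standalone sketch in plain C++. It is not the library's code: `find_closing_quote` and its parameters are hypothetical names, and it checks only the character immediately after the quote, which is an assumption made for illustration. The idea is that when the buffered text ends right after a quote, that quote may be the first half of an escaped pair ("") whose second half has not arrived yet, so no decision should be made.

```cpp
#include <cstring>

// Hypothetical sketch (NOT the actual CSV_Parser code).
// `s` points just past the opening quote of a quoted field.
// Returns a pointer to the closing quote, or nullptr when the closing
// quote cannot be identified yet (more input is needed).
const char* find_closing_quote(const char* s,
                               const char* delim_chars,
                               bool whole_input_supplied) {
    while (const char* q = std::strchr(s, '"')) {
        const char next = q[1];
        if (next == '"') {
            s = q + 2;          // doubled quote: escaped, keep searching
            continue;
        }
        if (next == '\0') {
            // The buffer ends right after this quote. With complete input it
            // is the closing quote; with partial input it may be the first
            // half of an escaped pair, so report "not decided yet".
            return whole_input_supplied ? q : nullptr;
        }
        if (std::strchr(delim_chars, next)) {
            return q;           // followed by a delimiter or line ending
        }
        s = q + 1;              // followed by ordinary text: not a closing quote
    }
    return nullptr;
}
```

With the buffer holding only `After "` and `whole_input_supplied` set to false, the function returns nullptr instead of ending the field early; once the rest of the row has arrived, the closing quote is the one directly before the next comma.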

Thank you for reporting this. I hope this change does not break anything else.

michalmonday commented 7 months ago

Btw, that CSV file appears to be large. If you run out of memory, you may want to consider parsing it row by row: https://github.com/michalmonday/CSV-Parser-for-Arduino/blob/master/examples/parsing_row_by_row_sd_card/parsing_row_by_row_sd_card.ino

millab commented 7 months ago

Thanks so much! I've just tested it and it is working perfectly for me now.