faisalman / simple-excel-js

SimpleExcel.js - WIP client-side script to parse / convert / write XML / CSV / TSV / HTML / JSON / etc formats.
http://faisalman.github.io/simple-excel-js/

Parsing CSV Issue: New lines within quotes not parsed correctly #18

Open stripathi669 opened 3 years ago

stripathi669 commented 3 years ago

Hi

When a quoted cell contains line breaks (a multi-line entry), it is not read / parsed correctly.

Consider a CSV entry like this:

Client Name,Display Name,Phone Number,Email,Notes
ABCDE,-,111,abc@gmail.com,"Part 1
Part 2
Part 3
Part 4"

The expected parsing is:

Row 1 -> Client Name,Display Name,Phone Number,Email,Notes
Row 2 -> ABCDE,-,111,abc@gmail.com,"Part 1\nPart 2\nPart 3\nPart 4" (a single multi-line entry)

But right now it is being parsed as:

Row 1 -> Client Name,Display Name,Phone Number,Email,Notes
Row 2 -> ABCDE,-,111,abc@gmail.com, Part 1
Row 3 -> Part 2
Row 4 -> Part 3
Row 5 -> Part 4

stripathi669 commented 3 years ago

I suspect this is due to Regex.LINEBREAK, which does not ignore line breaks when they are inside quotes.

Relevant discussion here: https://stackoverflow.com/questions/10407697/split-a-csv-string-by-line-skipping-newlines-contained-between-quotes
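
To illustrate the idea, here is a minimal sketch of a quote-aware record splitter. It is not SimpleExcel.js code: `splitCsvRecords` is a hypothetical helper, and it assumes fields are quoted with double quotes and that a literal quote inside a field is escaped by doubling it (`""`). It scans the text character by character instead of splitting on a line-break regex, and only treats a line break as a record separator when it is outside quotes.

```javascript
// Hypothetical helper (not part of SimpleExcel.js): split CSV text into
// records, keeping line breaks that occur inside double-quoted fields.
function splitCsvRecords(text) {
  const records = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '"') {
      // Every quote toggles the state. An escaped quote ("") toggles twice,
      // so the state is unchanged after the pair.
      inQuotes = !inQuotes;
      current += ch;
    } else if ((ch === "\n" || ch === "\r") && !inQuotes) {
      // Outside quotes: end of record. Treat \r\n as one separator.
      if (ch === "\r" && text[i + 1] === "\n") i++;
      records.push(current);
      current = "";
    } else {
      // Ordinary character, or a line break inside a quoted field.
      current += ch;
    }
  }
  // Keep the last record if the text does not end with a line break.
  if (current.length > 0) records.push(current);
  return records;
}

const sample =
  "Client Name,Display Name,Phone Number,Email,Notes\n" +
  'ABCDE,-,111,abc@gmail.com,"Part 1\nPart 2\nPart 3\nPart 4"';

const rows = splitCsvRecords(sample);
console.log(rows.length); // 2
```

With the sample above, the Notes cell stays in Row 2 together with its embedded line breaks. Splitting each record into cells would need the same quote tracking for commas; that step is left out here.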