hougaard / Youtube-Video-Sources

All the source code from my Youtube videos
https://youtube.com/c/ErikHougaard

CSV #2

Open Grueslayer opened 3 years ago

Grueslayer commented 3 years ago

Hi Erik, "CSV Buffer" is really a bad implementation in BC.

You just "removed" the double quotes from the text value (that's also implemented by MS with a third parameter).

That makes no sense: the double quote is the text qualifier. It encloses the cell value so that characters such as the delimiter are read as part of the value.

Let's assume one row has three cells, ABC, DEF and GHI. Exported to CSV from an English Excel, it will usually look like

ABC,DEF,GHI

because English Excel uses the comma as its default delimiter. Now add commas to the middle cell, so the three cells are ABC, D,E,F and GHI:

ABC,"D,E,F",GHI

That is still three cells in Excel; in the CSV the text gets surrounded by double quotes. So if a cell starts with a double quote, you have to keep reading the cell value until you reach the closing double quote. That is NOT implemented by the ReadLines procedure (at least in BC18). It will split the line above into 5 cells!!!
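To make the difference concrete, here is a small sketch in Python rather than AL. The plain `split` only stands in for a parser that ignores the text qualifier; it is not the actual ReadLines code. Python's standard `csv` module serves as the quote-aware reference.

```python
import csv
import io

line = 'ABC,"D,E,F",GHI'

# Naive approach: split on every comma and ignore the text qualifier.
naive_cells = line.split(",")

# Quote-aware approach: the csv module reads "..." as a single cell.
quoted_cells = next(csv.reader(io.StringIO(line)))

print(len(naive_cells), naive_cells)
print(len(quoted_cells), quoted_cells)
```

The naive split yields five pieces, with stray quote characters left in two of them, while the quote-aware reader returns the three cells Excel would show.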

Furthermore, inside a quoted cell a doubled double quote escapes a single one (mostly seen where " is used as shorthand for inches). ABC, 5" tube, GHI will be

ABC,"5"" tube",GHI
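The escaping rule can be checked with a round trip. This is a sketch using Python's standard `csv` module with its default settings (comma delimiter, double quote as qualifier, doubled quotes as escape); it says nothing about how BC writes files.

```python
import csv
import io

row = ["ABC", '5" tube', "GHI"]

# Write the row: the cell containing a quote gets wrapped in quotes
# and the embedded quote is doubled.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(row)
encoded = buffer.getvalue()

# Read it back: the doubled quote collapses to a single one again.
decoded = next(csv.reader(io.StringIO(encoded)))

print(encoded, end="")
print(decoded)
```

Simply stripping every double quote from the encoded line would turn the cell into `5 tube` and lose the inch mark.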

The same method is used when you enter multiline text in a cell, so now you have line breaks within one row:

A1,B1,C1
A2,"B2Line1
B2Line2
B2Line3",C2
A3,B3,C3
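A short Python sketch shows why reading such a file line by line cannot work: the file has more physical lines than records. The sample text here is an assumption modelled on the rows above, with commas used as the delimiter throughout.

```python
import csv
import io

text = 'A1,B1,C1\nA2,"B2Line1\nB2Line2\nB2Line3",C2\nA3,B3,C3\n'

# Physical lines: what a line-by-line reader would see.
physical_lines = text.splitlines()

# Logical records: a quote-aware reader keeps the line breaks inside the cell.
# newline="" stops StringIO from translating the embedded line breaks.
records = list(csv.reader(io.StringIO(text, newline="")))

print(len(physical_lines), "physical lines")
print(len(records), "records")
print(repr(records[1][1]))
```

Five physical lines collapse into three records, and the middle cell of the second record still contains its two line breaks.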

It is only usable for very simple CSV files.

I've written a new procedure that parses the stream character by character with a small state machine. Otherwise you really have to make sure that the input CSV never uses the separator inside its cell values and never escapes double quotes.
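For illustration, such a state machine might look roughly like this. It is a minimal Python sketch, not the AL procedure mentioned above; the names `State` and `parse_csv` are made up for the example, and it assumes a comma delimiter and that a quote only opens a quoted cell when it is the first character of that cell.

```python
from enum import Enum, auto


class State(Enum):
    UNQUOTED = auto()    # inside a plain cell, or at the start of a cell
    QUOTED = auto()      # inside a "..." cell
    QUOTE_SEEN = auto()  # just read a quote while inside a quoted cell


def parse_csv(text, delimiter=",", quote='"'):
    """Parse CSV text char by char into a list of rows (lists of cells)."""
    rows, row, cell = [], [], []
    state = State.UNQUOTED

    def end_cell():
        row.append("".join(cell))
        cell.clear()

    def end_row():
        end_cell()
        rows.append(list(row))
        row.clear()

    for ch in text:
        if state is State.QUOTED:
            if ch == quote:
                state = State.QUOTE_SEEN  # closing quote or first half of ""
            else:
                cell.append(ch)           # delimiters and line breaks are data
            continue
        if state is State.QUOTE_SEEN:
            if ch == quote:
                cell.append(quote)        # "" inside quotes is one literal "
                state = State.QUOTED
                continue
            state = State.UNQUOTED        # it was the closing quote; handle ch below
        if ch == quote and not cell:
            state = State.QUOTED          # opening quote at the start of a cell
        elif ch == delimiter:
            end_cell()
        elif ch == "\n":
            end_row()
        elif ch == "\r":
            pass                          # drop CR outside quotes so CRLF files work
        else:
            cell.append(ch)

    # Flush the last row if the text does not end with a line break.
    if cell or row or state is not State.UNQUOTED:
        end_row()
    return rows


sample = 'ABC,"D,E,F",GHI\nABC,"5"" tube",GHI\nA2,"B2Line1\nB2Line2",C2\n'
for parsed_row in parse_csv(sample):
    print(parsed_row)
```

Three states are enough because the only ambiguity is a quote inside a quoted cell: it is either the closing quote or the first half of an escaped one, and the next character decides which.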

hougaard commented 3 years ago

.. not disagreeing :)