glamod / glamod_landQC

QC for land data

Fix bug in reading of `psv` files where quotes affect parsing #93

Closed rjhd2 closed 11 months ago

rjhd2 commented 11 months ago

Issue in the latest set of mff files from Matt. Some were failing with odd errors: either the wrong number of columns, or EOFs occurring at seemingly random points.

Matt confirmed that he thought the column count was correct (238). Some more digging with a spreadsheet pointed to quote marks (") in some fields causing the lines to be misparsed.
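A quick way to see this kind of misparse is to count the fields per line with and without quote handling. This is only a sketch using the standard-library `csv` module on made-up data: the sample rows, the column names and the `column_counts` helper are hypothetical and are not taken from the mff files or from the repository.

```python
import csv
import io

# Hypothetical pipe-separated sample with 4 columns; the second line
# contains two stray quote marks at the start of fields.
SAMPLE = (
    "id|lat_flag|lon_flag|value\n"
    'A001|"N|"E|1.5\n'
    "A002|N|E|2.5\n"
)


def column_counts(text, quoting):
    """Return the number of fields found on each parsed row."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=quoting)
    return [len(row) for row in reader]


# Default behaviour: a quote at the start of a field opens a quoted
# string, so the following delimiter is absorbed into that field.
default_counts = column_counts(SAMPLE, csv.QUOTE_MINIMAL)

# Quote marks treated as ordinary characters.
literal_counts = column_counts(SAMPLE, csv.QUOTE_NONE)

print("default:", default_counts)
print("no quoting:", literal_counts)
```

Rows whose count differs between the two runs are the ones where a quote mark is changing how the line is split.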

rjhd2 commented 11 months ago

The `quoting` kwarg in `pd.read_csv()` controls this. From the pandas docs:

quoting : int or csv.QUOTE_* instance, default 0

    Control field quoting behavior per csv.QUOTE_* constants. Use one of QUOTE_MINIMAL (0), QUOTE_ALL (1),
    QUOTE_NONNUMERIC (2) or QUOTE_NONE (3).

Setting `quoting=3` (`QUOTE_NONE`) means that quote marks get no special treatment at all, which seems to be appropriate here.
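For reference, a minimal sketch of reading a pipe-separated file with quoting switched off. The sample data and the `read_psv` wrapper are hypothetical; the actual reading code in the repository may be structured differently and pass other arguments.

```python
import csv
import io

import pandas as pd

# Hypothetical pipe-separated sample; the first data row has an
# unbalanced quote mark at the start of a field.
SAMPLE = (
    "id|lat_flag|lon_flag|value\n"
    'A001|"N|E|1.5\n'
    "A002|N|E|2.5\n"
)


def read_psv(source, **kwargs):
    """Read pipe-separated data, treating quote marks as plain characters."""
    return pd.read_csv(source, sep="|", quoting=csv.QUOTE_NONE, **kwargs)


df = read_psv(io.StringIO(SAMPLE))
print(df)
```

With `csv.QUOTE_NONE` the quote mark stays in the field value (`"N` above) rather than opening a quoted string, so every line splits on `|` only.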