Closed dimitri closed 10 years ago
So the bug as I understand it is the need to add an escape for the escape character (in some circumstances).
By default this should be "\\" (i.e. two backslashes in a row). Any suggestion for the name? escape-escape sounds awful, but it is also accurate.
Well, in the case of that specific input file, https://github.com/HTTPArchive/httparchive/issues/25 hints that the backslash is not there for any real reason (the string was truncated).
So I'm not sure we should reason in terms of escaping the escape character, rather than just allowing for a general escape character: backslash could be used to escape whatever follows. In the case of the faulty input we have, what follows is another backslash; after that we have a free quote, so the quoted section ends. What do you think?
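To make the "general escape character" reading concrete, here is a minimal sketch in Python (not cl-csv code; the function name `read_quoted` and its parameters are made up for illustration). It assumes the escape character always consumes exactly one following character, so a doubled backslash is read as a single literal backslash and the next quote is free to close the section.

```python
def read_quoted(text, start=0, quote='"', escape="\\"):
    """Read a quoted section that begins at text[start].

    Hypothetical helper: the escape character makes the next character
    literal, whatever it is. Returns (content, index just past the
    closing quote).
    """
    if text[start] != quote:
        raise ValueError("expected an opening quote")
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            # Escape: keep the following character as-is and skip both.
            out.append(text[i + 1])
            i += 2
        elif ch == quote:
            # A quote that was not consumed by an escape closes the section.
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated quoted section")


# The input is: "abc\\",next  (two backslashes, then a quote)
content, end = read_quoted(r'"abc\\",next')
print(repr(content), end)
```

Under this rule the field content is abc followed by one backslash, and parsing resumes at the comma. A parser that only recognizes backslash-quote as an escape would instead treat the second backslash plus the quote as an escaped quote and keep reading past the intended end of the field.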
I read that as:
We would like a new parser escaping mode that, rather than replacing all quote-escapes with a quote, replaces {escape-character}{thing} with {thing}, regardless of what {thing} is.
I have to imagine that this is partly where the "" escape sequence arose.
I guess a new parameter :escaping-mode that defaults to :quote and accepts one of (:quote :following-char).
Please try this out and let me know if it matches what you had in mind / solves your parsing error.
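For readers following along, the difference between the two proposed modes can be sketched roughly as below. This is Python, not the cl-csv implementation; the `unescape` function and its `mode` strings are hypothetical stand-ins that merely mirror the :quote and :following-char values named above, and the quote-escape is assumed to be the escape character followed by the quote.

```python
def unescape(field, mode="quote", quote='"', escape="\\"):
    """Undo escaping inside an already-extracted field (illustrative only).

    mode="quote":          only the quote-escape (escape + quote) becomes a quote.
    mode="following-char": escape + thing becomes thing, whatever thing is.
    """
    if mode == "quote":
        return field.replace(escape + quote, quote)
    if mode == "following-char":
        out = []
        i = 0
        while i < len(field):
            if field[i] == escape and i + 1 < len(field):
                # Drop the escape character, keep what follows it.
                out.append(field[i + 1])
                i += 2
            else:
                out.append(field[i])
                i += 1
        return "".join(out)
    raise ValueError("unknown mode: " + repr(mode))


print(unescape(r'say \"hi\"', mode="quote"))
print(unescape(r'a\\b', mode="quote"))
print(unescape(r'a\\b', mode="following-char"))
```

In the quote mode, a doubled backslash passes through untouched because it is not a quote-escape; in the following-char mode it collapses to a single backslash.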
Hi,
As reported in https://github.com/dimitri/pgloader/issues/80 towards the end, cl-csv fails to parse simple input when it contains unexpected escaping characters (not the whole escaping string) in the middle of a text field.
Here's a reduced test case:
And I can reproduce the failure with the following code: