Consider a CSV with this line:
This file is generated from an export of Amazon transactions.
I've attempted to parse the file with this Reader:
When I parse with that Reader, I see the line breaks at each `""`:
These double quotes (`""`) are normally interpreted as an escape that produces a single quote `"` inside a field. This CSV opens fine in Excel/Numbers, but when processed by the Importer, it detects the `""` after `4-1/8` and `9-1/2` as field separators.
I've attempted to create a regex-based parser to capture this second field, but I've been unsuccessful in capturing the field without including the quotes in the `value` named group. For example, I tried this expression: `(?<value>((?:").*?(?:"))|(.*?))(?:[,]|\Z)(?<rest>(.*|\Z))`. It still has the issue, and it also includes the quote characters at the start and end of `value`. When I tried to specify `(?<value>)` twice, the parsing failed to produce any output at all, so I'm guessing that's invalid syntax.
This regex effectively captures the double quotes: `(?<value>((?:")([^"]|"")*?(?:"))|(.*?))(?:[,]|\Z)(?<rest>(.*|\Z))`. But it has two problems: the surrounding quotes are still included at the start and end of `value`, and the escaped quotes still appear doubled (`""`) in the result.
Is there any way to configure the Importer to read a CSV with a line like the one above, similar to how Numbers or Excel do?
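
For reference, here is a minimal Python sketch of the behaviour being asked for. It is not the Importer's configuration, and several parts are assumptions: the sample line is made up (it only mimics a description containing `4-1/8""` and `9-1/2""`), Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and the `sep` group and both function names are hypothetical. The regex keeps the surrounding quotes out of `value` by matching them outside the group and using lookarounds, which the Importer's regex engine may or may not support. The `""` pairs are collapsed in a separate step after matching, since a match alone cannot rewrite the captured text.

```python
import csv
import io
import re

# Hypothetical sample line; the real export line is not reproduced here.
SAMPLE = '2023-01-05,"Pipe, 4-1/8"" x 9-1/2"" steel",12.99'


def parse_with_csv(line):
    # Reference behaviour: the csv module's default dialect reads a doubled
    # quote inside a quoted field as one literal quote character.
    return next(csv.reader(io.StringIO(line)))


# One field per match, mirroring the value/rest split from the question.
FIELD = re.compile(
    r'"?'                         # optional opening quote, kept out of `value`
    r'(?P<value>'
    r'(?<=")(?:[^"]|"")*(?=")'    # quoted body: non-quote chars or "" pairs
    r'|(?<!")[^,"]*'              # or an unquoted field
    r')'
    r'"?'                         # optional closing quote, also kept out
    r'(?P<sep>,|\Z)'              # field ends at a comma or end of input
    r'(?P<rest>.*)'
)


def parse_with_regex(line):
    fields, rest = [], line
    while True:
        m = FIELD.match(rest)
        if m is None:
            raise ValueError(f"cannot parse: {rest!r}")
        # Collapse the escaped quotes after matching.
        fields.append(m.group("value").replace('""', '"'))
        if m.group("sep") == "":
            return fields
        rest = m.group("rest")


print(parse_with_csv(SAMPLE))
print(parse_with_regex(SAMPLE))
# Both print: ['2023-01-05', 'Pipe, 4-1/8" x 9-1/2" steel', '12.99']
```

If the Importer cannot run a replacement step after the match, the regex half of this sketch would still leave `""` in the value, so only the quote-stripping part would carry over.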