uniVocity-parsers is a suite of extremely fast and reliable parsers for Java. It provides a consistent interface for handling different file formats, and a solid framework for the development of new parsers.
Values between "a quoted and escaped quote" and "a quoted value, that starts with the delimiter" are skipped #508
Version of Univocity
Problem
The input is a valid (RFC 4180) CSV file that contains "a quoted and escaped quote" ("""") and "a quoted value, that starts with the delimiter" (e.g. ";abc").
When selectFields is used together with NormalizeLineEndingsWithinQuotes=false, the values between "a quoted and escaped quote" and "a quoted value, that starts with the delimiter" are skipped.
The problem does not occur with NormalizeLineEndingsWithinQuotes=true.
The problem appears to be caused by AbstractCharInputReader.skipQuotedString(char quote, char escape, char stop1, char stop2), which doesn't seem to properly handle "quoted and escaped quotes".
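To illustrate what "properly handling" a quoted and escaped quote means when a field is skipped rather than parsed, here is a minimal sketch. It is not the library's code: the class QuotedFieldSkipper and both methods are hypothetical, the input line used in the note below is an illustrative guess rather than the data from this report, and the sketch assumes ';' as the delimiter and '"' as both quote and escape character.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of uniVocity-parsers. */
final class QuotedFieldSkipper {

    /**
     * Returns the index just past the closing quote of the quoted field
     * that opens at {@code start}. A doubled quote inside the field is an
     * escaped quote (RFC 4180) and must not end the field.
     */
    static int skipQuoted(String line, int start, char quote) {
        int i = start + 1; // step over the opening quote
        while (i < line.length()) {
            if (line.charAt(i) == quote) {
                if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    i += 2; // escaped quote: consume both characters
                    continue;
                }
                return i + 1; // closing quote
            }
            i++;
        }
        return line.length(); // unterminated field: consume the rest
    }

    /** Returns the start offset of every field in {@code line}. */
    static List<Integer> fieldStarts(String line, char delimiter, char quote) {
        List<Integer> starts = new ArrayList<>();
        int i = 0;
        while (true) {
            starts.add(i);
            // A quote at the start of a field opens a quoted field.
            if (i < line.length() && line.charAt(i) == quote) {
                i = skipQuoted(line, i, quote);
            }
            // Advance to the delimiter that ends the current field.
            while (i < line.length() && line.charAt(i) != delimiter) {
                i++;
            }
            if (i >= line.length()) {
                return starts;
            }
            i++; // step over the delimiter
        }
    }
}
```

For the illustrative line a;"""";";abc";d, skipQuoted(line, 2, '"') returns 6 and fieldStarts(line, ';', '"') returns [0, 2, 7, 14]. A skip routine that treated the second quote of """" as the end of the field would lose track of the field boundaries that follow, which is consistent with the skipped values described above.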
CSV-Data
A line that contains a single quote character as a value (quoted and escaped with a quote), where the next quoted value starts with the delimiter.
e.g.
Example
Expected output
Actual Output