uniVocity-parsers is a suite of extremely fast and reliable parsers for Java. It provides a consistent interface for handling different file formats, and a solid framework for the development of new parsers.
CSV Reader does not escape ASCII control characters #512
import java.io.*;
import java.util.*;
import com.univocity.parsers.csv.*;

public class Test {
    public static void main(String... args) {
        CsvParserSettings settings = new CsvParserSettings();
        settings.getFormat().setLineSeparator("\n");
        settings.getFormat().setQuote('\u0012');
        settings.getFormat().setQuoteEscape('\u0012');
        // RAISE_ERROR // STOP_AT_CLOSING_QUOTE
        UnescapedQuoteHandling u = UnescapedQuoteHandling.valueOf("RAISE_ERROR");
        settings.setUnescapedQuoteHandling(u);
        settings.setParseUnescapedQuotes(true);
        CsvParser parser = new CsvParser(settings);
        String line1 = "\u00127\u0012,\u0012EmbeddedDouble\u0012,\u0012field\u0012\u0012 t\u0012\u0012ext\u0012,\u0012field\u0012\u0012 t\u0012\u0012ext\u0012";
        System.out.println("Input line: " + line1);
        List<String[]> allLines = parser.parseAll(new StringReader(line1));
        int count = 0;
        for (String[] line : allLines) {
            System.out.println("Line " + ++count);
            for (String element : line) {
                System.out.println("\t" + element);
            }
            System.out.println();
        }
    }
}
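Because U+0012 is a non-printing control character, the "Input line:" output of the program above shows no visible quotes, which makes the log hard to read. A minimal sketch of a helper that renders control characters as visible escapes before printing follows. It uses only the JDK; the class name `ControlChars` and the method `visible` are hypothetical names introduced here, not part of uniVocity-parsers.

```java
public class ControlChars {
    /** Returns the text with ISO control characters replaced by visible Unicode-style escapes. */
    static String visible(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isISOControl(c)) {
                // Render the code point as a backslash, 'u' and four hex digits.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String... args) {
        // Shortened version of the input used in the report.
        String line = "\u00127\u0012,\u0012EmbeddedDouble\u0012";
        System.out.println("Input line: " + visible(line));
    }
}
```

Printing `visible(line1)` instead of `line1` in the test would keep the quote characters readable in the console output.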
Error:
/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/bin/java -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=51890:/Applications/IntelliJ IDEA CE.app/Contents/bin -Dfile.encoding=UTF-8 -classpath /Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/deploy.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/jaccess.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/jfxrt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/nashorn.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/javaws.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/jfxswt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/management-agent.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/plugin.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home/jre/lib/rt.jar:/Users/nandini.r/IdeaProjects/JarTest/out/production/JarTest:/Users/nandini.r/Desktop/univocity-parsers-2.8.4.jar Test
Input line: 7,EmbeddedDouble,field text,field text
Exception in thread "main" com.univocity.parsers.common.TextParsingException: com.univocity.parsers.common.TextParsingException - Unexpected character 't' following quoted value of CSV field. Expecting ','. Cannot parse CSV input.
Internal state when error was thrown: line=0, column=2, record=0, charIndex=31, content parsed=field
Parser Configuration: CsvParserSettings:
Auto configuration enabled=true
Auto-closing enabled=true
Autodetect column delimiter=false
Autodetect quotes=false
Column reordering enabled=true
Delimiters for detection=null
Empty value=null
Escape unquoted values=false
Header extraction enabled=null
Headers=null
Ignore leading whitespaces=true
Ignore leading whitespaces in quotes=false
Ignore trailing whitespaces=true
Ignore trailing whitespaces in quotes=false
Input buffer size=1048576
Input reading on separate thread=true
Keep escape sequences=false
Keep quotes=false
Length of content displayed on error=-1
Line separator detection enabled=false
Maximum number of characters per column=4096
Maximum number of columns=512
Normalize escaped line separators=true
Null value=null
Number of records to read=all
Processor=none
Restricting data in exceptions=false
RowProcessor error handler=null
Selected fields=none
Skip bits as whitespace=true
Skip empty lines=true
Unescaped quote handling=RAISE_ERROR
Format configuration:
CsvFormat:
Comment character=#
Field delimiter=,
Line separator (normalized)=\n
Line separator sequence=\n
Quote character=
Quote escape character=
Quote escape escape character=null
Internal state when error was thrown: line=0, column=2, record=0, charIndex=31, content parsed=field
at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:395)
at com.univocity.parsers.common.AbstractParser.parseNext(AbstractParser.java:616)
at com.univocity.parsers.common.AbstractParser.internalParseAll(AbstractParser.java:545)
at com.univocity.parsers.common.AbstractParser.parseAll(AbstractParser.java:538)
at com.univocity.parsers.common.AbstractParser.parseAll(AbstractParser.java:525)
at Test.main(Test.java:33)
Caused by: com.univocity.parsers.common.TextParsingException: Unexpected character 't' following quoted value of CSV field. Expecting ','. Cannot parse CSV input.
Internal state when error was thrown: line=0, column=2, record=0, charIndex=31, content parsed=field
at com.univocity.parsers.csv.CsvParser.parseQuotedValue(CsvParser.java:458)
at com.univocity.parsers.csv.CsvParser.parseSingleDelimiterRecord(CsvParser.java:176)
at com.univocity.parsers.csv.CsvParser.parseRecord(CsvParser.java:108)
at com.univocity.parsers.common.AbstractParser.parseNext(AbstractParser.java:574)
... 4 more
Process finished with exit code 1
The issue is also reproducible with the latest jar (2.9.1): https://mvnrepository.com/artifact/com.univocity/univocity-parsers/2.9.1
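For reference, here is a rough sketch of the field values one might expect from the input line, assuming conventional CSV semantics in which a doubled quote inside a quoted field stands for one literal quote character. This is a plain-JDK illustration, not uniVocity's implementation; `ExpectedFields` and `split` are hypothetical names, and the sketch ignores line separators, comments and whitespace trimming.

```java
import java.util.ArrayList;
import java.util.List;

public class ExpectedFields {
    /** Splits one record; a doubled quote inside a quoted field yields one literal quote. */
    static List<String> split(String record, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c != quote) {
                    field.append(c);
                } else if (i + 1 < record.length() && record.charAt(i + 1) == quote) {
                    field.append(quote); // doubled quote: keep one literal quote
                    i++;                 // skip the second quote of the pair
                } else {
                    inQuotes = false;    // single quote: end of the quoted section
                }
            } else if (c == quote) {
                inQuotes = true;
            } else if (c == delimiter) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String... args) {
        // Same input as line1 in the test above.
        String line1 = "\u00127\u0012,\u0012EmbeddedDouble\u0012,\u0012field\u0012\u0012 t\u0012\u0012ext\u0012,\u0012field\u0012\u0012 t\u0012\u0012ext\u0012";
        for (String f : split(line1, ',', '\u0012')) {
            // Show the control character as <DC2> so it is visible in the console.
            System.out.println(f.replace("\u0012", "<DC2>"));
        }
    }
}
```

Under that assumption the input yields four fields, the last two each containing two embedded U+0012 characters, rather than a parsing error at the first doubled quote.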