uniVocity-parsers is a suite of extremely fast and reliable parsers for Java. It provides a consistent interface for handling different file formats, and a solid framework for the development of new parsers.
implicit limitation on max column name length? #438
Hi, we recently noticed a bug in Spark 3.x, which depends on the latest version of univocity-parsers. We found that univocity has an implicit limit on column name length (1024 characters by default). If you set a header longer than that limit, you get an NPE (see the Spark PR for the detailed analysis).
In the univocity code base, you can add the following unit test to reproduce the NPE mentioned in the Spark PR:
+    @Test
+    public void testSuperLongHeader() {
+        CsvWriterSettings settings = new CsvWriterSettings();
+        settings.getFormat().setLineSeparator("\n");
+        StringBuffer sb = new StringBuffer();
+        for (int i = 0; i < 1025; i++) {
+            sb.append("a");
+        }
+        settings.setHeaders(sb.toString());
+        StringWriter out = new StringWriter();
+
+        CsvWriter writer = new CsvWriter(out, settings);
+        writer.writeHeaders();
+        List<String> row = new ArrayList<String>();
+        row.add("value 1");
+        row.add("value 2");
+        writer.writeRow(row);
+        writer.close();
+
+        assertEquals(out.toString(), "value 1,value 2\n");
+    }
NPE:
java.lang.NullPointerException: null
    at com.univocity.parsers.common.AbstractWriter.submitRow(AbstractWriter.java:349)
    at com.univocity.parsers.common.AbstractWriter.writeHeaders(AbstractWriter.java:444)
    at com.univocity.parsers.common.AbstractWriter.writeHeaders(AbstractWriter.java:410)
    at com.univocity.parsers.csv.CsvWriterTest.testSuperLongHeader(CsvWriterTest.java:638)
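To pin down exactly where the cutoff sits (we believe a 1024-character header works and a 1025-character one fails), a small probe along these lines might help. This is only a sketch: `HeaderLengthProbe`, `headerOfLength` and `firstFailingLength` are hypothetical names of ours, not part of univocity, and the write attempt is passed in as a callback so the helper itself depends only on the JDK.

```java
import java.util.Arrays;
import java.util.function.Consumer;

public class HeaderLengthProbe {

    // Builds a header name of exactly `length` characters, all 'a',
    // matching the string the unit test above builds in its loop.
    public static String headerOfLength(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    // Tries every header length in [from, to] and returns the first one for
    // which the supplied write attempt throws a NullPointerException,
    // or -1 if no length in the range fails.
    // `writeAttempt` is a hypothetical callback: the caller decides how the
    // header is actually written (for example, through a CsvWriter).
    public static int firstFailingLength(int from, int to, Consumer<String> writeAttempt) {
        for (int length = from; length <= to; length++) {
            try {
                writeAttempt.accept(headerOfLength(length));
            } catch (NullPointerException e) {
                return length;
            }
        }
        return -1;
    }
}
```

In a univocity test, the callback would wrap the same steps as the test above: create a `CsvWriterSettings`, call `setHeaders(header)`, construct a `CsvWriter` over a `StringWriter`, and call `writeHeaders()`.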
Our question is: was this limitation added intentionally, or is it actually a bug?
cc @HyukjinKwon @viirya