uniVocity-parsers is a suite of extremely fast and reliable parsers for Java. It provides a consistent interface for handling different file formats, and a solid framework for the development of new parsers.
Using it, I'm getting a java.lang.ArrayIndexOutOfBoundsException: -1 when fields are missing from the input data. My CsvParserSettings looks something like this:
// parent class info elided...
// List<String> errors = []
protected final CsvParserSettings configureCsvSettings(CsvParserSettings csvSettings,
                                                       BeanListProcessor beanProcessor) {
    csvSettings.format.lineSeparator = System.lineSeparator()
    csvSettings.headerExtractionEnabled = true
    csvSettings.processor = beanProcessor
    csvSettings.processorErrorHandler = new MyRowProcessorErrorHandler(errors)
    return csvSettings
}
The BeanListProcessor has a trivial override of the beanProcessed() method, but I've commented that out while testing and confirmed it makes no difference in the outcome. Otherwise, I'm instantiating it with the BeanListProcessor(Class<T>, int) constructor.
I have input data that is missing one or more headers (whole columns). All of the properties in my target bean are annotated with @Parsed or another meta-annotation that uses it. And, as indicated previously, I'm supplying the target class to the processor so headers are being derived from its @Headers annotation (or at least, that's my understanding so far). That annotation has the sequence property defined with all expected header values.
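For context, the bean setup described above looks roughly like the following. This is a minimal sketch, not the actual class: the field and header names are hypothetical, and @Validate is applied directly rather than through a meta-annotation, for brevity.

```java
import java.math.BigDecimal;

import com.univocity.parsers.annotations.Headers;
import com.univocity.parsers.annotations.Parsed;
import com.univocity.parsers.annotations.Validate;

// Hypothetical target bean; real names are elided in the question.
@Headers(sequence = {"name", "amount"}, extract = true)
public class MyRecord {

    // Adding @Validate (even with nullable/blank cells allowed)
    // is what triggers the failure described below.
    @Parsed(field = "name")
    @Validate(nullable = true, allowBlanks = true)
    private String name;

    // Parsed to BigDecimal via univocity's built-in conversion.
    @Parsed(field = "amount")
    private BigDecimal amount;
}
```

With this shape, an input file that omits the "amount" column entirely leaves that field with index -1 in Context.extractedFieldIndexes(), which is what feeds the exception discussed below.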
All has been fine for months. Then, recently, I was tasked with updating the feature that uses this...
The moment I put @Validate into one of my meta-annotations, I have problems, even with @Validate(nullable = true, allowBlanks = true), which is technically what I want because every cell is allowed to be empty/null. Regardless, tests start failing left and right. On that point, I've tried different meta-annotations: one parses to BigDecimal via a conversion, so I thought maybe I couldn't mix @Validate and @Convert, but I then tried it on a plain String property with the same result.
So I've been down the rabbit hole of trying to debug this for a couple of days now and this is what I've found:
When the parser reaches DefaultConversionProcessor.applyConversions(String[], Context), the call conversion.applyConversion(index, row[index], convertedFlags) throws the exception above because index is -1 for every missing field. Those indices are discovered via initializeConversions(String[], Context), which calls Context.extractedFieldIndexes(), where the -1 values are legitimate.
This process simply shouldn't do that. I'm not sure how or where this was supposed to be handled, but blindly passing -1 as an array index is just bad practice; these values should be removed or otherwise handled before conversions.applyConversions() is called.
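Stripped of the library, the guard I'm suggesting can be sketched generically. The names below are hypothetical, not univocity's internals: drop the -1 "missing column" markers from the extracted-index array before anything subscripts the row with them.

```java
import java.util.Arrays;

public class IndexGuard {

    // Hypothetical helper: remove -1 entries (the marker for a field
    // with no matching column) before they are used as array indexes.
    static int[] dropMissing(int[] extractedFieldIndexes) {
        return Arrays.stream(extractedFieldIndexes)
                     .filter(i -> i >= 0)
                     .toArray();
    }

    public static void main(String[] args) {
        int[] indexes = {0, 2, -1};     // -1: one column is absent from the input
        String[] row = {"a", "b", "c"};

        // row[-1] would throw ArrayIndexOutOfBoundsException: -1,
        // which is exactly the failure shown in the stack trace.
        for (int i : dropMissing(indexes)) {
            System.out.println(row[i]); // prints a, then c
        }
    }
}
```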
I also experimented with columnReorderingEnabled = false, but that caused another side effect that I decided wasn't worth the time to investigate.
This is the part of the stacktrace that pertains to the library:
at com.univocity.parsers.common.Internal.throwDataProcessingException(Internal.java:62)
at com.univocity.parsers.common.Internal.process(Internal.java:57)
at com.univocity.parsers.common.AbstractParser.rowProcessed(AbstractParser.java:716)
at com.univocity.parsers.common.AbstractParser.parse(AbstractParser.java:152)
at com.univocity.parsers.common.AbstractParser.parse(AbstractParser.java:759)
// my call to csvParser.parse(InputStream) and prior calls are here...
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at com.univocity.parsers.common.DefaultConversionProcessor.populateReverseFieldIndexes(DefaultConversionProcessor.java:151)
at com.univocity.parsers.common.DefaultConversionProcessor.validateAllValues(DefaultConversionProcessor.java:164)
at com.univocity.parsers.common.DefaultConversionProcessor.applyConversions(DefaultConversionProcessor.java:132)
at com.univocity.parsers.common.processor.core.BeanConversionProcessor.createBean(BeanConversionProcessor.java:663)
at com.univocity.parsers.common.processor.core.AbstractBeanProcessor.rowProcessed(AbstractBeanProcessor.java:54)
at com.univocity.parsers.common.Internal.process(Internal.java:30)
... 7 more