spring-projects / spring-batch-extensions

Spring Batch Extensions

getColumnNames(Sheet sheet) → UnsupportedOperationException: Getting row by index not supported when streaming. #115

Closed adrian-pusty closed 1 month ago

adrian-pusty commented 1 year ago

Is it expected behaviour that using StreamingXlsxItemReader with BeanWrapperRowMapper causes UnsupportedOperationException?

StreamingSheet.getRow(int rowNumber) throws this exception when the reader tries to retrieve the column names.

I have a sample code snippet that reproduces this issue: https://github.com/adrian-pusty/spring-batch-extensions/commit/8c1f3baa74b455ee14cd20468b7299a898a571bc#diff-e8b601a5301d162b2c94ad1fed7ab3dcc638bd4d49eed4dff61db1074ae3a037

I think that adding a "header" field to BeanWrapperRowMapper would solve it: https://github.com/adrian-pusty/spring-batch-extensions/commit/a727c46799539cf1d74a310335a5c2d813c95933#diff-0ac74f7f100e3eff66583af2aaf6350a51c3e650c198b699d1732da14fea0e71

mdeinum commented 1 year ago

This is indeed expected behavior: a streaming sheet must be processed sequentially, so rows cannot be retrieved by index.

The base class for the item readers pre-configures the DefaultRowSetFactory that is in use. The DefaultRowSetFactory uses the RowNumberColumnNameExtractor to extract the column names. Creating a DefaultRowSetFactory with a StaticColumnNameExtractor should prevent this error, as that has a static list of column names.
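To illustrate why a static list of names avoids the exception, here is a minimal, self-contained sketch. Every type in it (`Sheet`, `StreamingOnlySheet`, `RowNumberNames`, `StaticNames`) is a hypothetical, simplified stand-in written for this example, not the actual class from the library. It only models the idea: an extractor that looks up a row by index fails on a streaming-only sheet, while one that holds fixed names never touches the sheet.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the library's classes.
final class StreamingColumnNamesDemo {

    public static void main(String[] args) {
        Sheet sheet = new StreamingOnlySheet(List.of(
                new String[] {"id", "name"},
                new String[] {"1", "Alice"}));

        // Fixed names: the sheet is never accessed, so nothing is thrown.
        ColumnNameExtractor fixed = new StaticNames("id", "name");
        System.out.println(Arrays.toString(fixed.getColumnNames(sheet)));

        // Lookup by row number: needs random access, which streaming cannot offer.
        ColumnNameExtractor byRow = new RowNumberNames(0);
        try {
            byRow.getColumnNames(sheet);
        } catch (UnsupportedOperationException ex) {
            System.out.println("failed: " + ex.getMessage());
        }
    }
}

// A sheet that can be iterated and, in principle, accessed by row index.
interface Sheet extends Iterable<String[]> {
    String[] getRow(int rowNumber);
}

interface ColumnNameExtractor {
    String[] getColumnNames(Sheet sheet);
}

// Models a streaming sheet: forward iteration only, no access by index.
final class StreamingOnlySheet implements Sheet {
    private final List<String[]> rows;

    StreamingOnlySheet(List<String[]> rows) {
        this.rows = rows;
    }

    @Override
    public String[] getRow(int rowNumber) {
        throw new UnsupportedOperationException(
                "Getting row by index not supported when streaming.");
    }

    @Override
    public Iterator<String[]> iterator() {
        return rows.iterator();
    }
}

// Reads the header from a given row index, so it depends on getRow(int).
final class RowNumberNames implements ColumnNameExtractor {
    private final int headerRow;

    RowNumberNames(int headerRow) {
        this.headerRow = headerRow;
    }

    @Override
    public String[] getColumnNames(Sheet sheet) {
        return sheet.getRow(headerRow);
    }
}

// Returns a fixed list of names and ignores the sheet entirely.
final class StaticNames implements ColumnNameExtractor {
    private final String[] names;

    StaticNames(String... names) {
        this.names = names.clone();
    }

    @Override
    public String[] getColumnNames(Sheet sheet) {
        return names.clone();
    }
}
```

In the real reader the equivalent step would be to supply a DefaultRowSetFactory that uses a StaticColumnNameExtractor, as described above; the exact wiring is not shown here and should be checked against the library's README.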

This might be a documentation issue: the sample in the README could be improved. I currently don't see a way to read the first row from the streaming sheet without affecting the rest of the read (the stream will already have advanced one row, skewing the skip-lines functionality).