spring-projects / spring-batch

Spring Batch is a framework for writing batch applications using Java and Spring
http://projects.spring.io/spring-batch/
Apache License 2.0

API should provide access to original input line in flat file outside of error scenario [BATCH-719] #2857

Closed spring-projects-issues closed 16 years ago

spring-projects-issues commented 16 years ago

Lucas Ward opened BATCH-719 and commented

Currently, the API around flat file reading (FieldSet, FlatFileItemReader, etc.) provides no way to access the original line unless an error is thrown, in which case the original line can be accessed via a FlatFileParseException. However, there may be a need to log the original line when a failure occurs while writing the resulting record out to the database, or even when there was no error at all, as described in the following forum post:

http://forum.springframework.org/showthread.php?t=57031

Users can work around this now by concatenating the various fields in a FieldSet together, but this takes extra time that's unnecessary and, for delimited input, requires the user to put the delimiters back in between the fields (it would be fine for fixed-length input, though).
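
The workaround can be sketched roughly as follows. This is only an illustration: a plain String[] stands in for the values held by a FieldSet, and the class and method names (RejoinFieldsSketch, rejoinDelimited, rejoinFixedLength) are hypothetical, not part of Spring Batch. It also assumes that fixed-length values are kept untrimmed.

```java
class RejoinFieldsSketch {

    // Delimited input: the delimiter has to be put back between the fields.
    // Quoting and whitespace removed during tokenizing are not restored.
    static String rejoinDelimited(String[] values, String delimiter) {
        return String.join(delimiter, values);
    }

    // Fixed-length input: untrimmed fields concatenate back to the full line.
    static String rejoinFixedLength(String[] values) {
        return String.join("", values);
    }

    public static void main(String[] args) {
        String delimitedLine = "42 , Smith ,2008-07-01";
        // Values as they might look after tokenizing and trimming (assumption).
        String[] trimmed = {"42", "Smith", "2008-07-01"};
        String rebuilt = rejoinDelimited(trimmed, ",");
        System.out.println(rebuilt.equals(delimitedLine)); // false: the spacing is lost

        String fixedLine = "0042Smith     20080701";
        String[] untrimmed = {
            fixedLine.substring(0, 4), fixedLine.substring(4, 14), fixedLine.substring(14)
        };
        System.out.println(rejoinFixedLength(untrimmed).equals(fixedLine)); // true
    }
}
```

For delimited input the rebuilt string is only an approximation of the original line, which is one more reason to expose the line itself.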


Affects: 1.0.1

spring-projects-issues commented 16 years ago

Jiri Mikulasek commented

The problem is more general: you may need access to the original input (of whatever kind) when a failure occurs anywhere in the process (a writer may need access to the original input too). So I propose the following solution. It is quite complex, so let me know if it is acceptable and I will possibly provide a patch:

  1. A new interface for readers, say ReportingItemReader, declaring a method ItemContext getItemContext() that returns the context of the last item read.
  2. A new class ItemContext as a container for the last original input data and its start and end positions.
  3. A new interface for writers, say ReportingItemWriter, declaring the methods getItemContext() and setPredecessor(ReportingItemWriter).
  4. Change the behaviour as follows: the first reader in the reader chain (i.e. FlatFileItemReader) remembers the ItemContext (containing the original input line). Whenever a failure occurs, each reader asks its delegate for getItemContext(); that is how the original data reaches the place of failure. If a failure occurs in a writer, the writer asks its predecessor for the ItemContext.
  5. The step must implement passing the ItemContext from the last reader in the chain to the first writer in the chain.
  6. Some exceptions will need to hold the ItemContext in case of failure. The existing ones could be changed, or maybe it is better to create new ones?
  7. It is also an open question whether the new interfaces should be implemented in the existing ItemReader and ItemWriter implementations or separately.
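
The reader side of this proposal (points 1, 2 and 4) might look roughly like the sketch below. It is a self-contained illustration, not Spring Batch code: ItemContext and ReportingItemReader follow the names proposed above, but their fields and signatures, the LineReader and IntegerReader classes, and the firstFailingContext helper are all assumptions made for the example. The writer side and the step wiring are left out.

```java
import java.util.Iterator;
import java.util.List;

class ItemContextSketch {

    // Hypothetical container for the last original input and its positions (point 2).
    static final class ItemContext {
        final String originalInput;
        final long start;
        final long end;

        ItemContext(String originalInput, long start, long end) {
            this.originalInput = originalInput;
            this.start = start;
            this.end = end;
        }
    }

    // Hypothetical reader contract (point 1); read() returns null at end of input.
    interface ReportingItemReader<T> {
        T read();
        ItemContext getItemContext();
    }

    // First reader in the chain: remembers the raw line it just read (point 4).
    static final class LineReader implements ReportingItemReader<String> {
        private final Iterator<String> lines;
        private long position = 0;
        private ItemContext context;

        LineReader(List<String> lines) {
            this.lines = lines.iterator();
        }

        public String read() {
            if (!lines.hasNext()) {
                return null;
            }
            String line = lines.next();
            context = new ItemContext(line, position, position + line.length());
            position += line.length() + 1; // assumes a one-character line terminator
            return line;
        }

        public ItemContext getItemContext() {
            return context;
        }
    }

    // Downstream reader: converts the line and forwards context requests to its delegate.
    static final class IntegerReader implements ReportingItemReader<Integer> {
        private final ReportingItemReader<String> delegate;

        IntegerReader(ReportingItemReader<String> delegate) {
            this.delegate = delegate;
        }

        public Integer read() {
            String line = delegate.read();
            return line == null ? null : Integer.valueOf(line.trim());
        }

        public ItemContext getItemContext() {
            return delegate.getItemContext();
        }
    }

    // Reads everything; returns the context of the first line that fails to
    // convert, or null if every line converts.
    static ItemContext firstFailingContext(List<String> lines) {
        ReportingItemReader<Integer> reader = new IntegerReader(new LineReader(lines));
        try {
            while (reader.read() != null) {
                // items would be processed here
            }
        } catch (NumberFormatException e) {
            return reader.getItemContext();
        }
        return null;
    }

    public static void main(String[] args) {
        ItemContext failed = firstFailingContext(java.util.Arrays.asList("1", " 2 ", "x3"));
        System.out.println("failed on: " + failed.originalInput);
    }
}
```

The code that catches the failure never sees the line itself; it only asks the outermost reader, which delegates down the chain to the reader that remembered the line.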

spring-projects-issues commented 16 years ago

Robert Kasanicky commented

Given that there is now the ResourceLineReader, and that LineTokenizer as well as FieldSetMapper are specialized ItemProcessors, maybe we can just advocate using this combination if the user is interested in the input line? I guess no special API is needed.
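
One way to read this suggestion: if reading a line, tokenizing it, and mapping the fields are separate steps that the user wires together, the user's own code sees the line before it is tokenized and can carry it along. The sketch below illustrates that idea with plain java.util.function.Function values standing in for the tokenizer and the mapper. LinePipelineSketch, Parsed and keepLine are hypothetical names for this example and do not reflect the Spring Batch interfaces.

```java
import java.util.function.Function;

class LinePipelineSketch {

    // Hypothetical pair of the raw line and the item mapped from it.
    static final class Parsed<T> {
        final String line;
        final T item;

        Parsed(String line, T item) {
            this.line = line;
            this.item = item;
        }
    }

    // Composes tokenizing and mapping, keeping the raw line next to the result.
    static <T> Function<String, Parsed<T>> keepLine(
            Function<String, String[]> tokenizer, Function<String[], T> mapper) {
        return line -> new Parsed<>(line, mapper.apply(tokenizer.apply(line)));
    }

    public static void main(String[] args) {
        // Stand-ins for a tokenizer and a field mapper (assumptions for the example).
        Function<String, String[]> commaTokenizer = line -> line.split(",");
        Function<String[], Integer> quantityMapper = fields -> Integer.parseInt(fields[1].trim());

        Parsed<Integer> parsed = keepLine(commaTokenizer, quantityMapper).apply("widget, 3");
        System.out.println(parsed.line + " -> " + parsed.item);
    }
}
```

A writer that receives Parsed values can log parsed.line on failure without any change to the reading API.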

spring-projects-issues commented 16 years ago

Dave Syer commented

We could, but unless it's a heap more work than we thought, I remember liking the ideas we sketched in Philly about field set meta data.

spring-projects-issues commented 16 years ago

Robert Kasanicky commented

BATCH-863 introduces a new approach for string-to-item mapping that makes this issue obsolete.