spring-projects / spring-batch

Spring Batch is a framework for writing batch applications using Java and Spring
http://projects.spring.io/spring-batch/
Apache License 2.0

Enhancements to make it easier to opt out of the scan for failed items after an error in the item writer [BATCH-1511] #2071

Open spring-projects-issues opened 14 years ago

spring-projects-issues commented 14 years ago

Sanjeevkumar opened BATCH-1511 and commented

I had a requirement to read a file, process it, and then post each processed record to a table. At the same time I had to delete data from another table. However, the requirement was that if the deletion fails, the insertion should still succeed. Hence I decided to use a REQUIRES_NEW transaction attribute for each of these method calls.

We also had a requirement that the job should never be terminated, no matter what exception is thrown. At the same time, all skipped records have to be logged to custom audit tables.

However, we ran into a problem when calling the insert and delete methods from the ItemWriter's write() method. When the delete failed, the insert had already succeeded. But because we had skip configured, any exception thrown by the deletion triggered a retry, which called the insert again and resulted in duplicate transactions. The reply from the Spring Batch team was that a skip is a retry with limit = 0, and that this is expected behavior.

However, there may be cases where I want the job to continue regardless of the exception thrown until the skip limit is reached, but I do not want a retry to run for any record that gets skipped. Since I am auditing all skipped records, I do not want the job to waste time retrying each one. Could we have a new feature in skip where I can skip a record without any retry happening at all?

To conclude: I want to skip every record that fails, and I will audit it. But I do not want the job to waste time retrying every record that fails.


Reference URL: http://forum.springsource.org/showthread.php?t=84486


spring-projects-issues commented 14 years ago

Dave Syer commented

I think there is still a misunderstanding here. A skip is still (and always probably will be) a recovery option for an exhausted retry. What you find difficult, I think (but you must confirm), is that an error in an ItemWriter leads to the items in the chunk being scanned for failures, and the item writer is called again with each of the items in the failed chunk in turn? So it's the scanning for failures that is making life difficult, nothing to do with retry at all?
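To make the scanning behaviour described above concrete, here is a minimal plain-Java sketch. It does not use any Spring Batch classes: `ChunkScanSketch`, `writeWithScan`, and the in-memory `inserted` list are hypothetical names made up for illustration, and the list stands in for a table whose inserts commit in their own transaction. It only models the idea that a failed chunk write is followed by one writer call per item.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ChunkScanSketch {

    /**
     * Writes the whole chunk once. If that fails, "scans" the chunk by calling
     * the writer again with each item on its own, and returns the items whose
     * individual write failed (the skipped items).
     */
    static <T> List<T> writeWithScan(List<T> chunk, Consumer<List<T>> writer) {
        List<T> skipped = new ArrayList<>();
        try {
            writer.accept(chunk);
            return skipped; // no error, nothing to scan
        } catch (RuntimeException e) {
            // fall through to the item-by-item scan below
        }
        for (T item : chunk) {
            try {
                writer.accept(List.of(item));
            } catch (RuntimeException e) {
                skipped.add(item);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Stand-in for a table written in its own transaction: never rolled back.
        List<String> inserted = new ArrayList<>();
        Consumer<List<String>> writer = items -> {
            for (String item : items) {
                inserted.add(item); // the "insert" always succeeds
                if (item.equals("bad")) {
                    throw new IllegalStateException("delete failed for " + item);
                }
            }
        };
        List<String> skipped = writeWithScan(List.of("a", "bad", "c"), writer);
        System.out.println("inserted=" + inserted + " skipped=" + skipped);
    }
}
```

Under these assumptions `main` prints `inserted=[a, bad, a, bad, c] skipped=[bad]`: the item is skipped as intended, but every side effect that was not rolled back before the failure is repeated during the scan.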

spring-projects-issues commented 14 years ago

Sanjeevkumar commented

Hi Dave, I am not fully aware of the inner workings of the item writer, but to explain my situation: I am trying to skip records that fail for any reason, irrespective of the exception thrown. It is possible that the item writer is called again with each of the items in the failed chunk in turn. Under no circumstances do I want the item writer to be called again if any item fails. Is there already a way to control this?

Thanks,

Regards Sanjeev Tarnal

spring-projects-issues commented 14 years ago

Dave Syer commented

You can set the commit interval to 1 (might not be optimal for performance), or you can put the code that fails in an item processor (usually the best option if you are processing item-by-item anyway). Or you can redesign your writer to make it idempotent (so it would have to check before doing anything that it wasn't duplicating any effort). Or you can catch and recover from the exception inside the writer. Those are your options currently.

If this issue is going to go anywhere you have to try and understand what it means. The best the framework could do would be to skip the whole chunk if there was an error in the writer. We have resisted this so far because in most (but not all) cases failure in a writer is attributable to a single item and the others can be processed cleanly.
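As an illustration of the "catch and recover from the exception inside the writer" option listed above, here is a minimal plain-Java sketch. The `Action` interface, `RecoveringWriter`, and the `audited` list are hypothetical stand-ins (for the DAO calls and the custom audit table), not Spring Batch types. The idea is that a failed delete never propagates out of `write`, so the framework has no error to react to.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoveringWriterSketch {

    /** Hypothetical stand-in for a DAO call that may fail. */
    interface Action<T> {
        void apply(T item) throws Exception;
    }

    static class RecoveringWriter<T> {
        private final Action<T> insert;
        private final Action<T> delete;
        private final List<T> audited = new ArrayList<>(); // stand-in for an audit table

        RecoveringWriter(Action<T> insert, Action<T> delete) {
            this.insert = insert;
            this.delete = delete;
        }

        public void write(List<? extends T> items) throws Exception {
            for (T item : items) {
                insert.apply(item); // an insert failure still propagates
                try {
                    delete.apply(item);
                } catch (Exception e) {
                    // Recover locally: record the item and carry on.
                    audited.add(item);
                }
            }
        }

        List<T> audited() {
            return audited;
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> inserted = new ArrayList<>();
        RecoveringWriter<String> writer = new RecoveringWriter<>(
                inserted::add,
                item -> {
                    if (item.equals("bad")) {
                        throw new IllegalStateException("delete failed");
                    }
                });
        writer.write(List.of("a", "bad", "c"));
        System.out.println("inserted=" + inserted + " audited=" + writer.audited());
    }
}
```

With these inputs each item is inserted exactly once and only the failing delete is audited. The trade-off is that such an item is no longer counted as a skip by the framework, since no exception reaches it.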

spring-projects-issues commented 14 years ago

Sanjeevkumar commented

Hi Dave, the commit interval is already set to 1. Here are some excerpts from my job XML and the item writer.

    <chunk reader="proSamDisBannerFileItemReader" processor="proSamDisBannerProcessor" writer="proSamDisBannerWriter" commit-interval="1" skip-limit="1000">
        <skippable-exception-classes>java.lang.Exception</skippable-exception-classes>
    </chunk>

    public void write(List<? extends ProSamDisbursement> proSamDisbursements)
            throws Exception {
        for (ProSamDisbursement proSamDisbursement : proSamDisbursements) {
            StudentReceivableBean studentRecievableBean = arAdapter
                    .convertToAR(proSamDisbursement);
            // Insert runs in its own (requires-new) transaction.
            studentRecievableDao.addStdnAcctTransSource(studentRecievableBean);
            // Delete also runs in its own transaction; a failure here propagates out of write().
            memoExtDao.deleteMemo(proSamDisbursement);
        }
    }

Both studentRecievableDao.addStdnAcctTransSource(studentRecievableBean) and memoExtDao.deleteMemo(proSamDisbursement) run in new transactions, so a failure in the delete should not cause the insert to roll back. But when memoExtDao.deleteMemo(proSamDisbursement) fails, addStdnAcctTransSource is executed again, even though the commit interval is set to 1, and this results in duplicate entries.

This is where I am getting confused. Is it the skip that does the reprocessing, or is it the scan of the chunk for failed items that calls the item writer again?

In any case, from what you say, this is something I should handle myself, right?

Thanks in advance.

Regards Sanjeev

spring-projects-issues commented 14 years ago

Dave Syer commented

I just checked and the commit-interval=1 optimization is missing (I thought it was there, so it must have been at some point). It could certainly be added. In nearly all cases such as yours (where the writer doesn't have any batch optimizations) it would be better to use an ItemProcessor and a standard (larger) commit-interval.
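One plausible reading of such a commit-interval=1 optimization, sketched in plain Java: when the failed chunk holds a single item, the culprit is already known, so the item can be skipped directly without calling the writer a second time. This is an assumption about what the optimization would do, not the framework's actual code, and `SingleItemChunkSketch` and `writeWithScan` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SingleItemChunkSketch {

    /**
     * Writes the chunk once. On failure, a one-item chunk is skipped directly;
     * larger chunks are scanned item by item to find the failing items.
     */
    static <T> List<T> writeWithScan(List<T> chunk, Consumer<List<T>> writer) {
        List<T> skipped = new ArrayList<>();
        try {
            writer.accept(chunk);
            return skipped;
        } catch (RuntimeException e) {
            if (chunk.size() == 1) {
                // Assumed optimization: the only item must be the failing one.
                skipped.add(chunk.get(0));
                return skipped;
            }
        }
        for (T item : chunk) {
            try {
                writer.accept(List.of(item));
            } catch (RuntimeException e) {
                skipped.add(item);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        int[] calls = {0}; // counts how often the writer is invoked
        Consumer<List<String>> failingWriter = items -> {
            calls[0]++;
            throw new IllegalStateException("delete failed");
        };
        List<String> skipped = writeWithScan(List.of("only"), failingWriter);
        System.out.println("writer calls=" + calls[0] + " skipped=" + skipped);
    }
}
```

In this sketch the writer is invoked once for the single-item chunk, so side effects committed in separate transactions before the failure are not repeated.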