thom2batki opened 1 year ago
I am using a LOT of parallel jobs and hit the same problem when migrating to the latest Easy Batch release. I prefer using poison records to get finer control over the entire processing flow.
Here is how I solved the problem:
1. Downloaded the previous Easy Batch version's source code.
2. Rebuilt the following five classes from the original source into a custom package:
   - org.mypackage.PoisonRecordFilter
   - org.mypackage.PoisonRecordBroadcaster
   - org.mypackage.PoisonRecord
   - org.mypackage.PoisonBlockingQueueRecordReader
   - org.mypackage.PoisonBlockingQueueRecordWriter
3. Reintroduced the previous poison-record-based functionality in my package using the above classes (see the sketch below).
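For reference, the rebuilt classes keep the shape of the old v5 ones. A rough sketch of the two central pieces (package, names, and constructor details here are only illustrative):

```java
import java.time.LocalDateTime;

import org.jeasy.batch.core.filter.RecordFilter;
import org.jeasy.batch.core.record.GenericRecord;
import org.jeasy.batch.core.record.Header;
import org.jeasy.batch.core.record.Record;

// PoisonRecord.java -- marker record broadcast into the queue
// when the producing job is done (modeled on the old v5 class)
public class PoisonRecord extends GenericRecord<Object> {
    public PoisonRecord() {
        super(new Header(0L, "poison", LocalDateTime.now()), new Object());
    }
}

// PoisonRecordFilter.java -- discards poison records so they
// never reach the real processors
class PoisonRecordFilter implements RecordFilter<Object> {
    @Override
    public Record<Object> processRecord(Record<Object> record) {
        return record instanceof PoisonRecord ? null : record; // null = filtered
    }
}
```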
Hope this helps.
When using BlockingQueues to process incoming records, the writer job never writes if the batch size is not reached.
I have code along the following lines:
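Here is a minimal sketch of the setup (job names and payloads are illustrative, assuming the v6 JobBuilder API):

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import org.jeasy.batch.core.job.Job;
import org.jeasy.batch.core.job.JobBuilder;
import org.jeasy.batch.core.job.JobExecutor;
import org.jeasy.batch.core.reader.BlockingQueueRecordReader;
import org.jeasy.batch.core.reader.IterableRecordReader;
import org.jeasy.batch.core.record.Record;
import org.jeasy.batch.core.writer.BlockingQueueRecordWriter;
import org.jeasy.batch.core.writer.StandardOutputRecordWriter;

public class QueueDemo {

    public static void main(String[] args) {
        BlockingQueue<Record<String>> queue = new LinkedBlockingQueue<>();

        // Producer: reads 4 records and forwards them to the queue
        Job readerJob = new JobBuilder<String, String>()
                .named("reader-job")
                .reader(new IterableRecordReader<>(Arrays.asList("r1", "r2", "r3", "r4")))
                .writer(new BlockingQueueRecordWriter<>(queue))
                .batchSize(5)
                .build();

        // Consumer: should write everything it receives to standard output
        Job writerJob = new JobBuilder<String, String>()
                .named("writer-job")
                .reader(new BlockingQueueRecordReader<>(queue))
                .writer(new StandardOutputRecordWriter<>())
                .batchSize(5)
                .build();

        JobExecutor jobExecutor = new JobExecutor();
        jobExecutor.submitAll(readerJob, writerJob);
        jobExecutor.shutdown();
    }
}
```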
As you can see, the batch size is set to 5 for all jobs while only 4 records need to be processed. In this case, all records are read and processed, but never written.
There are basically three different behaviours, depending on the batch size and the number of records:
It seems that the writer job closes the queue too early and stops waiting for incoming records. In earlier versions of easy-batch, PoisonRecords were used to synchronize between jobs. Is there anything comparable in current versions? I don't want to just define a _QUEUETIMEOUT, which does not feel like a clever solution to this problem.
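For example, something like this sentinel-aware reader could emulate the old poison records (my own rough sketch, not an existing Easy Batch class):

```java
import java.util.concurrent.BlockingQueue;

import org.jeasy.batch.core.reader.RecordReader;
import org.jeasy.batch.core.record.Record;

// Blocks until a record arrives and ends the job when a sentinel payload
// shows up, so no poll timeout is needed.
public class SentinelQueueRecordReader<P> implements RecordReader<P> {

    private final BlockingQueue<Record<P>> queue;
    private final P sentinel; // marker payload the producer sends when it is done

    public SentinelQueueRecordReader(BlockingQueue<Record<P>> queue, P sentinel) {
        this.queue = queue;
        this.sentinel = sentinel;
    }

    @Override
    public Record<P> readRecord() throws Exception {
        Record<P> record = queue.take(); // wait as long as it takes
        // Returning null signals "data source exhausted": the job flushes the
        // current (possibly partial) batch and terminates cleanly.
        return sentinel.equals(record.getPayload()) ? null : record;
    }
}
```

The producer side would then enqueue one final record carrying the sentinel payload when it finishes, e.g. from a JobListener's afterJob callback.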
Because of this, the provided easy-batch tutorial does not work properly either:
https://github.com/j-easy/easy-batch/blob/master/easy-batch-tutorials/src/main/java/org/jeasy/batch/tutorials/advanced/parallel/ForkJoin.java
Here, all records get processed by the worker jobs, but they are never written to standard output by the configured StandardOutputRecordWriter<>().
Can someone help me out with this issue? Is anyone out there dealing with the same problem?