We believe we found the root cause. The code issues getRecords calls to Amazon in batches of a configurable size and expects the returned list of records to be in order. For example, let's say we request batches of 1000 records. Amazon does provide the next 1000 sequence numbers, but the list that is returned is not necessarily sorted. This causes the out-of-order inserts. The solution we have in place is to sort the result that Amazon returns.
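The sorting step described above could look roughly like the sketch below. This is not the actual patch: `Rec` and `sortBySequence` are hypothetical names standing in for the SDK's record type and the spout's fetch path, and it assumes sequence numbers are decimal strings that should be compared numerically rather than as text.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SequenceNumberSort {
    // Hypothetical stand-in for a Kinesis record; only the sequence number matters here.
    static final class Rec {
        final String sequenceNumber;
        final String data;

        Rec(String sequenceNumber, String data) {
            this.sequenceNumber = sequenceNumber;
            this.data = data;
        }
    }

    // Assumption: sequence numbers are long decimal strings, so parse them as
    // BigInteger. A plain string comparison would put "10" before "9".
    static final Comparator<Rec> BY_SEQUENCE =
            Comparator.comparing((Rec r) -> new BigInteger(r.sequenceNumber));

    // Returns a sorted copy and leaves the fetched batch untouched.
    static List<Rec> sortBySequence(List<Rec> batch) {
        List<Rec> sorted = new ArrayList<>(batch);
        sorted.sort(BY_SEQUENCE);
        return sorted;
    }

    public static void main(String[] args) {
        // The two sequence numbers from the error message, in the order they arrived.
        List<Rec> batch = Arrays.asList(
                new Rec("49552833805435751671064410149559486681498920731233222658", "later"),
                new Rec("49552833805435751671064410147127127932433940966597984258", "earlier"));
        for (Rec r : sortBySequence(batch)) {
            System.out.println(r.sequenceNumber + " " + r.data);
        }
    }
}
```

Sorting a copy keeps the change local to one batch; it does nothing about ordering across batches, which still depends on requesting them in sequence.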
"""
Processed 900 with no out of order inserts
Requesting records
java.lang.RuntimeException: OUT OF ORDER INSERT: last seq 49552833805435751671064410149559486681498920731233222658 is AFTER curent seq: 49552833805435751671064410147127127932433940966597984258
at com.amazonaws.services.kinesis.stormspout.KinesisHelperTest.test(KinesisHelperTest.java:48)
"""
Closing this issue for now. We still get this error, but my unit test was not properly advancing the iterator to the next shard iterator. I ran it with the fix logic and did not detect any out-of-order records.
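For anyone writing a similar check, here is a minimal sketch of an ordering test that advances the iterator after every batch. It is not the gist's code: `RecordSource`, `Batch`, and `inMemory` are hypothetical names, and the in-memory source only imitates the pattern of a fetch call that returns records together with the iterator for the next call.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IteratorAdvanceCheck {
    // Hypothetical result of one fetch: the records plus the iterator for the next call.
    static final class Batch {
        final List<String> sequenceNumbers;
        final String nextIterator; // null when this fake source is exhausted

        Batch(List<String> sequenceNumbers, String nextIterator) {
            this.sequenceNumbers = sequenceNumbers;
            this.nextIterator = nextIterator;
        }
    }

    // Hypothetical abstraction over the real client call.
    interface RecordSource {
        Batch fetch(String iterator);
    }

    // Walks all batches and throws if a sequence number is lower than the previous one.
    static int checkOrder(RecordSource source, String firstIterator) {
        BigInteger last = null;
        int processed = 0;
        String iterator = firstIterator;
        while (iterator != null) {
            Batch batch = source.fetch(iterator);
            for (String seq : batch.sequenceNumbers) {
                BigInteger current = new BigInteger(seq);
                if (last != null && last.compareTo(current) > 0) {
                    throw new IllegalStateException(
                            "out of order: last " + last + " is after current " + current);
                }
                last = current;
                processed++;
            }
            // Always continue from the iterator returned with this batch.
            // Reusing the previous iterator would fetch the same records again.
            iterator = batch.nextIterator;
        }
        return processed;
    }

    // In-memory fake: the iterator is simply an index into the list.
    static RecordSource inMemory(List<String> all, int batchSize) {
        return iterator -> {
            int start = Integer.parseInt(iterator);
            int end = Math.min(start + batchSize, all.size());
            String next = end < all.size() ? String.valueOf(end) : null;
            return new Batch(new ArrayList<>(all.subList(start, end)), next);
        };
    }

    public static void main(String[] args) {
        RecordSource source = inMemory(Arrays.asList("1", "2", "3", "10", "11"), 2);
        System.out.println("Processed " + checkOrder(source, "0") + " with no out of order inserts");
    }
}
```

A test that keeps calling with a stale iterator sees the same batch repeatedly, so the first record of the repeat looks like it comes before the last record already processed, which can be mistaken for unsorted results.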
Test case to prove the behavior: https://gist.github.com/geota/ed47ecdead08ab0cab66
Will send a PR soon.