Closed gjesse closed 1 year ago
Also hitting this problem, and a downgrade for us is not trivial at this point :(
Is this project dead? Can't help but feel the goal is to bounce everyone to https://aws.amazon.com/about-aws/whats-new/2020/11/now-you-can-use-amazon-kinesis-data-streams-to-capture-item-level-changes-in-your-amazon-dynamodb-table/
It's been feeling pretty dead for a long time. I wish AWS would either deprecate it officially or support it. We are looking at transitioning to kinesis.
I had the same error with KCL 1.14.3 and Kinesis Adapter 1.5.3. DynamoDB Streams shards are reportedly recreated automatically every few hours (about 4 hours), and this error starts to occur at that point.
After reading the source code, the cause appears to be that the Kinesis Adapter does not satisfy the #isValidResult() check that was added to KinesisDataFetcher in KCL 1.14.0.
KCL's KinesisDataFetcher#isValidResult() has the following comment.
// GetRecords result should contain childShard information. There are two valid combinations for the nextShardIterator and childShards
// If the GetRecords call does not reach the shard end, getRecords result should contain a non-null nextShardIterator and an empty list of childShards.
// If the GetRecords call reaches the shard end, getRecords result should contain a null nextShardIterator and a non-empty list of childShards.
// All other combinations are invalid and indicating an issue with GetRecords result from Kinesis service.
However, the Kinesis Adapter's AmazonDynamoDBStreamsAdapterClient#getRecords() implementation does not seem to fetch the child shards in the first place. Therefore, it cannot meet the condition that KCL assumes: "If nextShardIterator is null, then childShards must be a non-empty list."
For now, this error can be avoided by using KCL 1.13.x, which predates the check.
However, that leaves DynamoDB Streams consumer applications stuck on an old version of KCL.
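To make the check concrete, here is a minimal sketch of the rule described in the KCL comment. It is a simplified model, not the KCL source: plain Java types stand in for GetRecordsResult, and the class and method names are made up for illustration. It shows why a shard-end response without child shard information is rejected.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Simplified model of the rule in the KCL comment; not the actual KCL code. */
final class ShardEndCheckSketch {

    /**
     * Valid only when exactly one of the two is present:
     * a non-null iterator (not at shard end) or a non-empty child shard list (at shard end).
     */
    static boolean isValidResult(String nextShardIterator, List<String> childShards) {
        boolean hasIterator = nextShardIterator != null;
        boolean hasChildren = childShards != null && !childShards.isEmpty();
        return hasIterator != hasChildren;
    }

    public static void main(String[] args) {
        // Mid-shard read: iterator present, no child shards.
        System.out.println(isValidResult("iter-1", Collections.<String>emptyList())); // true
        // Shard end with child shards reported.
        System.out.println(isValidResult(null, Arrays.asList("shard-child"))); // true
        // Shard end with no child shard information: null iterator, no children.
        System.out.println(isValidResult(null, null)); // false
    }
}
```

The third case is the combination the adapter produces at shard end, which matches the timing of the error.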
I have come up with some ideas for a more fundamental fix.
One idea is to fix the adapter. Currently, AmazonDynamoDBStreamsAdapterClient#getRecords() in Kinesis Adapter doesn't fetch the child shard information, but it could be changed to fetch it, set it on GetRecordsResult, and return it.
I tried this. To get the child shards through the DynamoDB Streams API, you call describeStream to get the list of shards and then find the children of the shard in question. However, since only a ShardIterator is passed as an argument to getRecords(), the ID of the parent shard is not known, and it is not possible to determine which shards in the list are its children.
Therefore, I think this idea is not feasible.
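As a rough illustration of the lookup described above: once the parent shard ID is known, finding its children in a shard list is simple. The sketch below uses a hypothetical ShardInfo class in place of the SDK's shard objects, and findChildShardIds is a made-up helper name. The blocker is that getRecords() only receives an iterator, so there is nothing to pass as parentShardId.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChildShardLookupSketch {

    /** Hypothetical stand-in for a shard entry: its own ID and its parent's ID (null for a root shard). */
    static final class ShardInfo {
        final String shardId;
        final String parentShardId;

        ShardInfo(String shardId, String parentShardId) {
            this.shardId = shardId;
            this.parentShardId = parentShardId;
        }
    }

    /** Returns the IDs of all shards in the list whose parent is the given shard. */
    static List<String> findChildShardIds(List<ShardInfo> allShards, String parentShardId) {
        List<String> children = new ArrayList<>();
        for (ShardInfo shard : allShards) {
            if (parentShardId.equals(shard.parentShardId)) {
                children.add(shard.shardId);
            }
        }
        return children;
    }

    public static void main(String[] args) {
        List<ShardInfo> shards = Arrays.asList(
                new ShardInfo("shard-A", null),
                new ShardInfo("shard-B", "shard-A"),
                new ShardInfo("shard-C", "shard-B"));
        System.out.println(findChildShardIds(shards, "shard-A")); // [shard-B]
    }
}
```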
Another idea: when KCL is used in combination with Kinesis Adapter, the result should not be checked by KinesisDataFetcher#isValidResult().
In this case, we can't make KCL depend on Kinesis Adapter, so how would KCL determine that it is being used in combination with Kinesis Adapter?
I think this can be solved by passing a flag to the constructor of KinesisDataFetcher that specifies whether or not to perform the check. (This flag could be specified in the KCL configuration.)
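A minimal sketch of what such a flag could look like. This is purely hypothetical: the class name, the validateChildShards parameter, and the accept method are all invented here and are not existing KCL options or APIs. Plain Java types stand in for GetRecordsResult.

```java
import java.util.List;

/** Hypothetical fetcher that can skip the child shard check; not part of KCL. */
final class ConfigurableFetcherSketch {

    // Assumed flag: true keeps the strict check, false skips it (e.g. for DynamoDB Streams).
    private final boolean validateChildShards;

    ConfigurableFetcherSketch(boolean validateChildShards) {
        this.validateChildShards = validateChildShards;
    }

    /** Accepts every result when validation is off; otherwise applies the iterator/childShards rule. */
    boolean accept(String nextShardIterator, List<String> childShards) {
        if (!validateChildShards) {
            return true;
        }
        boolean hasIterator = nextShardIterator != null;
        boolean hasChildren = childShards != null && !childShards.isEmpty();
        return hasIterator != hasChildren;
    }
}
```

With the flag off, a shard-end result with a null iterator and no child shards is accepted; with it on, the same result is rejected as before.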
Do you have any other ideas?
Almost happy birthday for this issue, with zero comments from the maintainers..
Any updates?
https://github.com/awslabs/dynamodb-streams-kinesis-adapter/pull/44
https://github.com/awslabs/dynamodb-streams-kinesis-adapter/issues/42
https://github.com/awslabs/dynamodb-streams-kinesis-adapter/issues/50
https://github.com/awslabs/amazon-kinesis-client/issues/451
@aggarwal @hyandell @amcp
@gguptp was this fixed or why is this issue closed now?
We have released dynamodb-streams-kinesis-adapter version 1.6.0, which is compatible with KCL 1.14.9.
Thanks for the update!
@gguptp we've migrated to dynamodb-streams-kinesis-adapter version 1.6.0 with KCL 1.14.9. However, we're still seeing ERROR level logs about "GetRecordsResult is not valid" from com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask, although the stream processing does seem to be working. Can we suppress this error log in good conscience?
Please make sure StreamsWorkerFactory is used to initialize the KCL worker.
Ah yes, thanks for support @gguptp!! We were already on 1.5.3 but we were not creating our workers with StreamsWorkerFactory yet and I didn't read the older release notes. Now it's running just fine on 1.6.0 with 1.14.9! We're seeing a slight increase in CPU usage but also lower latency. All good! 🙏🏻
See https://github.com/awslabs/amazon-kinesis-client/issues/746 for more background:
Hello - yesterday I upgraded to KCL 1.14.0 for our application that uses DynamoDB Streams for processing. Since then I've noticed these very consistent errors. We've seen tens of thousands of these in just a few hours, repeated for the same shard IDs.
As best I can tell, a GetRecordsResult with a null NextShardIterator and no child shards is a valid response - in fact there is no field specified for child shards at all here: docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_streams_GetRecords.html
I am using KCL 1.14.0 and creating a worker with dynamodb-streams-kinesis-adapter 1.5.2. The worker is set up using this method: https://github.com/awslabs/dynamodb-streams-kinesis-adapter/blob/master/src/main/java/com/amazonaws/services/dynamodbv2/streamsadapter/StreamsWorkerFactory.java#L44
I am not setting any special configuration other than the following, which I believe shouldn't be relevant.