Closed dbadami closed 3 years ago
Thank you for reporting this issue. This is caused by a recent change to reduce starting-position drift when consuming from multiple Kinesis shards, which also impacts DDB consumers. We will patch this issue in 1.0.5 and include the fix in 2.0.0. Since you are building from source, if you want to unblock yourself in the meantime, you can remove this `if` condition:
Before:
```java
if (sequenceNumber.equals(SENTINEL_LATEST_SEQUENCE_NUM.get())) {
    // LATEST starting positions are translated to AT_TIMESTAMP starting positions. This is to prevent data loss
    // in the situation where the first read times out and is re-attempted. Consider the following scenario:
    //   1. Consume from LATEST
    //   2. No records are consumed and Record Publisher throws retryable error
    //   3. Restart consumption from LATEST
    // Any records sent between steps 1 and 3 are lost. Using the timestamp of step 1 allows the consumer to
    // restart from the shard position of step 1, and hence no records are lost.
    return StartingPosition.fromTimestamp(new Date());
} else if (SENTINEL_AT_TIMESTAMP_SEQUENCE_NUM.get().equals(sequenceNumber)) {
    Date timestamp = KinesisConfigUtil.parseStreamTimestampStartingPosition(configProps);
    return StartingPosition.fromTimestamp(timestamp);
} else {
    return StartingPosition.restartFromSequenceNumber(sequenceNumber);
}
```
After:
```java
if (SENTINEL_AT_TIMESTAMP_SEQUENCE_NUM.get().equals(sequenceNumber)) {
    Date timestamp = KinesisConfigUtil.parseStreamTimestampStartingPosition(configProps);
    return StartingPosition.fromTimestamp(timestamp);
} else {
    return StartingPosition.restartFromSequenceNumber(sequenceNumber);
}
```
Thanks for the quick response and remediation steps. I will try them out.
Is there a timeline for the launch of 2.0.0?
I cannot commit to any dates. We are currently testing the release internally and hope to finalise it within 2 weeks. I hope this works for you. Thanks.
This is fixed in v1.1.0 and v2.0.0
Hi,
I'm deploying a Flink application (1.11.1) using a JAR of the Amazon Kinesis Flink connector (pulled and built on December 7th, 2020). The application fails to start because of a validation exception from Ddb: the Ddb stream doesn't support AT_TIMESTAMP in its GetShardIterator API, and from the stack trace the connector is calling the API with this shard iterator type.
I'm using the FlinkDynamoDBStreamsConsumer in my application, and it is being deployed to KDA using CDK. The configuration being passed to the consumer is the ARN of the Ddb stream and `"flink.stream.initpos": "LATEST"`. I also have a FlinkKinesisConsumer, but from what I can tell it's not the cause of this issue (or at least I don't expect it to be, as I don't expect it to call Ddb's API for a shard iterator).
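For reference, a minimal setup along these lines might look as follows. This is a sketch: the region and stream ARN are placeholders, and the consumer construction (which requires the connector dependency) is shown only as a comment assuming the `FlinkDynamoDBStreamsConsumer` API:

```java
import java.util.Properties;

class DdbConsumerConfig {
    static Properties buildConsumerConfig() {
        Properties props = new Properties();
        // Region of the Ddb stream (placeholder value).
        props.setProperty("aws.region", "us-east-1");
        // Start reading from the tip of the stream, as described above.
        props.setProperty("flink.stream.initpos", "LATEST");
        return props;
    }

    // The consumer itself would then be constructed roughly like this
    // (needs the Flink Kinesis connector on the classpath, so commented out;
    // the ARN is a placeholder):
    //
    //   FlinkDynamoDBStreamsConsumer<String> consumer =
    //       new FlinkDynamoDBStreamsConsumer<>(
    //           "arn:aws:dynamodb:us-east-1:111122223333:table/MyTable/stream/...",
    //           new SimpleStringSchema(),
    //           buildConsumerConfig());
}
```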
I need to build the JAR from mainline because the 1.0.4 release doesn't have this change, and its absence was causing my deployments to fail earlier.
I've attached the exception that I'm seeing. Please let me know if there is any more information that is required from me.