ghost opened this issue 8 years ago
Shards are being split by the kinesis autoscaler. The DynamoDB lease table shows the finished parent shards checkpointed at SHARD_END.

KCL distributes shards randomly, and new child shards are not read until all of their parents are finished. So, in a very unlucky situation, one consumer instance may get all the new children, which all sit at TRIM_HORIZON on the DynamoDB side, while another consumer instance gets all the old (parent) shards, and the load on that machine becomes too high.
We are also using https://github.com/awslabs/amazon-kinesis-scaling-utils, and this behaviour of KCL is troublesome for us.
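If it helps anyone debugging this, here is a minimal sketch (not part of KCL itself) that scans the lease table to see which leases are parked at TRIM_HORIZON and which ended at SHARD_END. The table name "myapp-leases" is a placeholder, and the attribute names (leaseKey, checkpoint, leaseOwner) assume the default KCL lease schema:

```java
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;

import java.util.Map;

public class LeaseTableInspector {
    public static void main(String[] args) {
        AmazonDynamoDB ddb = AmazonDynamoDBClientBuilder.defaultClient();

        // The KCL lease table is named after the application;
        // "myapp-leases" is a placeholder here.
        ScanResult result = ddb.scan(new ScanRequest().withTableName("myapp-leases"));

        for (Map<String, AttributeValue> item : result.getItems()) {
            String shard = item.get("leaseKey").getS();
            String checkpoint = item.get("checkpoint").getS();
            String owner = item.containsKey("leaseOwner")
                    ? item.get("leaseOwner").getS() : "(unassigned)";
            // Fully consumed parents show SHARD_END; children that have
            // not started being read yet show TRIM_HORIZON.
            System.out.printf("%s checkpoint=%s owner=%s%n", shard, checkpoint, owner);
        }
    }
}
```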
I was having this same issue, and eventually came across a property named CleanupLeasesUponShardCompletion. Setting this to true will remove the closed shards (SHARD_END) from the lease table.
With only open shards left in the lease table, only open shards are distributed among the workers, rather than open and closed shards being distributed at random, which could leave a worker holding leases only to closed shards.
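For anyone else landing here, a minimal sketch of that setting, assuming KCL 1.x's KinesisClientLibConfiguration and its withCleanupLeasesUponShardCompletion setter (the application, stream, and worker names are placeholders):

```java
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration;

public class ConsumerConfig {
    public static KinesisClientLibConfiguration build() {
        return new KinesisClientLibConfiguration(
                "my-app",    // application name, also used for the lease table
                "my-stream", // Kinesis stream to consume
                new DefaultAWSCredentialsProviderChain(),
                "worker-1")  // unique id for this worker
            // Delete a lease once its shard is fully consumed (SHARD_END),
            // leaving only open shards in the table to be distributed.
            .withCleanupLeasesUponShardCompletion(true);
    }
}
```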
Thanks for reporting this. We are investigating the issue.
When running multiple workers (4) and after splitting shards (for a total of 8), I have one worker receiving no data; it simply sleeps. The shard IDs allocated to it are all below the currently active ones, i.e. they are closed parent shards. The other workers process as normal, two of them holding 3 lease allocations and one holding 2.
Is there a way to rebalance the workers so that each gets the greatest possible number of active shards? With 8 shards and 4 workers, I would expect each worker to be allocated 2 active shards.
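One thing that may help while waiting on a proper fix, sketched here under the assumption that KCL 1.x's withMaxLeasesForWorker setter behaves as documented: cap how many leases a single worker can hold, so no worker can hoard shards while another sits idle.

```java
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration;

public class BalancedConsumerConfig {
    public static KinesisClientLibConfiguration build(String workerId) {
        return new KinesisClientLibConfiguration(
                "my-app", "my-stream",
                new DefaultAWSCredentialsProviderChain(), workerId)
            // With 8 open shards across 4 workers, capping each worker at
            // 2 leases forces an even spread instead of random clustering.
            .withMaxLeasesForWorker(2)
            // Keep closed shards out of the lease table so the cap is
            // spent on open shards only.
            .withCleanupLeasesUponShardCompletion(true);
    }
}
```

Note the cap counts all leases a worker holds, so without the cleanup flag a worker could still spend its allowance on closed parent shards.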