channl / dynamodb-lambda-autoscale

Autoscale DynamoDB provisioned capacity using Lambda
MIT License

AWS Officially releases native DynamoDB autoscaling #60

Open efenderbosch opened 7 years ago

efenderbosch commented 7 years ago

And just a couple days after I forked your project and deployed it...

https://aws.amazon.com/blogs/aws/new-auto-scaling-for-amazon-dynamodb/

mseank commented 7 years ago

It seems like Amazon's service doesn't really take throttling into account, however. We've experienced some issues with it.

tmitchel2 commented 7 years ago

And just after I spent months reworking it… 😣

goncaloneves commented 7 years ago

@tmitchel2 I came here to thank you for your work on this open-source project. 👏 I will keep using this until I find AWS's autoscaling mature enough and heavily tested. Perhaps auto-backup will come next from AWS?

(edit) And please continue supporting this, because I am certain that AWS will take a while until it can implement all the features of your autoscaling.

It seems to me that AWS's implementation is very simple and carries a high risk of messing up table partitioning. It would be great to have your feedback on this.

efenderbosch commented 7 years ago

@mseank I think that if you have autoscaling enabled, there's a new burst capability: unused provisioned capacity from the past 5 minutes can be consumed to avoid throttling. Then, hopefully, the autoscaling kicks in and boosts the provisioned capacity enough.

tmitchel2 commented 7 years ago

@efenderbosch no, I don't think you're right there. The burst capability is something that has been there all along; it means your consumed capacity can burst above the provisioned capacity for a short period, based on a usage algorithm.
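As a rough back-of-envelope sketch of that burst behavior (AWS documents roughly 300 seconds of unused capacity being retained; the function name and figures below are illustrative, not from either implementation):

```python
# Sketch of DynamoDB burst capacity accumulation, assuming the documented
# ~300-second retention window for unused read/write capacity.
BURST_WINDOW_SECONDS = 300

def max_burst_units(provisioned_per_sec, avg_consumed_per_sec):
    """Upper bound on burst units banked over the retention window."""
    unused_per_sec = max(provisioned_per_sec - avg_consumed_per_sec, 0)
    return unused_per_sec * BURST_WINDOW_SECONDS

# e.g. a table provisioned at 100 WCU that averaged 40 WCU of use
# could have banked up to (100 - 40) * 300 = 18,000 write units.
print(max_burst_units(100, 40))  # 18000
```

This is why a short spike can ride on burst alone, but sustained over-consumption cannot: a table that has been running at or above its provisioned rate banks nothing.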

schodemeiss commented 7 years ago

Like @goncaloneves, I wanted to thank you for this project. We're using it in a heavy production workload and it's always been reliable. We won't be moving to AWS's auto-scaling until it implements at least some more of this project's features!

tmitchel2 commented 7 years ago

@schodemeiss appreciate it.

mseank commented 7 years ago

My team tried out the Amazon version, but we've decided to come back to this. It's more stable and easier to use. Thanks, @tmitchel2

tmitchel2 commented 7 years ago

@mseank That's interesting, would you be able to comment on specific stability issues? I did have a newer version in the works with better integrations and a potential paid-for version, but I'm on the fence about completing it.

mseank commented 7 years ago

@tmitchel2 Part of it is perhaps that the documentation for Amazon's is not very in-depth; yours is much easier to follow. It goes in line with what I said earlier in this thread about how difficult it is to control throttling with Amazon's in-house version. It was scaling, but when we would throttle, it wouldn't really register or do much of anything. It would still be hitting the target percentage we had set, without realizing that we weren't where we wanted to be. It also seems to struggle to scale down for some reason; again, this may be due to our setting the target utilization incorrectly.

They need more fields and better descriptions as to what they mean. The fields for your version are much more accessible and easily editable.
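The difference being described might be sketched as two scaling signals. Both functions below are hypothetical illustrations (names, thresholds, and formulas are assumptions, not taken from either implementation): one scales purely on the consumed/provisioned ratio, the other also counts throttled requests as unmet demand.

```python
# Hypothetical contrast between the two scaling signals discussed here.
# All names and thresholds are illustrative.

def target_tracking_decision(consumed, provisioned, target_utilization=0.7):
    """AWS-style sketch: scale only on the consumed/provisioned ratio."""
    if consumed / provisioned > target_utilization:
        return round(consumed / target_utilization)
    return provisioned  # utilization looks fine, so no change

def throttle_aware_decision(consumed, provisioned, throttled_events,
                            target_utilization=0.7):
    """Throttle-aware sketch: treat throttled requests as demand too."""
    effective_demand = consumed + throttled_events
    if throttled_events > 0 or consumed / provisioned > target_utilization:
        return round(effective_demand / target_utilization)
    return provisioned

# A table at 60% utilization that is still throttling: the ratio-only
# policy sees nothing wrong, while the throttle-aware one scales up.
print(target_tracking_decision(60, 100))     # 100 (no change)
print(throttle_aware_decision(60, 100, 25))  # scales above 100
```

This matches the symptom above: consumed capacity stays at the set percentage because throttled requests never count toward it, so a utilization-only policy "doesn't realize we weren't where we wanted to be."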

tmitchel2 commented 7 years ago

Going to use this issue to record commentary regarding Amazon's implementation, good or bad.

For starters, it seems they are suggesting that a full 5 minutes of sustained load is needed to get the scaling to trigger: https://forums.aws.amazon.com/thread.jspa?threadID=259029&tstart=0
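A quick illustration of why a trailing-average trigger reacts slowly to spikes (the numbers and the one-minute granularity below are assumptions for the sake of the sketch):

```python
# Sketch: a 5-minute trailing average dilutes a sudden spike,
# delaying any alarm that triggers on it. Illustrative numbers only.

def five_minute_average(per_minute_consumed):
    """Average over the trailing five 1-minute datapoints."""
    window = per_minute_consumed[-5:]
    return sum(window) / len(window)

# Table provisioned at 100 units; traffic jumps to 300 in the last minute.
minutes = [50, 50, 50, 50, 300]
avg = five_minute_average(minutes)
print(avg)  # 100.0

# The average has only just reached the provisioned level, so a
# utilization alarm fires late, while the table has already been
# throttling for a full minute at 3x provisioned capacity.
```

A per-minute (or per-invocation) check against the latest datapoint, as a Lambda-based scaler can do, avoids this dilution.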

efenderbosch commented 7 years ago

I think we might be coming back to this solution. The official AWS solution doesn't seem to react quickly enough to increased capacity requirements; we get spikes of throttles. It also doesn't seem to scale down as often as it should. I've seen periods of more than 12 hours where tables were not scaled down.

Additionally, since the AWS solution uses CloudWatch alarms, you will see an increase in that portion of your bill. Since this solution queries the table's capacity metadata and makes changes itself, there's no cost other than the Lambda function time.

I almost forgot: the official AWS solution is fundamentally broken when used with CloudFormation. We ended up dropping autoscaling from CloudFormation and now have pre-deploy and post-deploy AWS CLI scripts that delete and then re-enable autoscaling.

chriskinsman commented 5 years ago

We are also still using this project for many of the reasons mentioned above.