Closed · joshfester closed this 5 months ago
The normal expectation is to see better performance than running on K8s.
I think I touched on this somewhere deep in the guides... but Lamby recommends running at 1 vCPU, which means 1792 MB of memory. You don't need all the memory, you need the full vCPU. Most other container orchestration systems lean on under-provisioned minimum memory to overcommit resources and make scaling work. That ends up with slower Rails apps and request queuing. Not saying a pro couldn't tune these things to get near Lambda, but I've seen even expert teams not do it well.
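To make that concrete: on Lambda, memory is also how you buy CPU, and 1792 MB is roughly where a function gets one full vCPU. In a SAM template it is just the MemorySize property (the resource name and other values below are illustrative placeholders):

```yaml
# template.yaml (excerpt) -- resource name and values are placeholders
Resources:
  RailsFunction:
    Type: AWS::Serverless::Function
    Properties:
      MemorySize: 1792  # ~1 full vCPU; Lambda CPU scales with memory
      Timeout: 30
```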
I can create a guide page on that if you are interested. In fact, I think I will; I need to document all this stuff for a talk too. How does that sound so far? Wanna ask more questions on the topic? Happy to discuss.
A page in the guide for this would be super helpful! I was looking into this as a potential solution for a startup. My first question was "is there a performance tradeoff?" because it seems like there might be some extra overhead/latency. I'm very experienced with scaling Rails apps, but very unfamiliar with Lambda.
I'm constantly building new companies and I always start with three options: self-host with Dokku, use a middleman like Cloud66/Hatchbox, or use a PaaS like Heroku/Render/Fly. Lamby is very appealing because it can scale to zero and has virtually (I think?) no limits.
Here are some questions off the top of my head:
I've got more, but I think that's just me needing to read through the Lambda docs to understand it better. A page in the Lamby docs showing performance comparisons and any scaling limitations would give me the complete picture I need to make a decision.
At some point in the future my curiosity will inevitably get the best of me and I'll try this out. When that happens I'd be happy to document my results if you want/need them
because it seems like there might be some extra overhead/latency
No, none. I've seen more latency in K8s figuring out auto scaling on memory or CPU, which translates into request queue latency that adds up. The only thing would be cold starts, but that is mostly moot too. More so if your app is used a lot; they disappear well past the 99th-percentile metrics. BTW, speaking of cold starts, have you seen the news about proactive inits? That makes it even more of a non-issue. But honestly, it never was. This is icing on the cake.
https://lamby.cloud/blog/2023/07/16/goodbye-cold-starts-hello-proactive-initilizations
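For what it's worth, the usual way to benefit from that init phase (proactive or not) is to do expensive setup at load time, outside the handler, so it happens before the first request. A minimal sketch, with hypothetical stand-ins for the real boot work:

```ruby
# Work done at load time runs during Lambda's init phase (including
# proactive inits), not inside the first invocation. These constants
# are hypothetical stand-ins for booting Rails, opening connections, etc.
BOOTED_AT = Time.now
EXPENSIVE_CONFIG = { db_pool: 5, eager_loaded: true }

def handler(event:, context:)
  # Each invocation only pays for the per-request work.
  { "statusCode" => 200,
    "body" => "booted #{(Time.now - BOOTED_AT).round(3)}s ago" }
end

response = handler(event: {}, context: nil)
```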
Lamby is very appealing because it can scale to zero and has virtually (I think?) no limits.
Yeah, that was a great list. I'm constantly trying to make Lamby more appealing to startups and SaaS bootstrappers like yourself.
How does Lamby compare to a typical server running Puma?
Equal or better. The idea with Puma is that you have to pick some number of workers (or threads) to maximize utilization of memory and CPU. It's all a game of getting to some utilization of a single virtual "machine". With Lambda there is no web server... you just get Rack events. So every function is its own instance, and the reverse proxy is API Gateway or Function URLs. I think I covered this well here:
https://speakerdeck.com/metaskills/the-case-for-rails-on-lambda-v1
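A rough sketch of the "no web server, just Rack events" idea, not Lamby's actual implementation: translate an API Gateway-style event hash into a Rack env and call the app directly (the event keys and helper name here are simplified assumptions):

```ruby
require "stringio"

# Tiny Rack app standing in for Rails.application.
rack_app = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}"]]
end

# Hypothetical translation of a simplified API Gateway-style event into a
# Rack env -- the app is invoked directly, no Puma and no listening socket.
def call_rack(app, event)
  env = {
    "REQUEST_METHOD" => event["httpMethod"],
    "PATH_INFO"      => event["path"],
    "QUERY_STRING"   => event["queryString"] || "",
    "rack.input"     => StringIO.new(event["body"] || ""),
    "rack.errors"    => $stderr,
  }
  status, headers, body = app.call(env)
  joined = +""
  body.each { |chunk| joined << chunk }
  { "statusCode" => status, "headers" => headers, "body" => joined }
end

response = call_rack(rack_app, { "httpMethod" => "GET", "path" => "/health" })
```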
What happens when you need more than that?
You do not. For Rails you give it 1 vCPU and scale horizontally like every other person does. In this case, you do not have to manage the scaling; the Lambda control plane does that for you (on Firecracker microVMs). I've never seen Rails need more, even big legacy apps.
Some benchmarks to elaborate on that would be amazing
Cool, you might see those in the deck I shared above.
When that happens I'd be happy to document my results if you want/need them
I appreciate that. I think I got most of the data... just need to wrap it up better in a single place. Of course, if you have any new info, outliers, etc... please share!
scale horizontally like every other person does
Let me qualify that... everyone scales horizontally... but you only need more than 1 vCPU on other compute platforms because you are stuffing more Puma workers, etc., into a virtual machine. At the end of the day it is six of one, half a dozen of the other. With Lambda, you are using distinct functions to truly scale horizontally. You really can't do it any other way... nor is there a need to. Hope that makes sense.
Are there any benchmark comparisons to see the response times compared to a traditional server running Rails and Puma?