Closed davidor closed 3 years ago
I tried the latest release of Puma, v4.3.5. It turns out that the performance issue is still there, and our patch no longer works. I tried a few versions of Puma until I found the one that broke our patch: v3.9.0, specifically PR #1278 in Puma.
In the latest version, applying our patch and reverting that PR seems to be enough to fix the performance issue. I still need to study that PR and understand why it has a negative impact on Apisonator. I'll post my findings here.
/me waiting for your findings, detective!
From my understanding, that PR changes how the Puma workers handle connections. In summary, instead of accepting requests and queuing them until a worker thread picks them up, Puma now only accepts a connection when a thread is available. I guess that works for the majority of use cases, but not for ours, because we run only 1 thread per worker.
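To make the difference concrete, here is a toy Ruby model (my own sketch, nothing from Puma's code) of why a burst of connections overwhelms a single-threaded worker once there is no in-process buffering: the old behavior is modeled as a roomy queue, the new one as a capacity-1 queue that only takes a request when the one thread is free.

```ruby
# Toy model, NOT Puma internals: it only illustrates the queueing point.
# "Old" behavior: accepted connections buffer in a backlog until the
# single worker thread gets to them.
# "New" behavior (post-#1278): a connection is only taken when a thread
# is free; modeled here as a capacity-1 queue with a non-blocking push.

def burst(backlog, n_requests)
  accepted = 0
  n_requests.times do
    begin
      backlog.push(:request, true) # non-blocking; raises ThreadError if full
      accepted += 1
    rescue ThreadError
      # no room: the client stalls outside the process and may time out
    end
  end
  accepted
end

old_style = SizedQueue.new(100) # generous in-process buffer
new_style = SizedQueue.new(1)   # room only for the single idle thread

puts burst(old_style, 10) # => 10 (the whole burst is absorbed)
puts burst(new_style, 10) # => 1  (the rest are left waiting)
```

The point of the sketch: with multiple threads per worker the capacity-1 handoff is rarely full, which is presumably why the change helps typical deployments while hurting ours.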
In previous versions of Puma there was a "queue_requests" param to control this. It is still present, but it no longer has any effect.
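For context, this is roughly the configuration shape involved (a sketch; the worker count and file path are illustrative, not our actual values):

```ruby
# config/puma.rb -- illustrative sketch of a clustered, single-threaded setup

workers 4       # clustered mode: several forked worker processes
threads 1, 1    # exactly one thread per worker, as in our deployment

# Before v3.9.0 this made Puma accept connections and buffer them until
# a thread was free; since PR #1278 the option is parsed but ignored.
queue_requests true
```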
In practice, even with a moderate number of connections, many requests time out, which does not happen in previous releases with our patch, nor when using Falcon.
What do you think, @unleashed? Should we update our fork to the latest version of Puma and revert that commit and/or add back the "queue_requests" param?
So there is no buffering of accepted requests. Buffering is useful because traffic peaks will happen, and timeouts can be avoided by trading in a bit of latency. This change breaks clustered mode with single-threaded processes, which could just be a bug, unless the maintainers have decided Puma no longer supports that configuration.
Yes, I think in the short term we will want to undo that and use our own handling. I don't know how much the reactor code has changed, but both of us were familiar with it and we can probably come up with a quick fix. But I'd say that if they haven't dropped support for clustered mode with single-thread processes this is a bug worth reporting.
Are you willing to give it a go?
I don't think the maintainers consider this a bug. The description of the PR mentions that several of the authors gathered at a conference and came up with it. I guess it improves things for the most typical deployment scenario, which is several workers with multiple threads each.
In the PR, someone already pointed out that the "queue_requests" param I mentioned in a previous comment no longer had any effect.
I tried the latest release, 5.0, and can confirm that the behaviour is the same.
OK, so the issue still exists. The next step is to port our fixes to 5.0 and try that out.
Updated to 4.3.7 in https://github.com/3scale/apisonator/pull/261
We are using our own fork (https://github.com/3scale/puma) that solves a performance issue. That problem has been solved upstream, so we should try to update.
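For anyone following along, pinning the fork in a Gemfile looks roughly like this (a sketch; the branch name is a placeholder for wherever the patch actually lives, not a real ref in that repo):

```ruby
# Gemfile -- sketch; "our-patched-branch" is a hypothetical placeholder
source "https://rubygems.org"

gem "puma", git: "https://github.com/3scale/puma", branch: "our-patched-branch"
```

Switching back to upstream would then just mean replacing that line with a plain `gem "puma"` at the fixed version.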