octokit / webhooks.js

GitHub webhook events toolset for Node.js
MIT License

[MAINT]: Investigate performance loss due to data conversions #940

Open Uzlopak opened 10 months ago

Uzlopak commented 10 months ago

Describe the need

I think there is a significant performance loss happening in this module.

Let's say we run the webhook middleware outside of a cloud function. We get the payload as a buffer and transform it into a string. Then we pass it to the hashing function, which essentially transforms it back into a buffer to hash it. Once it's validated, we do a JSON.parse.

What we could do is keep them as buffers until we need a string for JSON.parse.
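
A minimal sketch of that buffer-first flow (a hypothetical helper, not this module's actual API; the names are illustrative):

const { createHmac, timingSafeEqual } = require("node:crypto");

// Hypothetical helper: verify the signature against the raw Buffer and
// only produce a string (inside JSON.parse) once the payload is trusted.
function verifyAndParse(secret, rawBody, signatureHeader) {
    const expected = Buffer.from(
        "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex")
    );
    const actual = Buffer.from(signatureHeader);
    // timingSafeEqual throws on length mismatch, so check lengths first.
    if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
        throw new Error("signature mismatch");
    }
    return JSON.parse(rawBody); // JSON.parse coerces the Buffer to a string here
}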

In GCF we have request.body and request.rawBody, so we don't need to call JSON.parse.

That's what I wanted to achieve in #936.

Yes, having a "getPayload" function is nice for encapsulation, but it is a potential performance bottleneck.

SDK Version

No response

API Version

No response

Relevant log output

No response

Code of Conduct

github-actions[bot] commented 10 months ago

👋 Hi! Thank you for this contribution! Just to let you know, our GitHub SDK team does a round of issue and PR reviews twice a week, every Monday and Friday! We have a process in place for prioritizing and responding to your input. Because you are a part of this community, please feel free to comment on, add to, or pick up any issues/PRs that are labeled with Status: Up for grabs. You & others like you are the reason all of this works! So thank you & happy coding! 🚀

G-Rath commented 10 months ago

Can you showcase a real-world example of this library being a performance bottleneck? i.e. where the number of requests that can be processed in a reasonable period is low enough that offsetting it through vertical scaling carries a sizable cost.

I ask because, while I think a lot of us enjoy finding cheap and easy ways to make a section of code process 5k more requests per second, in reality we're typically talking about going from 50k to 55k ops per second in one very specific part of the code. I don't think that's really comparable to the real world, since you'd need 1. GitHub to send you 50k+ requests per second (which I'm pretty sure they'd self-throttle anyway), and 2. to be vertically scaling, which would double the number of requests you could process at a time (plus, ideally, you should have at least two instances running in a production setup, meaning you're already at twice the benchmark).

Uzlopak commented 10 months ago

To be honest, my view is probably biased. If I implemented this with fastify, I would probably expect about 40k ops/s on my machine. But running the probot node-middleware I see only about 7k ops/s. So a significant part of the performance is being lost somewhere.

A flamegraph shows that, e.g., verifyAndReceive is the hottest part of the whole lifecycle of a request. So one step is to reduce unnecessary conversions.

The other is to investigate how we can reduce the overhead of the verification itself.

E.g. objects are passed to JavaScript functions by reference but primitives by value. So if we pass around strings instead of buffers (which are objects), we potentially force Node to duplicate strings. So internally we should just pass around Buffers.

Also, e.g. the secret... maybe allow the use of crypto.createSecretKey too. I am currently investigating the use of createSecretKey in fastify-cookie and it shows about 10% more performance (before: 40k ops/s, after: 45k ops/s).
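
A sketch of what that could look like (assuming the caller constructs the KeyObject once up front; the names here are illustrative, not this module's API):

const { createSecretKey, createHmac } = require("node:crypto");

// Create the KeyObject once instead of handing the raw string secret
// to createHmac on every request.
const secretKey = createSecretKey(Buffer.from("my-webhook-secret", "utf8"));

function sign(rawBody) {
    return "sha256=" + createHmac("sha256", secretKey).update(rawBody).digest("hex");
}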

G-Rath commented 10 months ago

But running the probot node-middleware I see only about 7k ops/s. So a significant part of the performance is being lost somewhere.

That definitely is worth digging into. Still, keep in mind that we're talking about 7k per second, which, until someone provides a reasonable example of where/how GitHub might send 5k+ requests in a single second, should be "good enough". But we should definitely at least have a deeper (documented) understanding of why we're that slow (and of course hopefully identify some optimizations along the way that can get that number up 😄)

G-Rath commented 10 months ago

fwiw I think the best place to start would be to put together a benchmarking repository - that way we can properly account for all the components involved in our measurement and quickly prototype possible optimizations across all those components.
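
As a rough illustration, even a tiny harness that isolates the crypto cost from the framework cost would be a start (a sketch of one possible starting point, not an existing repo):

const { createHmac } = require("node:crypto");

// Measure raw HMAC throughput in isolation, so framework overhead can be
// separated from crypto overhead in the numbers quoted above.
const secret = "my-webhook-secret";
const payload = Buffer.from(JSON.stringify({ action: "opened" }));
const iterations = 100_000;

const start = process.hrtime.bigint();
for (let i = 0; i < iterations; i++) {
    createHmac("sha256", secret).update(payload).digest("hex");
}
const elapsedNs = Number(process.hrtime.bigint() - start);
console.log(`${Math.round((iterations * 1e9) / elapsedNs)} ops/s`);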

wolfy1339 commented 10 months ago

To be honest, my view is probably biased. If I implemented this with fastify, I would probably expect about 40k ops/s on my machine. But running the probot node-middleware I see only about 7k ops/s. So a significant part of the performance is being lost somewhere.

The goal here is not to require any specific framework, in the vein of express or fastify. It's meant to be as universal as possible.

E.g. objects are passed to JavaScript functions by reference but primitives by value. So if we pass around strings instead of buffers (which are objects), we potentially force Node to duplicate strings. So internally we should just pass around Buffers.

That can probably be addressed fairly easily as a feature.

Also, e.g. the secret... maybe allow the use of crypto.createSecretKey too. I am currently investigating the use of createSecretKey in fastify-cookie and it shows about 10% more performance (before: 40k ops/s, after: 45k ops/s).

We'd have to make sure that we maintain web compatibility by also providing provisions for CryptoKey objects.

wolfy1339 commented 10 months ago

For Web Crypto, the equivalent of Node's crypto.createSecretKey() would look like this:

async function importKey(secret) {
    // Encode the string secret to raw bytes for the Web Crypto API.
    const keyBytes = new TextEncoder().encode(secret);
    // Note: importKey lives on crypto.subtle, not on crypto itself.
    return crypto.subtle.importKey("raw", keyBytes, { name: "HMAC", hash: "SHA-256" }, false, ["verify"]);
}
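
Verification with the imported key would then look roughly like this (a sketch, assuming the caller already has the raw body and the signature as byte arrays):

// Sketch: verify an HMAC signature using the CryptoKey from importKey().
async function verify(key, signatureBytes, rawBody) {
    return crypto.subtle.verify("HMAC", key, signatureBytes, rawBody);
}
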
Uzlopak commented 10 months ago

@wolfy1339

I just used fastify for comparison and did not want to imply that we should use it as a backend. But we need some reasonable reference numbers. E.g. a pure fastify hello world does about 70k ops/s on my machine, yet I get about 40k ops/s with fastify-cookie, where we call createHmac on every request.