ninenines / gun

HTTP/1.1, HTTP/2, Websocket client (and more) for Erlang/OTP.
ISC License

Performance question #182

Closed 9mm closed 5 years ago

9mm commented 5 years ago

I originally posted this on machine_gun because that's the wrapper I'm using, but the underlying library it uses is gun.

I was thinking about reposting it here but for simplicity it might be easier for me to link it.

Basically I'm trying to get gun to scale to 25,000 requests/second, and even when it's only 200 RPS on a single machine, the CPU is going insane.

https://github.com/petrohi/machine_gun/issues/10

Do you have any ideas on this? Is it normal for gun (or request handling in general) to use so much CPU? I'm not sure what to do; it's making it very difficult to scale vertically to handle our load.

essen commented 5 years ago

The scheduler utilization doesn't seem insane in the pictures you posted; it looks pretty low. Note that Erlang schedulers will busy wait when waiting for work, so it's possible that's part of what you're seeing.

9mm commented 5 years ago

OK, here's an interesting turn of events.

I started trying to reproduce everything locally. Petrohi recommended I put the timeout at 10_000 ms to see what happens. It was then that I noticed this:

If I use machine_gun/gun to request the endpoint 10 times, it hangs for ~5-8 seconds once every few refreshes. These are the response times (in ms):

4636, 97, 459, 101, 102, 4690, 4538, 102

If I take the exact URL, put it in Chrome, and refresh, I get these times:

244, 102, 207, 243, 208, 211, 255, 149, 202, 276, 102

So something about the library is 'falsely' causing the response to be horribly delayed.

I didn't notice it with the other 2 endpoints, but I think the problem is still there. It just requires more load (like in production) before the problem presents itself.
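To make the comparison between the two sets of timings concrete, here is a minimal Python sketch that summarizes both samples. The numbers are copied from the lists above; the `summarize` helper and the 1000 ms "slow" cutoff are assumptions made for illustration and are not part of gun or machine_gun.

```python
from statistics import median

# Response times in ms, copied from the two lists in this comment.
gun_ms = [4636, 97, 459, 101, 102, 4690, 4538, 102]
chrome_ms = [244, 102, 207, 243, 208, 211, 255, 149, 202, 276, 102]

SLOW_MS = 1000  # assumed cutoff for a "hung" request; not from the thread


def summarize(samples, slow_ms=SLOW_MS):
    """Return the median, the max, and how many samples exceed slow_ms."""
    return {
        "median": median(samples),
        "max": max(samples),
        "slow": sum(1 for t in samples if t > slow_ms),
    }


gun_summary = summarize(gun_ms)
chrome_summary = summarize(chrome_ms)
print("gun:   ", gun_summary)
print("chrome:", chrome_summary)
```

With this cutoff, 3 of the 8 gun requests count as slow and none of the Chrome requests do, while the fast gun requests (around 100 ms) are in the same range as Chrome's fastest. That pattern looks more like an intermittent stall than a uniformly slow client, though a sample this small can only hint at that.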

essen commented 5 years ago

Can you show the code?

9mm commented 5 years ago

I just moved on. This issue is just crazy.