Seeing that the CFRunLoop part takes up a considerable amount of time, my first question would be whether that is actually needed. If the basic kqueue-compatible parts are sufficient, using the "kqueue" configuration of eventcore looks like it could yield a 2x improvement. Native directory watchers are probably the main argument for the CFRunLoop implementation.
Having said that, the CFRunLoop callback logic is probably doing too many calls to CFFileDescriptorEnableCallBacks (which also looks quite heavy in the flame graph), but I couldn't find any other way to lay out the calls that doesn't lead to event loop starvation at some point. Apple's documentation, as usual, doesn't really help, unfortunately.
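(For context: CFFileDescriptor read callbacks are one-shot and stay disabled after they fire, so they have to be re-armed for every readiness event. A rough D sketch of that pattern; the extern(C) declarations are hand-written for illustration and are not eventcore's actual bindings:)

```d
// Illustrative extern(C) declarations for the CoreFoundation calls mentioned
// above; a real project would use complete CoreFoundation bindings.
extern (C)
{
    struct __CFFileDescriptor;                  // opaque CF type
    alias CFFileDescriptorRef = __CFFileDescriptor*;
    alias CFOptionFlags = size_t;               // unsigned long in CoreFoundation
    void CFFileDescriptorEnableCallBacks(CFFileDescriptorRef f, CFOptionFlags callBackTypes);
}
enum CFOptionFlags kCFFileDescriptorReadCallBack = 1;

// Hypothetical read callback. Because the callback is disabled as soon as it
// fires, it has to call CFFileDescriptorEnableCallBacks again to receive the
// next event; doing that once per readiness event is what shows up as heavy
// in the flame graph, while deferring or batching the re-arm risks missing
// events and starving the event loop.
extern (C) void onReadable(CFFileDescriptorRef fdref, CFOptionFlags callBackTypes, void* info)
{
    // ... hand the readiness event to the event loop / perform the read ...

    // Re-arm so the next readiness notification is delivered.
    CFFileDescriptorEnableCallBacks(fdref, kCFFileDescriptorReadCallBack);
}
```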
Is there a way to use the kqueue configuration? And is the implication here that I might not see this behaviour on Linux? I’ll try that out and see. Thanks for the inside details here 🤜🤛
You can add an explicit sub configuration to dub.json:
"dependencies": {
"vibe-d": "~>0.10.1",
"eventcore": "~>0.9.34"
},
"subConfigurations": {
"eventcore": "kqueue"
},
Or you could pass --override-config=eventcore/kqueue to the dub invocation to try it out temporarily.
And is the implication here that I might not see this behaviour on Linux?
Judging from the numbers so far, I would guess so. At least on Linux everything runs through epoll with no need to cascade different event mechanisms.
You were spot on. Switching to the kqueue configuration for eventcore brought performance back up to the stratosphere. Great configurability here.
Side question but how would I go about contributing to vibe.d? Pick any issue and grab it or is there a list of issues where attention would be most appreciated?
Side question but how would I go about contributing to vibe.d? Pick any issue and grab it or is there a list of issues where attention would be most appreciated?
Since I'm currently still not able to dedicate a lot of time to my open-source projects, picking issues is probably the most effective approach right now. One day I'll make another complete pass over all open issues and planned features and collect things to prioritize, but until then I can only really concentrate on fixing concrete issues and looking into PRs.
Generally, I feel like HTTP performance and io_uring support on Linux would be the two general things that would bring the greatest benefits, but they are also both not particularly easy starting points. Then the still experimental HTTP/2 support (or a future QUIC+HTTP/3 integration) is probably the most notable missing component right now, so that might also be worth looking into.
I'm writing a D integration for NGINX Unit and want to make vibe.d's concurrency system available during requests. Without vibe.d I get the maximum throughput that I can measure (https://github.com/kyleingraham/unit-d-hello-world). With vibe.d I get half that (https://github.com/kyleingraham/unit-vibed-hello-world). I've been trying to figure out what's going on here.
The general architecture is:
From profiling it seems that my program spends most of its time in vibe.d event loop code:
I recently found that I can get maximum throughput by calling setIdleHandler with a delegate that always returns true (see the sketch after this paragraph). That results in constant idle CPU usage, however. Is there a better way to get the throughput I'm looking for? Am I missing something with the way I've architected things?
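Roughly what the workaround looks like, as a minimal sketch assuming a plain runApplication-driven event loop (the NGINX Unit glue is omitted):

```d
import vibe.core.core : runApplication, setIdleHandler;

int main()
{
    // An idle handler that always returns true asks the event loop to call it
    // again right away instead of blocking while idle. That keeps throughput
    // up in this setup, but effectively busy-polls and burns CPU even when
    // there is nothing to do.
    setIdleHandler(delegate() @safe nothrow { return true; });

    return runApplication();
}
```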