Seeing that the CFRunLoop part takes up a considerable amount of time, my first question would be whether that is actually needed. If the basic kqueue-compatible parts are sufficient, using the "kqueue" configuration of eventcore looks like it could yield a 2x improvement. Native directory watchers are probably the main argument for the CFRunLoop implementation.
Having said that, the CFRunLoop callback logic is probably doing too many calls to CFFileDescriptorEnableCallBacks (which also looks quite heavy in the flame graph), but I couldn't find any other way to lay out the calls that doesn't lead to event loop starvation at some point. Apple's documentation, as usual, doesn't really help, unfortunately.
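(For context: CFFileDescriptor read callbacks are one-shot and stay disabled after they fire, so they have to be re-armed for every readiness event. A rough D sketch of that pattern; the extern(C) declarations are hand-written for illustration and are not eventcore's actual bindings:)

```d
// Illustrative extern(C) declarations for the CoreFoundation calls mentioned
// above; a real project would use complete CoreFoundation bindings.
extern (C)
{
    struct __CFFileDescriptor;                  // opaque CF type
    alias CFFileDescriptorRef = __CFFileDescriptor*;
    alias CFOptionFlags = size_t;               // unsigned long in CoreFoundation
    void CFFileDescriptorEnableCallBacks(CFFileDescriptorRef f, CFOptionFlags callBackTypes);
}
enum CFOptionFlags kCFFileDescriptorReadCallBack = 1;

// Hypothetical read callback. Because the callback is disabled as soon as it
// fires, it has to call CFFileDescriptorEnableCallBacks again to receive the
// next event; doing that once per readiness event is what shows up as heavy
// in the flame graph, while deferring or batching the re-arm risks missing
// events and starving the event loop.
extern (C) void onReadable(CFFileDescriptorRef fdref, CFOptionFlags callBackTypes, void* info)
{
    // ... hand the readiness event to the event loop / perform the read ...

    // Re-arm so the next readiness notification is delivered.
    CFFileDescriptorEnableCallBacks(fdref, kCFFileDescriptorReadCallBack);
}
```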
Is there a way to use the kqueue configuration? And is the implication here that I might not see this behaviour on Linux? I’ll try that out and see. Thanks for the inside details here 🤜🤛
You can add an explicit sub configuration to dub.json:
"dependencies": {
"vibe-d": "~>0.10.1",
"eventcore": "~>0.9.34"
},
"subConfigurations": {
"eventcore": "kqueue"
},
Or you could pass --override-config=eventcore/kqueue to the dub invocation to try it out temporarily.
And is the implication here that I might not see this behaviour on Linux?
Judging from the numbers so far, I would guess so. At least on Linux everything runs through epoll with no need to cascade different event mechanisms.
You were spot on. Switching to the kqueue configuration for eventcore brought performance back up to the stratosphere. Great configurability here.
Side question but how would I go about contributing to vibe.d? Pick any issue and grab it or is there a list of issues where attention would be most appreciated?
Side question but how would I go about contributing to vibe.d? Pick any issue and grab it or is there a list of issues where attention would be most appreciated?
Since I'm currently still not able to dedicate a lot of time to my open-source projects, picking issues is probably the most effective approach right now. One day I'll make another complete pass over all open issues and planned features and collect things to prioritize, but until then I can only really concentrate on fixing concrete issues and looking into PRs.
Generally, I feel like HTTP performance and io_uring support on Linux would be the two general things that would bring the greatest benefits, but they are also both not particularly easy starting points. Then the still experimental HTTP/2 support (or a future QUIC+HTTP/3 integration) is probably the most notable missing component right now, so that might also be worth looking into.
I'm writing a D integration for NGINX Unit and want to make vibe.d's concurrency system available during requests. Without vibe.d I get the maximum throughput that I can measure (https://github.com/kyleingraham/unit-d-hello-world). With vibe.d I get half that (https://github.com/kyleingraham/unit-vibed-hello-world). I've been trying to figure out what's going on here.
The general architecture is:
From profiling it seems that my program spends most of its time in vibe.d event loop code:
I recently found that I can get maximum throughput by calling setIdleHandler with a delegate that always returns true (see the sketch after this paragraph). That results in constant idle CPU usage, however. Is there a better way to get the throughput I'm looking for? Am I missing something with the way I've architected things?
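Roughly what the workaround looks like, as a minimal sketch assuming a plain runApplication-driven event loop (the NGINX Unit glue is omitted):

```d
import vibe.core.core : runApplication, setIdleHandler;

int main()
{
    // An idle handler that always returns true asks the event loop to call it
    // again right away instead of blocking while idle. That keeps throughput
    // up in this setup, but effectively busy-polls and burns CPU even when
    // there is nothing to do.
    setIdleHandler(delegate() @safe nothrow { return true; });

    return runApplication();
}
```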