Since the addition of `binance` provider support in #170 i've noticed some serious cpu usage on weaker hw due to the high data rates received on the binance websocket pipes. The L1 quote rates in particular can hit instantaneous rates of 500Hz-1kHz, which is far above not only typical 60FPS display rates but even higher end hardware.
This issue can be considered a partial followup subtask for #180.
Toolz
- probably worth checking out `perf-timer`'s async support to figure out exactly which tasks are causing the most cpu gobble; a rough usage sketch follows below
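A minimal sketch of what that instrumentation might look like, assuming `perf-timer`'s `TrioPerfTimer` and `HistogramObserver` apis as described in its readme; the `process_quotes()` task and its body are hypothetical stand-ins for whatever hot loop we point it at:

```python
from perf_timer import TrioPerfTimer, HistogramObserver

# one module-level timer per hot section; summary stats are
# reported when the timer is finalized (e.g. at interpreter exit)
_quote_timer = TrioPerfTimer(
    'per-quote processing',
    observer=HistogramObserver,  # a distribution beats a lone average here
)

async def process_quotes(quote_stream):
    # `quote_stream` is any async iterable of quote msgs (hypothetical)
    async for quote in quote_stream:
        with _quote_timer:
            # ... the per-quote graphics/update work under suspicion ...
            pass
```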
Premises
- for `Qt` UIs we shouldn't need to update graphics at more than 60 FPS from a data feed; anything more than this is just wasted cycles
- the high rate feeds don't seem to pose any processing latency / cpu issues on their own, so there should be no need to throttle for fsp, clearing or algorithmic purposes
- currently inside chart code we've naively throttled the quote and book rates, but we're still task switching on every arrival even though we ignore the too-fast updates; this likely causes `Qt` task switches at greater than refresh rates which, for data viz purposes, is pointless
- rough measures show that cpu usage starts really climbing with `> 5` symbols loaded in the local chart cache; you'll start to notice it in a sys monitor as well as slight latency when scrolling/panning the view box with the mouse
- in theory, more ws connections may mean more context switching overhead in `trio`, though it seems that would amount to doubting the OS's IO latency?
- i don't think more ws connections help resiliency much: if the actor thread dies we lose all the conns anyway, whereas a single ws dying only loses one symbol; still, there's no reason we can't have multiple ws connections each carrying all the subs. worth some thinking.
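To make the 60 FPS premise concrete, here's a minimal `trio` sketch (not piker code) of draining a quote channel at full speed while only waking the graphics side once per display frame; `update_graphics()` is a hypothetical callback into the `Qt` layer:

```python
import trio

DISPLAY_RATE = 60  # Hz; pushing graphics faster than this is wasted cycles

async def frame_rate_gated_updates(
    quote_rx: trio.MemoryReceiveChannel,
    update_graphics,  # hypothetical hook into the Qt graphics layer
):
    period = 1 / DISPLAY_RATE
    while True:
        # block until at least one fresh quote arrives
        latest = await quote_rx.receive()
        # then drain anything else already queued, without task switching
        try:
            while True:
                latest = quote_rx.receive_nowait()
        except trio.WouldBlock:
            pass
        update_graphics(latest)  # one redraw per frame, newest state only
        await trio.sleep(period)  # cap wakeups at the display rate
```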
Solution ideas to try
- [x] get a better serialization backend in `tractor`?
  - tried out `msgspec` in https://github.com/goodboy/tractor/pull/212 but the improvement is negligible (as would be expected for encode only atm)
- [ ] throttling at the `data.feed` layer?
  - i see no reason not to offer a throttle rate parameter to `open_feed()` that could then be used to limit the publisher-bus side send loop; in theory it saves bandwidth and eliminates N extra loops cycling where there are N UIs each needing such a rate limit (i.e. throttle once at the fan-out source); see the first sketch after this list
  - another idea could be to allow the `open_feed()` stream to be a new `tractor` 2-way stream and allow sending dynamic `off` / `on` messages to pause / resume the stream? I can't see this adding much latency and it should be relatively simple to implement (see the second sketch after this list)
    - this would avoid more fan-outs than necessary and thus fewer msgpack serializations and IPC connections in general
    - we need to move fsps to the `pikerd` tree potentially anyway to support long running fsp-triggered alerts / strats
    - the main tradeoff will be latency, which is in theory lower if you do all processing in parallel; though since we're already relying on time-multiplexed feed processing via single `brokerd` actors, this likely has little effect at small scale
    - implementation-wise this likely means wiring the `tractor` bidir apis into each provider's `stream_quotes()` routine and launching another task to handle inbound subscription change messages such as in the new `tractor` dynamic pub-sub examples
- [ ] try out multiple symbol subscriptions per websocket (see the last sketch after this list)
  - this is something we probably want anyway for down-the-road data feed resiliency strategies
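First, a sketch for the `open_feed()` throttle idea: a publisher-bus side send loop that forwards at most `tick_throttle` msgs/sec per subscriber, always keeping the newest quote. The `tick_throttle` parameter name and the channel/send hooks are hypothetical:

```python
import trio

async def throttled_send_loop(
    quote_rx,                      # upstream per-symbol quote channel (async iterable)
    ipc_send,                      # downstream send, e.g. an IPC stream's .send()
    tick_throttle: float = 60.0,   # hypothetical max msgs/sec per subscriber
):
    period = 1 / tick_throttle
    last_send = trio.current_time() - period
    async for quote in quote_rx:
        now = trio.current_time()
        if now - last_send >= period:
            await ipc_send(quote)
            last_send = now
        # else: drop this quote; at these rates a fresher one is
        # already on the way, so the subscriber never sees stale data
```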
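Second, a sketch of the dynamic `off` / `on` pause/resume idea over a 2-way stream. `ctrl_stream` stands in for a bidirectional `tractor` stream supporting `.send()` plus async iteration (hypothetical wiring); whether paused quotes should backpressure upstream (as here) or be dropped outright is an open design choice:

```python
import trio

async def pausable_publisher(ctrl_stream, quote_rx):
    gate = trio.Event()
    gate.set()  # stream starts live

    async def listen_for_ctl():
        nonlocal gate
        async for msg in ctrl_stream:
            if msg.get('op') == 'off':
                if gate.is_set():
                    gate = trio.Event()  # fresh unset event closes the gate
            elif msg.get('op') == 'on':
                gate.set()               # reopen; the send loop resumes

    async with trio.open_nursery() as n:
        n.start_soon(listen_for_ctl)
        async for quote in quote_rx:
            await gate.wait()            # parks here while paused
            await ctrl_stream.send(quote)
        n.cancel_scope.cancel()          # feed ended; tear down the listener
```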
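And last, for the multiple-subscriptions idea, a sketch of one binance ws connection carrying all symbol subs, using `trio-websocket` and binance's documented `SUBSCRIBE` msg format; the stream names and routing stub are illustrative only:

```python
import json
from trio_websocket import open_websocket_url

async def multi_symbol_feed(symbols: list[str]):
    # one conn, all subs (vs. one conn per symbol as currently)
    async with open_websocket_url('wss://stream.binance.com:9443/ws') as ws:
        # a single SUBSCRIBE msg can carry every symbol's stream name
        await ws.send_message(json.dumps({
            'method': 'SUBSCRIBE',
            'params': [f'{sym.lower()}@bookTicker' for sym in symbols],
            'id': 1,
        }))
        while True:
            msg = json.loads(await ws.get_message())
            # route by the payload's symbol field to per-symbol consumers;
            # note losing this one conn drops all symbols, hence the idea
            # above of running a few such conns with overlapping subs
            ...
```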