Closed dignifiedquire closed 1 week ago
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2924/docs/iroh/
Last updated: 2024-11-13T22:50:56Z
Netsim report & logs for this PR have been generated and is available at: LOGS This report will remain available for 3 days.
Last updated for commit: 9e1bb5b
the spawn blocking was introduced in #1091, do you or @flub who approved it remember why you thought it was necessary?
From my memory spawn_blocking was introduced as a knee-jerk reaction for some behaviour we didn't understand. I recall looking several times at this and wondering if the threads overhead really was better than this tiny bit of computation... but didn't dare to touch it as it was a response to an incident. I'm all for removing it, we can do better now (I mean both evaluating and responding to incidents as well as writing a scalable server).
the spawn blocking was introduced in #1091, do you or @flub who approved it remember why you thought it was necessary?
Oh boy that old code had like task::spawn > spawn_blocking > spawn > spawn_blocking > spawn ...
Im guessing we shouldn't have perf issues with blocked/timed out STUN requests given things are async + the handle_stun_request
is in it's own task. I assume the sock.send_to
will eventually time out?
Quoting dig from the depths of the discord archive, #1091 was introduced because:
I also added some potential improvements to the stun server to make it scale better here not totally sure how much sense it makes though to spawn a task for each incoming req
Flub:
no way that scaling is the issue right now
Dig:
yes, but I wanted to make sure it wasn't
Sounds like this was some fairly optional change.
merging, if this turns out to be a problem, we have better ways looking at them now as @flub pointed out
Description
This resulted in memory accumulation , due to blocking threads not getting shutdown, once started. According to the docs from tokio, there is no way to cleanup these tasks/threads, once they have been started.
From the docs: https://docs.rs/tokio/latest/tokio/task/fn.spawn_blocking.html
The referred to limit is
512
by default, meaning with the default thread stack size of2MiB
, we are looking at1GiB
of memory, simply for keeping around these threads.Looking at https://docs.rs/tokio/latest/tokio/runtime/struct.Builder.html#method.thread_keep_alive these threads should get cleaned up after
10s
, but this does not seem to happen. So there seems to be some unintended lingering to happen, in either case.Breaking Changes
None
Notes & open questions
This is unfortuante, but the parsing of stun packets should be fast enough.
Change checklist