lynxthecat / cake-autorate

Eliminate the excess latency and jitter terrorizing your 4G, 5G, LTE, Starlink or other variable rate connection!
https://www.bufferbloat.net
GNU General Public License v2.0
269 stars 24 forks source link

is that actually relevant #174

Closed moeller0 closed 1 year ago

moeller0 commented 1 year ago

https://github.com/lynxthecat/cake-autorate/blob/948003e3dbebd7276933452790f36697466bcda7/cake-autorate_lib.sh#L85 Having had a look at the data, the read version is only slightly faster than the sleep module so from a code hygene and cleanliness perspective I certainly would prefer the module. However, on OpenWrt that module is made out of unobtainium unless we want to create and maintain a new bash-modules package. So the read versions main convincing argument is that it is: a) considerable faster than external sleep (has this actually been tested?) b) does not require to install the non-busybox sleep binary.

So this is mostly a request to change the comment to make it clear that module-sleep is mostly hypothetical and hence not a serious alternative.

rany2 commented 1 year ago

Hi, I've compared the sleep bash module and our version. Ours is faster by some magin, I can't explain why it's the case but I was expecting the bash module to be much faster somehow.

rany2 commented 1 year ago

a) considerable faster than external sleep (has this actually been tested?)

Yes and the results were posted in the openwrt forum. We do in fact have benchmarks for it.

moeller0 commented 1 year ago

Well, then add a link to that data, I can not find it with the forum search (https://forum.openwrt.org/t/cake-w-adaptive-bandwidth/135379/2195?u=moeller0 compares iputils-sleep with the read-method). My understanding is, going from external sleep to the read solution improves speed/overhead by one order of magnitude, base10, see: https://blog.dhampir.no/content/sleeping-without-a-subprocess-in-bash-and-how-to-sleep-forever but that does not tell us how the sleep-module fares against the read method. Yet the cpmment claims the read hack to be superior to both alternatives, and if you have data to show that you see at least an order of magnitude base2 difference that argument is fine. Why do I bring this up? Because I see segmentation faults in the read-method sleep function and honestly that does not fill me with confidence. And I prefer somewhat slower but no segmentation faults over the alternative. Now I am not using the current version, so this is not an actionable bug report for you, just an aside that rolling one's own replacement for time tested unix tools can be harder than one would think.

rany2 commented 1 year ago

I didn't write the comment myself but I compared the read hack to external and the bash module. The difference between it and external coreutils sleep was significant but for bash loadable sleep it was negligible but consistently slower than the read hack.

moeller0 commented 1 year ago

I didn't write the comment myself but I compared the read hack to external and the bash module.

OK, that was not obvious from the posted comment in the forum.

The difference between it and external coreutils sleep was significant but for bash loadable sleep it was negligible but consistently slower than the read hack.

And that is simply not a good enough rationale to roll one's own IMHO. My gut feeling is that neither iputils-sleep not builtin-sleep would segfaulted on me... our current sleep implementation is not checking its inputs in a meaningful way, well possible that the segmentation faults are a simple reaction on ill-formed inputs, I hypothesize that the real sleep alternatives are simply more tolerant to unexpected inputs. But again not a bug report and not your problem to fix. But as I said I accept the argument that bash-builtins are simply not available in OpenWrt.

moeller0 commented 1 year ago

Quick update, my testing over there: https://forum.openwrt.org/t/cake-w-adaptive-bandwidth/135379/2602?u=moeller0 seems to show the bash builtin come out as significantly faster/more precise than both alternatives:

moeller@work-horse:/usr/lib/bash$ which sleep
/usr/bin/sleep
moeller@work-horse:/usr/lib/bash$ time for ((i=1; i<=10000; i++)); do sleep 0.000000001; done

real    0m9.616s
user    0m6.666s
sys 0m2.983s
moeller@work-horse:/usr/lib/bash$ time for ((i=1; i<=10000; i++)); do read -t 0.000000001 <><(:); done

real    0m2.496s
user    0m5.727s
sys 0m2.382s
moeller@work-horse:/usr/lib/bash$ enable sleep
bash: enable: sleep: not a shell builtin
moeller@work-horse:/usr/lib/bash$ enable -f ./sleep sleep
moeller@work-horse:/usr/lib/bash$ enable sleep
moeller@work-horse:/usr/lib/bash$ time for ((i=1; i<=10000; i++)); do sleep 0.000000001; done

real    0m0.045s
user    0m0.045s
sys 0m0.000s
rany2 commented 1 year ago

Just so that we don't confuse ourselves in the future. Our version is actually really close to the bash module, the above benchmark has a flaw in that it spawns a subshell each time. We actually open the subshell once and keep reusing its fd. The more accurate results (from https://forum.openwrt.org/t/cake-w-adaptive-bandwidth/135379/2606?u=rany):

moeller@work-horse:/usr/lib/bash$ exec {fd}<><(:)
moeller@work-horse:/usr/lib/bash$ time for ((i=1; i<=10000; i++)); do read -t 0.000000001 -u $fd; done

real    0m0.064s
user    0m0.050s
sys 0m0.014s
moeller@work-horse:/usr/lib/bash$ time for ((i=1; i<=10000; i++)); do sleep 0.000000001; done

real    0m0.043s
user    0m0.042s
sys 0m0.002s
moeller@work-horse:/usr/lib/bash$
moeller0 commented 1 year ago

Our version is actually really close to the bash module

Yes, but that is my argument, the two solutions are close enough that apparently depending on the hardware one or the other has a slight edge, the comment however makes it appear as if the read-method comes out fastest unconditionally.

Plus the bash module actually does some input checking... the read method will accept negative numbers....

lynxthecat commented 1 year ago

I write about this back in the historical thread here:

https://forum.openwrt.org/t/cake-w-adaptive-bandwidth-historic/108848/2819?u=lynx

rany2 commented 1 year ago

the read method will accept negative numbers

It doesn't though:

$ read -t -1 
bash: read: -1: invalid timeout specification
lynxthecat commented 1 year ago

When I implemented it I also found the bash modules were not available in OpenWrt and also read link that stated for some reason the read method was faster than the bash builtin anyway. It seems like the latter turned out to be correct (I didn't test it at the time).

I take the point that there could be unwanted side effects here, but since the bash builtins are not available and because calling external binary is slow it is academic anyway.

rany2 commented 1 year ago

also read link that stated for some reason the read method was faster than the bash builtin anyway

I actually tested it myself and it turned out to be true in my case. My best guess is that I probably messed up building bash-modules, maybe I didn't set optimization flags...

moeller0 commented 1 year ago

when I ran:

time for ((i=1; i<=10000; i++)); do read -t -0.000000001 -u $fd; done

I got no error message at all while

time for ((i=1; i<=10000; i++)); do sleep -0.000000001; done

complained every iteration.... that was on ubuntu22LTS with current updates.

So no, these two are not identical in scope and error checking. And our sleep wrappers only gently improve that.

rany2 commented 1 year ago

@moeller0 Interesting, I could confirm this. This seems like a bug, it only happens in -1,0 range excluding -1

lynxthecat commented 1 year ago

By the way can't we just use read -t without the file descriptor? That works for me?

rany2 commented 1 year ago

@lynxthecat the whole point of the fd is that we don't want any input to stdin (or if stdin is closed) to impact us.

moeller0 commented 1 year ago

you could try read -t - < /dev/zero

rany2 commented 1 year ago

@moeller0 wouldn't that waste a ton of CPU and memory? It will need to store all these \0 into the buffer and process them constantly. Seems like the worst option :P

rany2 commented 1 year ago

@moeller0 As expected it uses up 100% of the CPU but oddly enough no memory spike, presumably \0 is treated in a special manner and gets ignored (I don't think you could have \0 in bash variables anyway, you'd need to encode them as hex and then use printf to use them)