Closed kevin-vigor closed 2 years ago
Well, I suppose it was too much to hope for Microsoft to support C11 in 2021. I'll either disable multithreading on Windows or figure out how to wield _InterlockedAdd. But tomorrow.
Windows build succeeds; I have no ability to test it further.
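For readers wondering what "wielding" an interlocked intrinsic might look like, here is a minimal sketch of a counter that uses C11 atomics where available and an MSVC intrinsic otherwise. The names sketch_counter_t, sketch_counter_add and sketch_counter_load are hypothetical and not taken from this PR. The sketch also swaps in _InterlockedExchangeAdd from <intrin.h> instead of _InterlockedAdd, on the assumption that it is the more widely available intrinsic across MSVC targets.

```c
#if defined(_MSC_VER) && !defined(__clang__)
/* MSVC without C11 <stdatomic.h>: fall back to a compiler intrinsic. */
#include <intrin.h>

typedef volatile long sketch_counter_t; /* hypothetical name */

/* Atomically add delta and return the new value.
 * _InterlockedExchangeAdd returns the value *before* the add. */
static long sketch_counter_add(sketch_counter_t *c, long delta) {
    return _InterlockedExchangeAdd(c, delta) + delta;
}

/* Atomic read, expressed as "add zero and return the old value". */
static long sketch_counter_load(sketch_counter_t *c) {
    return _InterlockedExchangeAdd(c, 0);
}
#else
/* Everyone else: plain C11 atomics. */
#include <stdatomic.h>

typedef atomic_long sketch_counter_t; /* hypothetical name */

/* Atomically add delta and return the new value.
 * atomic_fetch_add also returns the value *before* the add. */
static long sketch_counter_add(sketch_counter_t *c, long delta) {
    return atomic_fetch_add(c, delta) + delta;
}

static long sketch_counter_load(sketch_counter_t *c) {
    return atomic_load(c);
}
#endif
```

Callers only see the two functions, so the Windows-specific part stays confined to one #if block. Whether this matches what the PR ended up doing is not something the thread confirms.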
Linux and MacOS builds tested and confirmed working.
I believe this is now reviewable.
Ok! I'll try to get to it this weekend, that soon enough?
Oh yeah, no rush! Thanks!
This looks nice! I've tried this out on a big (12GB extracted) squashfs rootfs, doing a simple grep benchmark. It's significantly faster than the master branch: almost a 40% reduction in runtime. However, the reason for that does not seem to have anything to do with threading, because when I restrict squashfuse to a single core, it's faster than with multiple cores (maybe thread pinning improves caching behavior?):
$ taskset -c 0 ./pe.sh /bin/bash -c "time grep 'lib' -r /opt | wc -l" # using 1 core
66023
real 0m12.485s
user 0m3.144s
sys 0m1.189s
$ ./pe.sh /bin/bash -c "time grep 'lib' -r /opt | wc -l" # using all cores
66023
real 0m14.098s
user 0m4.446s
sys 0m1.650s
The result on master looks like this:
$ taskset -c 0 ./pe.sh /bin/bash -c "time grep 'lib' -r /opt | wc -l"
66023
real 0m19.522s
user 0m3.170s
sys 0m1.198s
$ ./pe.sh /bin/bash -c "time grep 'lib' -r /opt | wc -l"
66023
real 0m21.025s
user 0m4.523s
sys 0m1.733s
The pe.sh script runs unshare, squashfuse without any flags on a squashfs file compressed with zstd, and chroot .. "$@".
Is this expected behavior? Is it maybe that this PR just sets some better defaults for squashfuse?
What is the status of this PR? It looks like #59 was merged a day after this was opened and that contained a lot of changes to the structure.
Is there a future for this PR or can someone else pick this up?
@haampie: with regard to your results above, note that squashfuse does not create parallelism; the only way to take advantage of multithreaded squashfuse is to issue multiple IO requests in parallel. "grep -r" is single-threaded and does not benefit from multithreaded squashfuse.
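To make "multiple IO requests in parallel" concrete, here is a minimal sketch of a client that reads one file with several threads at once, each issuing pread() calls on its own slice. It is not part of squashfuse; the names parallel_read, read_slice and struct slice are hypothetical, and it assumes a POSIX system with pthreads. Pointed at a file on a squashfuse mount, a client like this is the kind of workload that can keep more than one FUSE worker busy.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* One worker's share of the file: bytes [start, end). */
struct slice {
    int fd;
    off_t start;
    off_t end;
    long long bytes_read; /* filled in by the worker */
};

/* Thread body: read the assigned slice in 64 KiB requests. */
static void *read_slice(void *arg) {
    struct slice *s = arg;
    char buf[65536];
    off_t pos = s->start;
    s->bytes_read = 0;
    while (pos < s->end) {
        size_t want = sizeof buf;
        if ((off_t)want > s->end - pos)
            want = (size_t)(s->end - pos);
        /* pread takes an explicit offset, so threads can share one fd. */
        ssize_t got = pread(s->fd, buf, want, pos);
        if (got <= 0)
            break; /* error or unexpected EOF: stop this worker */
        pos += got;
        s->bytes_read += got;
    }
    return NULL;
}

/* Read `path` with `nthreads` concurrent readers.
 * Returns the total number of bytes read, or -1 on setup failure. */
static long long parallel_read(const char *path, int nthreads) {
    if (nthreads < 1)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    struct slice *slices = malloc((size_t)nthreads * sizeof *slices);
    if (fstat(fd, &st) != 0 || !tids || !slices) {
        free(tids);
        free(slices);
        close(fd);
        return -1;
    }
    /* Split the file into nthreads roughly equal slices. */
    off_t chunk = (st.st_size + nthreads - 1) / nthreads;
    int started = 0;
    for (int i = 0; i < nthreads; i++) {
        off_t start = chunk * i;
        if (start > st.st_size)
            start = st.st_size;
        off_t end = start + chunk;
        if (end > st.st_size)
            end = st.st_size;
        slices[i].fd = fd;
        slices[i].start = start;
        slices[i].end = end;
        slices[i].bytes_read = 0;
        if (pthread_create(&tids[i], NULL, read_slice, &slices[i]) != 0)
            break;
        started++;
    }
    long long total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += slices[i].bytes_read;
    }
    free(tids);
    free(slices);
    close(fd);
    return started == nthreads ? total : -1;
}
```

Calling parallel_read(path, 1) behaves like a sequential reader, so timing it against a larger thread count on the same mounted file gives a rough feel for how much concurrency the filesystem can actually exploit.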
ripgrep, on the other hand, is parallel by default. Using ripgrep as a test case I observe a significant improvement with this changeset in multithreaded mode.
(By "significant" I mean ~40% in some quick and dirty benchmarking. I wish it scaled linearly with CPUs or something magic like that, but alas, Amdahl's Law is still a thing.)
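For anyone who wants to put numbers on the Amdahl's Law remark, a small sketch of the formula follows. The function name amdahl_speedup is hypothetical, and the parallel fraction p is an input you would have to estimate yourself; nothing in this thread measures it for squashfuse.

```c
/* Amdahl's Law: best-case speedup with n workers when a fraction p
 * (between 0 and 1) of the total work can run in parallel and the
 * remaining (1 - p) stays serial. */
static double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

As a purely illustrative example, amdahl_speedup(0.5, 4) is 1.6, and no number of cores can push a half-serial workload past 2x, which is why linear scaling with CPU count is rarely seen.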
Replaced by new pull request #70 .
Once more, with feeling.