Open killercup opened 5 years ago
Common issues and solutions around fast I/O
A topic that comes to mind is non-blocking reads/writes to stdin/stdout. I believe there might be crates for this in the Tokio family, but I have no idea of their quality / performance profiles. Perhaps this is a topic worth exploring? (not to only focus on this, but it's something that I've been wondering for a while now; perhaps it might be worth a mention).
how to profile CLI apps (not a full guide but good pointers)
Are you thinking tools such as perf(1)?
A topic that comes to mind is non-blocking reads/writes to stdin/stdout.
Oh, very good point! We should look into this.
I don't know what the tokio ecosystem has in store for stdout, but one of the design goals of convey is to be super easy to use in multi-threaded code. This includes making all writes async by performing them on a separate thread. (I haven't benchmarked this, however, as the current implementation is in an "it works, refactoring coming soon" stage.)
I think it might be interesting to see if we can write a quick benchmark comparing code with println in the same thread against code that instead sends a message to a thread that prints.
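A rough sketch of what such a comparison could look like, using only the standard library. The function names (`write_direct`, `write_via_thread`, `time`) and the line count are made up for illustration, `writeln!` on `io::stdout()` stands in for `println!`, and a single wall-clock measurement like this is only a starting point, not a proper benchmark.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Write `n` lines straight to `out` from the calling thread.
fn write_direct<W: Write>(out: &mut W, n: usize) -> io::Result<()> {
    for i in 0..n {
        writeln!(out, "line {}", i)?;
    }
    out.flush()
}

/// Send `n` lines over a channel to a dedicated thread that owns `out`.
/// Returns the writer so the caller can inspect or reuse it.
fn write_via_thread<W: Write + Send + 'static>(out: W, n: usize) -> io::Result<W> {
    let (tx, rx) = mpsc::channel::<String>();
    let printer = thread::spawn(move || -> io::Result<W> {
        let mut out = out;
        // The loop ends once every sender has been dropped.
        for line in rx {
            writeln!(out, "{}", line)?;
        }
        out.flush()?;
        Ok(out)
    });
    for i in 0..n {
        // A send error means the printer thread stopped early; its own
        // result, collected below, carries the underlying cause.
        if tx.send(format!("line {}", i)).is_err() {
            break;
        }
    }
    drop(tx); // close the channel so the printer thread can finish
    printer.join().expect("printer thread panicked")
}

/// Run `f` once and return its result together with the elapsed time.
fn time<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

fn main() -> io::Result<()> {
    let n = 100_000; // arbitrary line count chosen for illustration

    let (res, direct) = time(|| write_direct(&mut io::stdout(), n));
    res?;
    let (res, threaded) = time(|| write_via_thread(io::stdout(), n));
    res?;

    // Report on stderr so the timings are not mixed into the measured output.
    eprintln!("same thread:    {:?}", direct);
    eprintln!("printer thread: {:?}", threaded);
    Ok(())
}
```

Running it with stdout redirected to a file or /dev/null and again with a terminal attached should show how much the destination matters. Note that the threaded variant measures the total time until the printer thread has finished, not just how long the sending thread is blocked.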
Are you thinking tools such as perf(1)?
Yep! I've been meaning to look into ways to profile code easily, and cross-platform. E.g., I have saved links to this, this, and some tutorials on using Instruments.app and dtrace, but I've not found a tutorial that explains in 5min how to find the slow parts of a program (which may not even be possible, but I'd like to try at least).
but I've not found a tutorial that explains in 5min how to find the slow parts of a program (which may not even be possible, but I'd like to try at least).
I've got a flame(1) script just for this. It runs until the script it's running exits, then opens a flamegraph in your browser. Linux only tho.
usage
$ flame cargo bench # to profile `cargo bench`
flame.sh
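#!/bin/bash
# Profile a command with perf and open the result as a flamegraph.
set -x
# Sample call stacks at 99 Hz while the given command runs (writes ./perf.data).
perf record -F 99 -g "$@"
perf script > /tmp/out.perf
# Fold the stacks into one line per unique stack, then render the SVG.
stackcollapse-perf /tmp/out.perf > /tmp/out.folded
outfile="/tmp/$(date +%F-%T)-flamegraph.svg"
flamegraph /tmp/out.folded > "$outfile"
# Remove the intermediate files and open the graph.
rm perf.data /tmp/out.perf /tmp/out.folded
xdg-open "$outfile"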
This requires perf and perf-tools to be installed.
We might also want to mention https://github.com/sharkdp/hyperfine and https://github.com/ferrous-systems/flamegraph
A good solution I've found: if your program is going to call print!/println! many times, replacing those calls with write! and writing into a std::io::BufWriter wrapped around io::stdout batches the output into far fewer write syscalls (one per buffer flush rather than roughly one per line), which makes it a lot faster in most cases.
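A minimal sketch of that approach, assuming the program just emits numbered lines. `write_lines` and the line count are hypothetical, chosen only to keep the example self-contained.

```rust
use std::io::{self, BufWriter, Write};

/// Write `n` numbered lines to any writer.
fn write_lines<W: Write>(out: &mut W, n: usize) -> io::Result<()> {
    for i in 0..n {
        writeln!(out, "line {}", i)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Locking once avoids re-acquiring the stdout lock on every write.
    let mut out = BufWriter::new(stdout.lock());
    write_lines(&mut out, 100_000)?;
    // Flush explicitly so that write errors are reported instead of being
    // silently dropped when the BufWriter goes out of scope.
    out.flush()
}
```

Keeping the writing code generic over `Write` also makes it easy to test against an in-memory `Vec<u8>` instead of the real stdout.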
Inspired by this comment and rust-cli/team#29 I've been thinking about adding an in-depth chapter for performance considerations.
The structure would be something like this: