BurntSushi commented 5 years ago

I feel like a pretty common pitfall for beginning Rust programmers is to try writing a program that uses println! to print a lot of lines, compare its performance to a similar program written in Python, and be (rightly) baffled at the fact that Python is substantially faster. This occurred most recently here: https://www.reddit.com/r/rust/comments/bl7j7j/hey_rustaceans_got_an_easy_question_ask_here/emx3bhm/

This happens because io::Stdout unconditionally uses line buffering, regardless of whether it's being used interactively (e.g., printing to a console or a tty) or whether it's printing to a file. So if you print a lot of lines, you end up calling the write syscall for every line, which is quite expensive. In contrast, Python uses line buffering when printing interactively, and standard block buffering otherwise. You can see more details on this here and here.
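
To make the cost concrete, here is a minimal sketch that counts how many times the underlying sink is written to under each strategy. `CountingSink` and the two helper functions are hypothetical names introduced for illustration; each call to the sink's `write` stands in for one `write` syscall on a real file descriptor. It uses std's `LineWriter`, which is what `io::Stdout` wraps internally at the time of writing, and `BufWriter` with an assumed 64 KiB capacity.

```rust
use std::io::{self, BufWriter, LineWriter, Write};

/// Hypothetical sink that counts calls to `write`; each call stands in
/// for one `write` syscall on a real file descriptor.
#[derive(Default)]
struct CountingSink {
    writes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `lines` short lines through a `LineWriter` and returns how many
/// times the sink was written to.
fn line_buffered_writes(lines: usize) -> io::Result<usize> {
    let mut out = LineWriter::new(CountingSink::default());
    for i in 0..lines {
        writeln!(out, "line {i}")?;
    }
    out.flush()?;
    Ok(out.get_ref().writes)
}

/// Same workload through a `BufWriter` (capacity chosen arbitrarily).
fn block_buffered_writes(lines: usize) -> io::Result<usize> {
    let mut out = BufWriter::with_capacity(64 * 1024, CountingSink::default());
    for i in 0..lines {
        writeln!(out, "line {i}")?;
    }
    out.flush()?;
    Ok(out.get_ref().writes)
}

fn main() -> io::Result<()> {
    let n = 1_000;
    println!("line-buffered:  {} sink writes", line_buffered_writes(n)?);
    println!("block-buffered: {} sink writes", block_buffered_writes(n)?);
    Ok(())
}
```

The line-buffered count grows with the number of lines, while the block-buffered count depends only on how often the buffer fills.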

In my opinion, Rust should adopt the same policy as Python. Indeed, there is even a FIXME item for this in the code:

https://github.com/rust-lang/rust/blob/ef01f29964df207f181bd5bcf236e41372a17273/src/libstd/io/stdio.rs#L401-L404

I think this would potentially solve a fairly large stumbling block that folks run into. The CLI working group even calls it out as a performance footgun. And also here too. Additionally, ripgrep rolls its own handling for this.
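
As a rough sketch of what application-side handling can look like (this is not ripgrep's actual code), the snippet below picks a buffering strategy based on whether stdout is a terminal. It assumes `std::io::IsTerminal`, which was stabilized in Rust 1.70, after this comment was written. `AutoBuffered` is a hypothetical type introduced for illustration.

```rust
use std::io::{self, BufWriter, IsTerminal, LineWriter, Write};

/// Hypothetical wrapper: line-buffered when interactive, block-buffered otherwise.
enum AutoBuffered<W: Write> {
    Line(LineWriter<W>),
    Block(BufWriter<W>),
}

impl<W: Write> AutoBuffered<W> {
    fn new(inner: W, interactive: bool) -> Self {
        if interactive {
            Self::Line(LineWriter::new(inner))
        } else {
            Self::Block(BufWriter::new(inner))
        }
    }

    /// Borrow the wrapped writer (used here only to inspect what reached it).
    fn get_ref(&self) -> &W {
        match self {
            Self::Line(w) => w.get_ref(),
            Self::Block(w) => w.get_ref(),
        }
    }
}

impl<W: Write> Write for AutoBuffered<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        match self {
            Self::Line(w) => w.write(buf),
            Self::Block(w) => w.write(buf),
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        match self {
            Self::Line(w) => w.flush(),
            Self::Block(w) => w.flush(),
        }
    }
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Decide once, at startup, whether a person is likely watching the output.
    let interactive = stdout.is_terminal();
    let mut out = AutoBuffered::new(stdout.lock(), interactive);
    for i in 0..5 {
        writeln!(out, "line {i}")?;
    }
    // Flush explicitly so that errors are reported instead of being lost on drop.
    out.flush()
}
```

Note that the locked stdout handle still has std's own line buffering underneath; the outer `BufWriter` simply hands it large chunks instead of one line at a time.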

I can't think of too many appreciable downsides to doing this. It is a change in behavior. For example, if you wrote a Rust program today that printed to io::Stdout, and the user redirected the output to a file, then the user could (for example) tail that output and see it updated as each line was printed. If we made io::Stdout use block buffering when printing to a file like this, then that behavior would change. (This is the reasoning for flags like --line-buffered on grep.)

cc @rust-lang/libs

BurntSushi commented 5 years ago

cc @killercup @kbknapp as other folks that might have opinions here.

sfackler commented 5 years ago

If we're worried about regressing people that are depending on it being line buffered, we could minimally have methods on Stdout/Stderr to switch it between line and block buffering.

alexcrichton commented 5 years ago

FWIW I personally continue to feel that we can do this at any time (change libstd's buffering strategy on non-TTY stdout/stderr streams) and I agree with @sfackler that if breakage arises we can work around it with methods and such.

kbknapp commented 5 years ago

I would be very much in favor of at least the minimal route of giving Stdout/Stderr the option to switch between line and block buffering.

Another slightly less minimalist approach is to use block buffering by default for print!("..") and to prominently document the characteristics of both macros. The downside is that changing println!("..") calls to print!("..\n") is a multi-cursor movement. A different approach is to add a Pythonesque opt-in version of println! (taking the exact syntax purely as an example: println!(buf=true, "..")), which I believe could be done in a backwards-compatible way and isn't a multi-cursor movement.

In general I'd like to switch wholesale, as it's one of the very common footguns I see.

Lonami commented 5 years ago

If we're worried about regressing people that are depending on it being line buffered […]

I wouldn't worry about this unless the documentation explicitly states the current behaviour (e.g. always line-buffered). If it's not documented, it's like relying on implementation details (which are subject to change).

BurntSushi commented 5 years ago

I don't think we specify the behavior. But even if we don't, and we want to make this change (it sounds like folks agree we should), we should go into it while being considerate of behavioral changes to existing code. The letter of the law is important, but so is the spirit.

canadaduane commented 5 years ago

Just to document further agreement with @BurntSushi that this is a common pitfall--here I am, a new user, doing it today, and asking around for help :)

https://users.rust-lang.org/t/why-is-this-rust-loop-3x-slower-when-writing-to-disk/30489

Lokathor commented 4 years ago

I'd be most in favor of simply a method to switch to block buffer mode. It's something that keeps the default case simple, and if you notice performance is bad you can opt into specific behavior. The same idea as being able to lock stdout manually to avoid repeatedly locking it.

I would be against trying to auto-detect the program mode and then using that to decide. Particularly, I absolutely want my interactive programs to be able to use block output and manual flushing.

Lucretiel commented 4 years ago

@Lokathor what about doing both? I think it's a common (and reasonable) default behavior of many other languages to do line-buffered on a terminal, and block-buffered otherwise. We could do the same, but then also add something like:

impl Stdout {
    // These functions do not cause any flushes or i/o interaction of any kind;
    // they simply set a flag that is consulted on each call to `write`. So,
    // transitioning to line_buffered wouldn't try to flush existing unflushed
    // lines until more writes come in (or a manual flush(), obviously).
    fn force_line_buffered(&mut self);
    fn force_block_buffered(&mut self);
}

I'm interested in tackling an implementation for this; would a change like this be considered significant enough that I should write an RFC for it first, to hash out the specific details, or could I write a draft PR and have the discussion take place in there?

Lokathor commented 4 years ago

I'm not on any team, but asking on the Rust Zulip for T-Libs might be your best starting place.

Lucretiel commented 4 years ago

I've started an implementation of this; I'll tag it in the relevant Pull Requests as I file them.

calebstewart commented 3 years ago

I'm not sure if this is the right place, but I stumbled on this issue when trying to find a way to disable line-buffering on stdout. I see there was a pull-request recently merged and some talk above about possibly adding a method to disable/enable line-buffering. I tried to go through the pull request but didn't completely follow what was added. Can anyone give me a run-down of if/how this was resolved? Currently, I've solved my issue by doing File::from_raw_fd(1) to get a non-line-buffered stdout stream, but this is platform dependent. Platform independence isn't a strict requirement of my project, so it's not the worst thing, but if there's now a way to disable line-buffering, I'd love to use a solution that doesn't depend on Unix conventions explicitly. Thanks! :)
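
For readers following along, the workaround described above might look roughly like the sketch below. It is Unix-only and rests on the assumption that file descriptor 1 is open and stays open for the life of the process. `raw_stdout` is a hypothetical helper name; the `ManuallyDrop` wrapper is an addition here so that dropping the `File` does not close fd 1.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::mem::ManuallyDrop;
use std::os::unix::io::FromRawFd;

/// Unix-only: view file descriptor 1 as a `File` with no userspace buffering.
fn raw_stdout() -> ManuallyDrop<File> {
    // SAFETY (assumption): fd 1 is open and remains open while this value is
    // in use. `ManuallyDrop` keeps `File`'s destructor from closing it.
    ManuallyDrop::new(unsafe { File::from_raw_fd(1) })
}

fn main() -> io::Result<()> {
    let mut out = raw_stdout();
    // Each write goes straight to the descriptor, bypassing std's buffer.
    out.write_all(b"raw bytes straight to fd 1\n")?;
    Ok(())
}
```

Because this bypasses the buffer inside `io::Stdout`, output written this way can appear out of order relative to `print!` output that is still sitting in that buffer.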

Lucretiel commented 3 years ago

I've been working on resolving this over most of the summer. If you're referring to #72808, that PR is entirely preliminary; it refactors the design of LineWriter to allow for a future implementation of switchable buffering behavior.

Right now, there's no way to fully disable buffering on stdout. However, if you want to use block buffering, you can still wrap the Stdout or StdoutLock object in a BufWriter, which, when flushed, will send all the buffered data to the stdout device at once.
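
A minimal sketch of that suggestion, using only stable std APIs. `write_lines` is a hypothetical helper, made generic over the writer so that the same code works with a `StdoutLock` or any other `Write` implementation.

```rust
use std::io::{self, BufWriter, Write};

/// Writes `n` lines to `out` through a block buffer, then flushes once.
fn write_lines<W: Write>(out: W, n: usize) -> io::Result<()> {
    let mut out = BufWriter::new(out);
    for i in 0..n {
        writeln!(out, "line {i}")?;
    }
    // Flush explicitly: `BufWriter` also flushes on drop, but errors
    // encountered there are silently ignored.
    out.flush()
}

fn main() -> io::Result<()> {
    // Locking once also avoids re-acquiring the stdout lock for every line.
    write_lines(io::stdout().lock(), 1_000)
}
```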

It's worth noting that, unless you are manually sending byte slices to stdout, you almost certainly don't want unbuffered stdout. print! and all the other formatted write utilities work by performing numerous tiny writes (several for each component of the formatted content); if these are performed directly on an I/O device, your performance will seriously suffer.
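
The effect of those tiny writes can be observed with a counting sink. This is only an illustrative sketch: `CountingSink` and the two helper functions are hypothetical names, and the exact number of writes issued by the formatting machinery is an implementation detail of std, so no specific count is claimed for the unbuffered case.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that counts calls to `write`.
#[derive(Default)]
struct CountingSink {
    writes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Formats one line directly into the sink, with no buffer in between.
fn unbuffered_writes(a: u32, b: u32) -> io::Result<usize> {
    let mut sink = CountingSink::default();
    writeln!(sink, "{a} + {b} = {}", a + b)?;
    Ok(sink.writes)
}

/// Formats the same line through a `BufWriter`, then flushes once.
fn buffered_writes(a: u32, b: u32) -> io::Result<usize> {
    let mut out = BufWriter::new(CountingSink::default());
    writeln!(out, "{a} + {b} = {}", a + b)?;
    out.flush()?;
    Ok(out.get_ref().writes)
}

fn main() -> io::Result<()> {
    println!("unbuffered: {} sink writes", unbuffered_writes(2, 3)?);
    println!("buffered:   {} sink writes", buffered_writes(2, 3)?);
    Ok(())
}
```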

calebstewart commented 3 years ago

I appreciate your response. I assume this is a heavy lift, so not trying to be annoying or rush anyone, haha. What you explained makes sense, and I'll continue to track the progress moving forward. Thanks for all the hard work!

I am in fact in that very small edge case of writing byte slices to stdout, but I appreciate the heads up! It's a very good point and true in 99.99% of cases.

Lucretiel commented 3 years ago

Proposed implementation of switchable stdout buffering: https://github.com/rust-lang/rust/pull/78515. This should be the second-to-last PR towards the fulfillment of this issue; after that, it's only a matter of actually adding code to detect stdout's environment (tty or not) and correctly init the buffer mode.

the8472 commented 1 year ago

for reference, https://github.com/rust-lang/rust/pull/78515#issuecomment-1168362639 has the API team's last position on how this issue should be approached.

RalfJung commented 1 year ago

I'm a bit confused by the current status here. The issue is still open, and hence I assume it is unresolved. But we also have this comment in the standard library:

https://github.com/rust-lang/rust/blob/e7ef5d86197954d5676be97a8efc09a99ca423fe/library/std/src/io/buffered/linewritershim.rs#L10-L12

That to me sounds like there is some logic that would make "Stdout [...] be alternately in line-buffered or block-buffered mode". But there appears to be no such logic. Am I misunderstanding the comment?

the8472 commented 1 year ago

Lucretiel was refactoring the buffering logic with the goal to enable that. But the work stalled, see the comment linked above. So now a good chunk of the capability is in std but it's not exposed.

Lucretiel commented 1 year ago

Correct. LineWriterShim enables all of the necessary logic to allow a BufWriter to temporarily opt in to line-buffering behavior (using the buffer it already has). Remaining work involves figuring out the exact shape that should take (my own proposal was an internal SwitchWriter type, which contains a BufWriter and a mode), along with what (if any) new public APIs should be added for controlling the mode.
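
To illustrate the general idea of a writer that holds a buffer and a mode, here is a simplified sketch. It is not the std-internal design: `SwitchWriter`, `BufferMode`, and `set_mode` are hypothetical names, and unlike LineWriterShim this version flushes the whole buffer (including any partial trailing line) whenever a write in line mode contains a newline.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical buffering mode flag.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum BufferMode {
    Line,
    Block,
}

/// Hypothetical writer that consults `mode` on every write.
struct SwitchWriter<W: Write> {
    inner: BufWriter<W>,
    mode: BufferMode,
}

impl<W: Write> SwitchWriter<W> {
    fn new(inner: W, mode: BufferMode) -> Self {
        Self { inner: BufWriter::new(inner), mode }
    }

    /// Only sets the flag; performs no I/O and does not flush.
    fn set_mode(&mut self, mode: BufferMode) {
        self.mode = mode;
    }

    fn get_ref(&self) -> &W {
        self.inner.get_ref()
    }
}

impl<W: Write> Write for SwitchWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        // Simplification: flush everything buffered so far, not just
        // the bytes up to the last newline.
        if self.mode == BufferMode::Line && buf[..n].contains(&b'\n') {
            self.inner.flush()?;
        }
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() -> io::Result<()> {
    let mut out = SwitchWriter::new(io::stdout().lock(), BufferMode::Block);
    writeln!(out, "buffered until a flush or a mode-triggered write")?;
    out.set_mode(BufferMode::Line);
    writeln!(out, "this newline triggers a flush of both lines")?;
    out.flush()
}
```

Switching modes deliberately leaves already-buffered data alone until the next write or an explicit flush, matching the semantics proposed earlier in the thread.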

RalfJung commented 1 year ago

Okay I see, thanks. The comment is misleading then since it strongly implies that Stdout does something that it doesn't do.

the8472 commented 11 months ago

@m-ou-se, from Zulip:

i have an actual rust program running here that prints the energy consumption of my home every 10 seconds, which is tee'd into a log file and to a terminal. that program would work fine when run directly on the terminal, and suddenly seem to fully hang when piped through tee or similar.

AFAICT such an invocation depends on unspecified behavior: it's not documented anywhere that stdout is line-buffered. The docs say it's buffered, but not which buffering strategy it uses. And, as the opening comment says, people are often surprised that stdout is not performant, which is more subtle to notice than output not appearing.

Since the behavior can't be perfect for all uses and it's unspecified, it's mostly a matter of:

- expectations
- defaults
- providing ways for the user to override if the defaults don't work for them

expectations

Here we can check prior art:

- glibc's FILE* stdout is line-buffered for ttys, block- or unbuffered otherwise
- Python's stdout and stderr switch depending on isatty. But there's a global override, and print also has an optional flush argument, which makes it easy to get data out immediately when needed.
- Java's System.out is always line-buffered; FileDescriptor.out can be used to obtain unbuffered or block-buffered behavior.
- Go's os.Stdout is unbuffered since it's a simple file descriptor wrapper
- nodejs is just wild
- uutils' cat tries to write blocks to stdout, but when it can't splice, this still goes through LineWriter, which scans each written slice for newline bytes

defaults

- keep the status quo: stdout is always line-buffered at process startup. If we decide that this is important enough to keep because people rely on it, then perhaps we should guarantee it?
- initialize Stdout based on IsTerminal
  - as above, but add more heuristics such as seeing if a process has a controlling terminal (and thus a user that might be looking at things) or is part of a foreground process group
- keep Stdout line-buffering but make StdoutLock block-buffered

Some other approaches of varying seriousness that also came to mind while writing this:

- switch to a more selective line-buffering strategy based on tty-ness: only flush if the last byte of a write() or write_all() call is a \n. If the caller doesn't care about doing things line-by-line, then we don't care either. This avoids splitting up binary data that happens to contain some 0x0A in random places.
- never line-buffer stdout by default but make println! call flush() instead; writeln! could be the non-flushing alternative.
- ~~implement nagle's algorithm~~
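
The first idea in the list above can be sketched as a thin wrapper around `BufWriter`. This is an illustration only, not a proposed std design: `TrailingNewlineFlush` is a hypothetical name, and the flush check is applied per `write` call, which is a simplification of "per write() or write_all() call".

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical writer that flushes only when a write *ends* with `\n`,
/// so newline bytes in the middle of a chunk do not force a flush.
struct TrailingNewlineFlush<W: Write> {
    inner: BufWriter<W>,
}

impl<W: Write> TrailingNewlineFlush<W> {
    fn new(inner: W) -> Self {
        Self { inner: BufWriter::new(inner) }
    }

    fn get_ref(&self) -> &W {
        self.inner.get_ref()
    }
}

impl<W: Write> Write for TrailingNewlineFlush<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        // Flush only if the whole slice was accepted and its last byte is `\n`.
        if n == buf.len() && buf.last() == Some(&b'\n') {
            self.inner.flush()?;
        }
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() -> io::Result<()> {
    let mut out = TrailingNewlineFlush::new(io::stdout().lock());
    // An embedded newline with no trailing newline: stays in the buffer.
    out.write_all(b"first\nsecond")?;
    // Ends with a newline: everything buffered so far is flushed.
    out.write_all(b" line\n")?;
    out.flush()
}
```

Line-oriented callers still see their output promptly, while bulk writers of binary data are not penalized for stray 0x0A bytes.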

overrides

This one is simple on the surface, we only need two configurables:

- line-buffered on/off
- buffer size (0 = unbuffered)

Though that'd still leave the question of where to put those APIs: in Stdout, LineWriter, or BufWriter. Refer to https://github.com/rust-lang/rust/pull/78515#issuecomment-1168362639.

io::Stdout should use block bufferring when appropriate #60673
