`BufReader.lines` over `duct::cmd().reader()` never exits if process exited with non-0

hasezoey commented 1 year ago

I have noticed that with duct, if the output is read via a ReaderHandler and wrapped in a BufReader for the .lines() iterator, it never exits when the underlying process exits with a non-0 exit code

reproduction code:

let reader = duct::cmd("sh", ["-c", "exit 1"]).reader()?;

let stdout_reader = BufReader::new(reader);

for v in stdout_reader.lines() {
  println!("LINE {:#?}", v); // prints infinite "Line Err(Custom{..."command...exited with code 1"})"
}

// code below never gets run

whereas commands spawned with std::process::Command end the iterator

let reader = std::process::Command::new("sh")
            .args(["-c", "exit 1"])
            .stdout(Stdio::piped())
            .spawn()?;

let stdout_reader = BufReader::new(ytdl_child.stdout.expect("test stdout"));

for v in stdout_reader.lines() {
  println!("LINE {:#?}", v); // never gets run because of no output
}

// code gets run

duct 0.13.6 rust 1.70.0 Linux

oconnor663 commented 1 year ago

I would argue that this is mostly a feature :) As you've commented, the item type of the Lines iterator is Result<...>, and when one of those results is an Err, your program needs to respond to that error somehow. If you don't want to return immediately with ? or .unwrap(), then you should probably break out of your loop when you see errors.

I could see an argument that it might be nicer for ReaderHandle to return Err from read only once, and then to return Ok(0) after that. That would make your loop above terminate after printing one error. For the "command exited with error code" error, I think this would be pretty reasonable. (If a caller didn't care about that error the first time, they're clearly not going to care the second time.) However, that's not the only possible error path. These reads are ultimately coming out of a pipe, and a pipe read could potentially return any io::Error. Some errors are retryable (particularly ErrorKind::Interrupted), and returning a "fake" Ok(0) to a caller that's retrying after an interruption would be incorrect. I think if we pursued this, we'd wind up with an inconsistent contract. We'd promise that some errors will lead to Ok(0) on retry, but not all errors. Your example above would start working -- because the specific error you're hitting would definitely be one of the blessed ones -- but the code would arguably still be incorrect, because other more obscure error conditions could still turn it into an infinite loop.

Does that sound right?

hasezoey commented 1 year ago

then you should probably break out of your loop when you see errors.

that is basically what i am doing now (before it was just a lines().filter_map(|line| return line.ok()), but now it is checked inside the loop)

Does that sound right?

yes it makes it clearer now

Some errors are retryable (particularly ErrorKind::Interrupted), and returning a "fake" Ok(0) to a caller that's retrying after an interruption would be incorrect. I think if we pursued this, we'd wind up with an inconsistent contract

how does std::process::Command handle that then? (i am not familiar with either duct or the std's implementation)

i dont know for how duct will handle this issue, but i think other people will run into it so to have this issue is good for problem searching. i also reported this because the std Command implementation did not infinitely loop and though this was a problem

oconnor663 commented 1 year ago

how does std::process::Command handle that then?

The biggest difference is that std::process doesn't consider a non-zero exit status from a child to be an Err. Another difference is that std::process doesn't generally do automatic waiting/reaping. Duct does consider a non-zero exit status to be an Err by default, and it does do some automatic waits, particularly in this case when ReaderHandle reaches EOF (see the ReaderHandle docs).

For both of those reasons, iterating over stdout_reader.lines() in the std::process example above doesn't encounter any errors. It's possible that errors could occur there, if you found yourself in some situation where reading from the pipe was broken, but that's very unlikely in practice. Here's a contrived example of triggering such a situation. This program also prints errors in an infinite loop:

use std::io::prelude::*;
use std::io::BufReader;
use std::os::fd::AsRawFd;
use std::process::Stdio;

fn main() {
    let child = std::process::Command::new("sh")
        .args(["-c", "exit 1"])
        .stdout(Stdio::piped())
        .spawn()
        .unwrap();

    // Reach in and close the read end of the child's stdout pipe. This will cause all the reads in
    // the loop below to return errors.
    unsafe {
        libc::close(child.stdout.as_ref().unwrap().as_raw_fd());
    }

    let stdout_reader = BufReader::new(child.stdout.expect("test stdout"));

    for v in stdout_reader.lines() {
        println!("LINE {:#?}", v);
    }
}

oconnor663 / duct.rs

`BufReader.lines` over `duct::cmd().reader()` never exits if process exited with non-0 #112