rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

`std::fs::canonicalize` fails to resolve `/dev/stdin` connected to pipe #95239

Open crazystylus opened 2 years ago

crazystylus commented 2 years ago

When a program canonicalizes /dev/stdin, it should resolve to a tty, pipe, or file, according to the configured redirection. I tried this code and attached a pipe to stdin:

use std::fs;

fn main() {
    let resolved_path = fs::canonicalize("/dev/stdin").unwrap();
    println!("{}", resolved_path.into_os_string().into_string().unwrap());
}
echo " " | cargo run

I expected to see this happen: /dev/stdin canonicalizes to /proc/14063/fd/pipe:[127582]

Instead, this happened:

    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/canon-issue`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:4:56
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Meta

rustc --version --verbose:

rustc 1.59.0 (9d1b2106e 2022-02-23)
binary: rustc
commit-hash: 9d1b2106e23b1abd32fce1f17267604a5102f57a
commit-date: 2022-02-23
host: x86_64-unknown-linux-gnu
release: 1.59.0
LLVM version: 13.0.0
Backtrace

```
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/canon-issue`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:4:56
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

Same thing in Python 3.9.7

import os
print(os.path.realpath("/dev/stdin"))
echo " " | python3 code.py

This is the result

/proc/16616/fd/pipe:[162072]

Urgau commented 2 years ago

I can reproduce the issue, however I'm not sure this is a bug. When I do echo " " | ls -lh --color /proc/self/fd/ I can see that 0 (stdin) is a symlink to pipe:[XXXXXXX], but that target doesn't exist, so the symlink is broken, which in turn leads to the not found error.

And according to the documentation of realpath(3) (the libc call done by std), the function should return ENOENT when "The named file does not exist", which is exactly what's happening here. This is the expected behavior.

Python, however, doesn't use libc realpath but a custom implementation that isn't "strict" by default, leading it to return a completely broken path. I tried to activate the strict code path of the implementation, but it doesn't seem possible from the API.
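The broken-symlink behavior described above can be reproduced without a pipe. The following is a minimal sketch for Unix targets, not a statement about how std is implemented internally: it creates a symlink whose target does not exist and compares `fs::read_link` with `fs::canonicalize`. The helper name `probe_dangling_link` and the target name `pipe:[12345]` are made up for illustration.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Creates a symlink in `dir` whose target does not exist and returns
/// (what `read_link` reports, what `canonicalize` reports).
/// Hypothetical helper, for illustration only.
fn probe_dangling_link(dir: &Path) -> io::Result<(PathBuf, io::Result<PathBuf>)> {
    let link = dir.join(format!("dangling-{}", std::process::id()));
    let _ = fs::remove_file(&link); // ignore leftovers from an earlier run

    // The target name is invented; nothing with this name exists in `dir`.
    symlink("pipe:[12345]", &link)?;

    // read_link only reads the text stored in the link; it does not follow it.
    let target = fs::read_link(&link)?;
    // canonicalize has to resolve every component, including the last one.
    let resolved = fs::canonicalize(&link);

    fs::remove_file(&link)?;
    Ok((target, resolved))
}

fn main() -> io::Result<()> {
    let (target, resolved) = probe_dangling_link(&std::env::temp_dir())?;
    println!("read_link:    {}", target.display());
    match resolved {
        Ok(path) => println!("canonicalize: {}", path.display()),
        Err(err) => println!("canonicalize: error ({:?})", err.kind()),
    }
    Ok(())
}
```

On a dangling link, `read_link` succeeds while `canonicalize` reports `NotFound`, which mirrors the `/dev/stdin` case in this issue.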

crazystylus commented 2 years ago

Once the ls command finishes executing, the pipe is deleted and you won't be able to find it. If you leave Python aside, you can try the same in bash. This pipe will exist until the readlink command finishes executing.

test@pop-os:~$ echo " " | readlink -f /dev/stdin
/proc/30560/fd/pipe:[289651]

Urgau commented 2 years ago

No, a broken symlink pointing to pipe:[XXXXXXX] exists, but because it's invalid (in this case broken), trying to access it results in a not found error.

If you leave Python aside, you can try the same in bash. This pipe will exist until the readlink command finishes executing.

The man page of readlink states that:

-f, --canonicalize
              canonicalize  by following every symlink in every component of the given name recursively; all but the
              last component must exist

meaning that it behaves similarly to the Python implementation, in that it doesn't check that pipe:[XXXXXXX] actually exists, contrary to the libc::realpath function.
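For comparison, the "all but the last component must exist" rule can be approximated in Rust on top of `fs::canonicalize` and `fs::read_link`. This is only a rough sketch under that reading of the man page, not the actual algorithm used by readlink or Python; the function name `canonicalize_lenient` and the limit `MAX_LINKS` are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Arbitrary bound on how many links are followed by hand (assumption).
const MAX_LINKS: usize = 40;

/// Like `fs::canonicalize`, but tolerates a final component that does not exist.
/// Hypothetical helper sketching the lenient behavior discussed above.
fn canonicalize_lenient(path: &Path) -> io::Result<PathBuf> {
    let mut current = path.to_path_buf();
    for _ in 0..MAX_LINKS {
        // Try the strict resolution first.
        match fs::canonicalize(&current) {
            Ok(resolved) => return Ok(resolved),
            Err(err) if err.kind() != io::ErrorKind::NotFound => return Err(err),
            Err(_) => {}
        }
        // Strict resolution failed: resolve the parent strictly and
        // handle only the last component by hand.
        let name = match current.file_name() {
            Some(name) => name.to_os_string(),
            None => return Err(io::Error::new(io::ErrorKind::NotFound, "no final component")),
        };
        let parent = match current.parent() {
            Some(p) if !p.as_os_str().is_empty() => fs::canonicalize(p)?,
            _ => fs::canonicalize(".")?,
        };
        let candidate = parent.join(name);
        match fs::read_link(&candidate) {
            // A link: continue from its target (an absolute target replaces the parent).
            Ok(target) => current = parent.join(target),
            // Missing or not a link: accept it as the unvalidated last component.
            Err(_) => return Ok(candidate),
        }
    }
    Err(io::Error::new(io::ErrorKind::Other, "too many levels of symbolic links"))
}

fn main() -> io::Result<()> {
    // With stdin attached to a pipe on Linux, this prints a path ending in pipe:[...].
    println!("{}", canonicalize_lenient(Path::new("/dev/stdin"))?.display());
    Ok(())
}
```

The returned path is not guaranteed to exist, which is the same trade-off as with readlink -f and the non-strict Python implementation.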

Urgau commented 2 years ago

The difference between libc::realpath (and thus Rust std) and Python's realpath is that for the latter the last component of the returned path may not be "valid"/"exist", contrary to the former, which only returns a path that can be fully resolved (idk if that's the right terminology, sorry if it's not).

the8472 commented 2 years ago

The /proc/<pid>/fd/* symlinks are magical. Even if they don't resolve to a valid path you can still File::open them and then operate on that new file descriptor.
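A minimal sketch of that approach, assuming Linux with /proc mounted: instead of canonicalizing the path, open the /proc/self/fd entry directly and read from the new descriptor. The helper name `reopen_via_proc` is hypothetical.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::os::unix::io::{AsRawFd, RawFd};

/// Re-opens an already open descriptor through its /proc/self/fd entry.
/// Linux only; hypothetical helper for illustration.
fn reopen_via_proc(fd: RawFd) -> io::Result<File> {
    File::open(format!("/proc/self/fd/{}", fd))
}

fn main() -> io::Result<()> {
    // Run as: echo " " | cargo run
    // The link text of fd 0 does not name a real path, yet opening it works.
    let mut stdin_copy = reopen_via_proc(io::stdin().as_raw_fd())?;
    let mut input = String::new();
    stdin_copy.read_to_string(&mut input)?;
    print!("{}", input);
    Ok(())
}
```

This sidesteps canonicalization entirely, so it only helps when the goal is to read or write the stream rather than to obtain a printable path.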