uutils / findutils

Rust implementation of findutils
MIT License
280 stars 35 forks source link

xargs: stdin should not be merged before replacing #362

Closed YDX-2147483647 closed 1 week ago

YDX-2147483647 commented 2 months ago

Given the following stdin,

abc
def
g

xargs -i echo [{}] should give

[abc]
[def]
[g]

instead of [abc def g].

Test using python

(Ubuntu)

>>> from subprocess import run

# xargs 0.5.0 from uutils
>>> run(['./xargs', '-i', 'echo', '[{}]'], capture_output=True, input=b"abc\ndef\ng").stdout
b'[abc def g]\n'

# xargs 4.8.0 from GNU
>>> run(['/usr/bin/xargs', '-i', 'echo', '[{}]'], capture_output=True, input=b"abc\ndef\ng").stdout
b'[abc]\n[def]\n[g]\n'
Test using shell

(MSYS2 on Windows)

# xargs 0.5.0 from uutils
$ printf 'abc\ndef\ng' | ./xargs -i echo [{}]
[abc def g]

# xargs 4.9.0 from GNU
$ printf 'abc\ndef\ng' | xargs -i echo [{}]
[abc]
[def]
[g]

Platform: unknown-linux-gnu, pc-windows-msvc, other not tested.

Relavant codes

https://github.com/uutils/findutils/blob/baa09baf47992a074538ed0e4bb3c4e952f37111/src/xargs/mod.rs#L407-L435

There's no test on this behaviour.

https://github.com/uutils/findutils/blob/baa09baf47992a074538ed0e4bb3c4e952f37111/tests/xargs_tests.rs#L429-L430

Relates-to: #323

sylvestre commented 2 months ago

I am curious why you are sharing a python example ? :)

would you like to try to provide a fix? thanks

YDX-2147483647 commented 2 months ago

why you are sharing a python example

I thought it was caused by CR/LF or shells, so I want to feed bytes to the executable directly. AFAIK, python is the easiest way.

would you like to try to provide a fix?

I'd like to do so! But I cannot guarantee when, maybe in a month?

YDX-2147483647 commented 2 months ago

I've inspected it deeper. The issue is harder than I thought. The command is always executed once in current implementation, but it needs to be executed multiple times if -i/--replace or -I is given.

Example:

# xargs (GNU findutils) 4.8.0
$ printf "abc\ndef\ng" | xargs python3 -c 'from sys import argv; print(argv)'
['-c', 'abc', 'def', 'g']
$ printf "abc\ndef\ng" | xargs -i python3 -c 'from sys import argv; print(argv, repr("""{}"""))'
['-c'] 'abc'
['-c'] 'def'
['-c'] 'g'

The following is to be appended to tests/xargs_tests.rs.

    // Multiple lines
    Command::cargo_bin("xargs")
        .expect("found binary")
        .args(["-i=_", "echo", "[_]"])
        .write_stdin("abc\ndef\ng")
        .assert()
        .success()
        .stderr(predicate::str::is_empty())
        .stdout(predicate::str::diff("[abc]\n[def]\n[g]\n"));

    Command::cargo_bin("xargs")
        .expect("found binary")
        .args(["-i", "echo", "{} {} foo"])
        .write_stdin("bar\nbaz")
        .assert()
        .stdout(predicate::str::diff("bar bar foo\nbaz baz foo"));
YDX-2147483647 commented 2 months ago

Workaround

Pass -n 1 or --max-args 1. (and -d \n if you want)

$ printf 'abc\ndef\ng' | xargs -I _ -n 1 echo [_]
[abc]
[def]
[g]

Conflicting xargs options (GNU Findutils 4.9.0):

The options ‘--max-lines’ (‘-L’, ‘-l’), ‘--replace’ (‘-I’, ‘-i’) and ‘--max-args’ (‘-n’) are mutually exclusive.

The exception to this rule is that the special max-args value 1 is ignored after the ‘--replace’ option and its short-option aliases ‘-I’ and ‘-i’, because it would not actually conflict.

YDX-2147483647 commented 2 months ago

Hi! I've fixed the issue in normal cases, but if conflict options are given, things get complicated. (I need to change 141 lines)

Currently, we throw a warning and take the last option.

https://github.com/uutils/findutils/blob/baa09baf47992a074538ed0e4bb3c4e952f37111/src/xargs/mod.rs#L930-L933

Should I follow this behaviour? Or may I use Arg::conflicts_with in clap and throw an error?