scop / bash-completion

Programmable completion functions for bash
GNU General Public License v2.0

bash completion mangles file names containing newline characters #704

Open idallen opened 2 years ago

idallen commented 2 years ago

Describe the bug

Bash completion cannot correctly handle file names containing newline characters. The names are split apart at the newlines, which makes the resulting completions useless.

To reproduce

        # First, get a shell with no completion loaded and show it working:     
        $ bash --norc
        bash-5.0$ ls -b
        one\ntwo\nthree\nfour\nfive\nsix
        bash-5.0$ xxx <TAB>'one
        two
        three
        four
        five
        six'
        bash-5.0$ touch foo
        bash-5.0$ xxx <TAB>
        foo                               one^Jtwo^Jthree^Jfour^Jfive^Jsix
        bash-5.0$ xxx o<TAB>'one
        two
        three
        four
        five
        six'

        # Now, load the completion scripts and watch it break:                  
        bash-5.0$ source /usr/share/bash-completion/bash_completion
        bash-5.0$ ls -b
        foo  one\ntwo\nthree\nfour\nfive\nsix
        bash-5.0$ xxx <TAB>
        five   foo    four   one    six    three  two
        bash-5.0$ xxx o<TAB>
        five   four   one    six    three  two
        # The last two completions are garbage.                                 
        # The file name is being split on newlines. 

Expected behavior

See the behavior above, before /usr/share/bash-completion/bash_completion is loaded.

Additional context

I used the nonexistent command name xxx so as not to invoke any helper completion scripts.

Debug trace

See attached typescript.txt

calestyo commented 2 years ago

Just for the record:

I think doing this properly will be very difficult. I'd guess such problems show up not only for file names containing newlines but also for other characters (a POSIX file name is essentially a byte string and may contain anything except / and NUL).

File names with trailing newlines would also be very problematic, because command substitution strips trailing newlines. All workarounds for that have their own extremely tricky problems.
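
The trailing-newline problem can be seen in a few lines of Bash. This is only a sketch: the sample name and the variable names are made up for illustration, and the sentinel trick is one commonly used workaround, not something proposed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical file name that ends in two newline characters.
name=$'trailing\n\n'

# Command substitution drops every trailing newline from the captured output.
captured=$(printf '%s' "$name")

# Common workaround (an assumption, not from this thread): append a sentinel
# character inside the substitution and strip it again afterwards.
guarded=$(printf '%sx' "$name")
guarded=${guarded%x}

printf 'original: %d characters\n' "${#name}"      # 10
printf 'captured: %d characters\n' "${#captured}"  # 8
printf 'guarded:  %d characters\n' "${#guarded}"   # 10
```

The sentinel version keeps both newlines, but every command substitution in a completion script would have to be written this way, which is part of why the workarounds are considered tricky.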

calestyo commented 2 years ago

Actually, one part of a fix could be to "simply" use shell-escaped strings for any completions that contain unusual characters. Some tools provide this out of the box, e.g. GNU ls (--quoting-style=), and there is also printf's %q format.

E.g. having a dir with files like this:

# ls -1 | cat
a
b
c   d
f
g h
i
j
--foo
\n
𑙣𑙤𑙦

which are actually these:

./--foo
'./a'$'\n''b'
./𑙣𑙤𑙦
'./\n'
'./g h'
'./c'$'\t''d'
'./e'$'\r''f'
'./i'$'\n''j'

can be easily quoted from a shell script, for example with find . -exec printf '%q\n' {} \;

A problem with this shell quoting is that it uses '...', which may not be what one wants for bash completion, since e.g. spaces remain unescaped inside the quotes.
So perhaps one would need a different quoting mechanism, one that converts to \uUUUU escapes or similar. Maybe even one where what gets escaped is configurable (e.g. I couldn't manually enter 𑙣𑙤𑙦 on my keyboard, but I can enter ä and friends), so I'd want a mode that escapes only the characters I cannot type.

Another problem is, of course, that this might require many (or all) completions to be adapted, since the escaping would need to happen before anything like command substitution in a completion script messes things up.
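
For comparison with the quoting styles discussed above, here is a minimal sketch of what Bash's builtin printf '%q' produces. The sample names are modelled on the listing above, and the variable names are illustrative only. In the Bash versions I am aware of, the builtin escapes spaces with a backslash and switches to $'...' for control characters, unlike the '...' style of the external printf that find -exec runs.

```shell
#!/usr/bin/env bash
# Sample names modelled on the listing above (illustrative only).
names=('g h' $'a\nb' $'c\td' '--foo')

quoted=()
for name in "${names[@]}"; do
    # printf -v stores the result directly in a variable, so no command
    # substitution gets a chance to strip or split anything.
    printf -v q '%q' "$name"
    quoted+=("$q")
done

# Show the quoted forms, one per line.
printf '%s\n' "${quoted[@]}"

# Round trip: evaluating a quoted form yields the original name again.
eval "restored=${quoted[1]}"
```

Each quoted form fits on a single line, so it survives newline-delimited processing, at the cost of every consumer having to unquote it again.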

akinomyoga commented 2 years ago

The newline problem stems from the design of the Bash interface for completions (i.e., that of compgen), so there is nothing we can do to solve it unless we give up using compgen and re-implement everything without relying on the features that Bash provides.

I don't think there are essential problems in handling file names that contain control characters (other than the newline, i.e. LF), as long as we implement the completions carefully and set IFS properly. If anything is broken, I guess we can fix it by properly quoting words and setting IFS. Also, we cannot rely on ls -Q and find . -exec printf '%q\n' {} \;, which are not POSIX (but couldn't we just use the builtin printf '%q\n' *?).
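
The compgen limitation described above can be reproduced outside of any completion function. The following is a small sketch in a scratch directory. The array names are made up for illustration, and reading the output with mapfile is an assumption about how a caller might consume it, not a copy of what bash-completion does.

```shell
#!/usr/bin/env bash
# Scratch directory with one file whose name contains a newline.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch $'one\ntwo'

# compgen prints one candidate per line, so reading its output line by line
# cannot tell a newline inside a name from the separator between names.
mapfile -t from_compgen < <(compgen -f -- o)

# A glob expands straight into an array and never passes through a
# newline-delimited text stream.
from_glob=(o*)

printf 'compgen lines: %d\n' "${#from_compgen[@]}"   # 2
printf 'glob entries:  %d\n' "${#from_glob[@]}"      # 1

cd - >/dev/null && rm -rf -- "$tmpdir"
```

The single file shows up as two candidates via compgen but as one array element via the glob, which matches the split seen in the reproduction at the top of this issue.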

akinomyoga commented 1 year ago

The next version of Bash has compgen -V array_name.

https://lists.gnu.org/archive/html/bug-bash/2023-04/msg00035.html
https://git.savannah.gnu.org/cgit/bash.git/commit/?h=devel&id=a46164736e59066f767135b0b25eec73acbe98d8
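
A hedged sketch of how compgen -V might be used once it is available. It assumes that the option first shipped in Bash 5.3 and that it stores the candidates in the named indexed array instead of printing them. The variable names and the version guard are illustrative, not taken from bash-completion.

```shell
#!/usr/bin/env bash
# Scratch directory with one file whose name contains a newline.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch $'one\ntwo'

candidates=()
# Assumption: compgen -V first shipped in Bash 5.3, so guard on the version.
if (( BASH_VERSINFO[0] > 5 || (BASH_VERSINFO[0] == 5 && BASH_VERSINFO[1] >= 3) )); then
    have_compgen_V=1
    # Store the candidates in the array instead of printing them line by line.
    compgen -V candidates -f -- o
    printf 'candidates: %d\n' "${#candidates[@]}"
else
    have_compgen_V=0
    printf 'compgen -V needs a newer Bash than %s\n' "$BASH_VERSION"
fi

cd - >/dev/null && rm -rf -- "$tmpdir"
```

Because the candidates never pass through a newline-delimited stream, the file name should arrive as a single array element on a Bash that supports the option.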