Wordbreak issue when completing before end of line

mattclay commented 3 years ago

repro.py

#!/usr/bin/env python
# PYTHON_ARGCOMPLETE_OK

import argparse
import argcomplete

parser = argparse.ArgumentParser()
parser.add_argument('--pass', choices=('name_one', 'name_two'))
parser.add_argument('--fail', choices=('name:one', 'name:two'))
parser.add_argument('--other')
argcomplete.autocomplete(parser)

The following completion in the middle of the line results in faulty completion:

$ ./repro.py --fail name:<TAB> --other

It becomes this:

$ ./repro.py --fail name:name: --other

Instead I'd expect to need to double-tab for this output:

$ ./repro.py --fail name:<TAB><TAB> --other
one  two

For comparison, it works fine at the end of the line:

$ ./repro.py --fail name:<TAB><TAB>
one  two

If an underscore is used instead of a colon, it works in the middle of the line or at the end:

$ ./repro.py --pass name_<TAB><TAB>
name_one  name_two  
$ ./repro.py --pass name_<TAB><TAB> --other
name_one  name_two

This appears to be due to the lexer clearing last_wordbreak_pos here: https://github.com/kislyuk/argcomplete/blob/5a20d6165fbb4d4d58559378919b05964870cc16/argcomplete/my_shlex.py#L296-L297

Commenting out those two lines resolved the issue.

kislyuk commented 3 years ago

Thanks for filing this, I can confirm that it is an issue. @evanunderscore this code was introduced in #197 and it will take me some time to page in all the context for how this works - but would be a cool puzzle to solve.

kislyuk commented 2 years ago

The easiest workaround that I have found is to just truncate everything after the cursor. This avoids shlex "reading ahead" of the cursor and resetting the "wordbreak" state too early (which interferes with the truncation of completions necessary to accommodate bash handling of characters like :).

That's a big change and I want to make sure it's solid, so I won't merge that PR until I have some unit/regression tests for it.

mattclay commented 2 years ago

@kislyuk I confirmed the truncation fixes the issue.

I also noticed another scenario which completes incorrectly, even with the truncation:

$ ./repro.py --fail na<TAB>me: --other

Which becomes:

$ ./repro.py --fail name:me: --other

I'd expect it to remain unchanged:

$ ./repro.py --fail name: --other

That may be a separate issue from this one. Thoughts?

evanunderscore commented 2 years ago

@kislyuk Thanks for looking into this. The logic behind #352 seems sound, and if the test suite passes it means it still handles every case that I could think of when I was making those changes. It looks like we currently bail out once we pass the cursor anyway, so I don't see what this would break.

@mattclay Regarding your second scenario, that looks completely normal to me and is exactly how I would expect it to work. For example:

$ ls /et<tab>c/
$ ls /etc/c/

kislyuk commented 2 years ago

Thanks for confirming @evanunderscore. I've been thinking about this and while I think we should proceed with #352 in the short term, I also hope we can enable a future where arguments after the cursor can affect parser state. This can be useful with mutually exclusive arguments, etc.

I'll merge #352 and then try to scope out a design for what would be necessary to enable that, in a separate PR/issue.

kislyuk commented 2 years ago

@mattclay thanks for the latest scenario, you're right that it's a separate issue. Naively, it can be fixed if we can infer with confidence that we're in the middle of a word (the char under the cursor is alphanumeric, not punctuation, space or EOL), and emit no completions in that case. This may not work with quotes and escapes, but we can probably do it on a "best effort" basis regardless.

evanunderscore commented 2 years ago

I personally think we should be doing our best to conform to standard completion behavior, which the example given already does. What the shell does with those completions is not really our business.

@mattclay Try this:

$ bind "set skip-completed-text on"
$ ./repro.py --fail na<tab>me: --other
$ ./repro.py --fail name:<cursor> --other

kislyuk commented 2 years ago

great point @evanunderscore, let's leave that behavior as is.

mattclay commented 2 years ago

@evanunderscore Thanks for the example and link.

It looks like argcomplete handling of : is different than standard completion behavior. For example, in a directory which contains the files name:one and name:two, I see a difference between repro.py and ls:

ls <tab> becomes ls name\:
../repro.py --fail <tab> becomes ../repro.py --fail name:

Editing a line also behaves differently:

ls name:<tab> name:two offers no completion
ls name\:<tab> name:two offers name:one and name:two for completion
../repro.py --fail name:<tab> --other becomes ../repro.py --fail name:name: --other
../repro.py --fail name\:<tab> --other becomes ../repro.py --fail name: --other

So ls is able to handle the situation correctly even without skip-completed-text.

Shouldn't those last two repro.py examples behave the same as the ls examples?

evanunderscore commented 2 years ago

My above example was with respect to the change in #352.

The only difference in your first example is that we are not escaping the colon. If we wanted to, we could add all of COMP_WORDBREAKS here. We actually were doing that before I changed it; I'm not opposed to changing it back.

ls name:<tab> name:two offers no completion

COMP_WORDBREAKS includes a colon by default, which means completion considers name to be one complete token, and now attempts to complete a new one. This is so that when you are constructing e.g. a PATH variable which has multiple paths delimited by a colon, you can tab complete multiple paths within one otherwise unbroken string.

ls name\:<tab> name:two offers name:one and name:two for completion

The backslash escape here is escaping the colon not for the shell, but for completion. Completion now does not break on the colon and suggests filenames beginning with name:.

../repro.py --fail name:<tab> --other becomes ../repro.py --fail name:name: --other

The doubling is fixed in #352; this does not happen any more.

../repro.py --fail name\:<tab> --other becomes ../repro.py --fail name: --other

This is "correcting" the escaped character since argcomplete specifically works around completions containing wordbreak chars. This is perhaps not the best way to do it, but it seems to be a feature people want. Perhaps it could be implemented more cleanly using something like this.

So ls is able to handle the situation correctly even without skip-completed-text.

These examples have nothing to do with skip-completed-text. That option specifically relates to this example:

$ ls na<tab>me:
$ ls name\:<cursor>me:
$ bind "set skip-completed-text on"
$ ls na<tab>me:
$ ls name\:<cursor>:

Which ls does not handle the way you are expecting without skip-completed-text.

(As an aside, this is default completion and doesn't have anything specifically to do with ls.)

We could of course remove the special handling for wordbreak characters (example given from master):

diff --git a/argcomplete/__init__.py b/argcomplete/__init__.py
index c7e986a..de002c6 100644
--- a/argcomplete/__init__.py
+++ b/argcomplete/__init__.py
@@ -560,14 +560,13 @@ class CompletionFinder(object):
         This method is exposed for overriding in subclasses; there is no need to use it directly.
         """
         special_chars = "\\"
+        special_chars += os.environ["_ARGCOMPLETE_COMP_WORDBREAKS"]
         # If the word under the cursor was quoted, escape the quote char.
         # Otherwise, escape all special characters and specially handle all COMP_WORDBREAKS chars.
         if cword_prequote == "":
             # Bash mangles completions which contain characters in COMP_WORDBREAKS.
             # This workaround has the same effect as __ltrim_colon_completions in bash_completion
             # (extended to characters other than the colon).
-            if last_wordbreak_pos:
-                completions = [c[last_wordbreak_pos + 1:] for c in completions]
             special_chars += "();<>|&!`$* \t\n\"'"
         elif cword_prequote == '"':
             special_chars += '"`$!'

$ ./repro.py --fail <tab>
$ ./repro.py --fail name\:<tab>
name\:one  name\:two

This is now working exactly as correctly as ls, including the following probably undesirable behavior:

$ ./repro.py --fail name:<tab>
$ ./repro.py --fail name:name\:<tab>
name:one  name:two

Here, name is treated as a complete token, and so we begin completing a new one, exactly as ls would. However for this example program this is clearly undesirable, hence why we have the workaround in the first place.

(Something I don't understand in that last example: why do the displayed completions contain the backslash in the first case but not the second?)

kislyuk / argcomplete

Wordbreak issue when completing before end of line #351