python / cpython

The Python programming language
https://www.python.org
Other
63.17k stars 30.25k forks source link

rlcompleter.Completer matches too much #66339

Open ec03ff70-882c-4dcd-b37d-4888ba54743b opened 10 years ago

ec03ff70-882c-4dcd-b37d-4888ba54743b commented 10 years ago
BPO 22141
Nosy @rbtcollins, @bitdancer, @PCManticore, @vadmium
Files
  • rlcompleter.diff: change regular expression to capture the entire attribute
  • rlcompleter_22141.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['type-bug', 'library'] title = 'rlcompleter.Completer matches too much' updated_at = user = 'https://bugs.python.org/donlorenzo' ``` bugs.python.org fields: ```python activity = actor = 'martin.panter' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'donlorenzo' dependencies = [] files = ['36269', '37064'] hgrepos = [] issue_num = 22141 keywords = ['patch'] message_count = 10.0 messages = ['224848', '224853', '227581', '227603', '230208', '230267', '236418', '247502', '247601', '265469'] nosy_count = 5.0 nosy_names = ['rbcollins', 'donlorenzo', 'r.david.murray', 'Claudiu.Popa', 'martin.panter'] pr_nums = [] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue22141' versions = ['Python 2.7', 'Python 3.4', 'Python 3.5'] ```

    ec03ff70-882c-4dcd-b37d-4888ba54743b commented 10 years ago

    Example:

    >> completer = rlcompleter.Completer()

    >>> class A(object):
    ...     def foo(): pass
    ...     def foobar(): pass
    
    >>> completer.complete("A.foo(", 0)
    'A.foo('
    >>> completer.complete("A.foo(", 1)
    'A.foobar('

    I consider the last match a bug.

    The root of this bug is that in attr_matches the regular expression ignores any trailing non-alphanumeric characters by using the "\w" sequence. Note that it would also match "A.foo?%&@" to both "A.foo" and "A.foobar".

    I propose this regex instead: r"(\w+(\.\w+))\.([^.])" What do people think?

    ec03ff70-882c-4dcd-b37d-4888ba54743b commented 10 years ago

    attached the mini patch changing the regular expression in the proposed way

    5df057ac-c83d-447e-8c45-910556b17608 commented 10 years ago

    Looks good. Could you add a test that reproduces the intended behaviour?

    ec03ff70-882c-4dcd-b37d-4888ba54743b commented 10 years ago

    Oops!

    tests sound like a good Idea. I realized my "fix" doesn't work. I had not noticed this before because in my application I had already implemented a workaround :/

    The problem with catching the trailing parenthesis is that the group then does not match the attribute of the class. I'll be back with a new patch and test case.

    ec03ff70-882c-4dcd-b37d-4888ba54743b commented 9 years ago

    sorry for the delay but here is a new patch with a test.

    I changed the strategy for fixing the bug because the dealing with the opening parenthesis became to complicated.

    So, now I simply check whether the match actually startswith the search phrase before adding the match to the results. This is like a sanity check which fixes this bug.

    5df057ac-c83d-447e-8c45-910556b17608 commented 9 years ago

    Looks good to me. You might want to sign the contributor agreement: https://www.python.org/psf/contrib/contrib-form/. This is required for non-trivial contributions. You'll get a * next to your name after signing it.

    ec03ff70-882c-4dcd-b37d-4888ba54743b commented 9 years ago

    sorry it took so long but the contributor agreement is now signed.

    rbtcollins commented 9 years ago

    I'm struggling to understand this bug. I've tried idle and plain cPython and neither exhibit it. I suspect thats due to how readline is itself tokenizing things.

    Python 3.6.0a0 (default:ef5a2ba9df62, Jul 28 2015, 15:48:19) 
    [GCC 4.9.1] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> readline.parse_and_bind("tab: complete")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'readline' is not defined
    >>> import readline
    >>> readline.parse_and_bind("tab: complete")
    >>> class A:
    ...  def foo(self):pass
    ...  def foobar(self):pass
    ... 
    >>> A.foo
    A.foo(     A.foobar(  
    >>> A.foo(      

    Whats probably not obvious there is that TAB at the ( indented a tab, rather than offering any completions.

    I'm hesitant to apply this without a deeper understanding of where its used, in case of side effects.

    bitdancer commented 9 years ago

    I agree with Robert. You'll have to convince me that this is an actual bug. It seems reasonable to me that foobar would be returned in this case...it seems to me it is analogous to what my zsh shell does when I hit tab and there's no exact match but there is a match if it interpolates. Since this doesn't evidence as a bug at the python prompt, there needs to be more of an argument as to why it is wrong...and probably too much risk of breaking working code to change it anyway, absent a bug at the python prompt.

    I'm inclined to reject this.

    vadmium commented 8 years ago

    Maybe there are two special cases colliding here:

    1. complete("A.foo", 0) returns "A.foo(". But the opening bracket is in readline.get_completer_delims(), which means the completer has suggested not only a “word” but also a delimiter.

    2. Perhaps Lorenz has taken "A.foo(" directly and passed it back to the completer. But as Robert said, if Readline were used, it would see the bracket as a delimiter, and only pass an empty string as the text.

    The question is how should the completer behave when it gets (invalid) input that would not be possible from default Readline usage?

    Lorenz: why are you passing "A.foo(" to the completer, when it only parses attribute names, and not other Python syntax such as brackets?