Limit the completion to just those variables of the appropriate enum types.

hongyi-zhao commented 2 years ago

On Ubuntu 20.04.3 LTS, I'm using self-compiled git master version of Emacs with python-lsp-server. See the following code completion results given by python-lsp-server:

As you can see, company offers many undesired or incorrect completion candidates in the above screenshot. According to the docstring here, only the following flags are allowed:

    A  ASCII       For string patterns, make \w, \W, \b, \B, \d, \D
                   match the corresponding ASCII character categories
                   (rather than the whole Unicode categories, which is the
                   default).
                   For bytes patterns, this flag is the only available
                   behaviour and needn't be specified.
    I  IGNORECASE  Perform case-insensitive matching.
    L  LOCALE      Make \w, \W, \b, \B, dependent on the current locale.
    M  MULTILINE   "^" matches the beginning of lines (after a newline)
                   as well as the string.
                   "$" matches the end of lines (before a newline) as well
                   as the end of the string.
    S  DOTALL      "." matches any character at all, including the newline.
    X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.
    U  UNICODE     For compatibility only. Ignored for string patterns (it
                   is the default), and forbidden for bytes patterns.

So, I want to know the suggested company-backends setting for obtaining the desired completion candidates when using python-lsp-server. See also the related discussion here.

Regards, HZ

lieryan commented 2 years ago

This has nothing to do with completion settings, and everything to do with having the right type hints to tell the completer what values re.compile(..., flags) takes.

In this case the flags are enums. In theory the re module could add a type hint indicating which enum class supplies the valid values for the flags parameter, and the completer could use that information to limit the completion to just those variables of the appropriate enum types.
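A minimal sketch of that idea, assuming a hypothetical wrapper `compile_pattern` whose `flags` parameter is annotated with `re.RegexFlag`, and a hypothetical helper `enum_candidates` that reads the annotation at runtime. This is not how jedi, rope, or python-lsp-server work internally; it only illustrates how an enum type hint could narrow the candidate list:

```python
import enum
import re
import typing


def compile_pattern(pattern: str, flags: re.RegexFlag) -> re.Pattern:
    """Hypothetical wrapper whose flags parameter carries an enum type hint."""
    return re.compile(pattern, flags)


def enum_candidates(func, param):
    """Return the member names of the enum annotating `param`, if any."""
    hint = typing.get_type_hints(func).get(param)
    if isinstance(hint, type) and issubclass(hint, enum.Enum):
        # __members__ includes aliases such as "I" for "IGNORECASE".
        return sorted(hint.__members__)
    return []


candidates = enum_candidates(compile_pattern, "flags")
print(candidates)
```

For a parameter that is not annotated with an enum, such as `pattern`, the helper returns an empty list, so no narrowing would be applied there.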

python-lsp-server can use either jedi or rope library for completion; but by default it uses jedi. I don't think hinting enums in this way is currently supported in rope (I'm a rope maintainer). You may want to look into jedi documentation and ask them if they support this kind of completion.

PeterJCLaw commented 2 years ago

This is theoretically possible (jedi provides the raw type information from the typeshed), and I can see why you're asking this, however the resulting behaviour is highly unlikely to actually be what you want. Consider the following use-case, which what you're asking for would make harder to spell:

    import re

    SETTINGS = re.IGNORECASE | re.MULTILINE
    ...
    re.match('foo', 'bar', SETTINGS)
         # completion here ^

If in the completion of the last parameter here only re.XYZ completions were offered, then SETTINGS would be missing (and surprisingly so). While it's possible to augment the result to detect and allow constants, since an item of almost any type can be reached through member access of any object, the valid (and thus potentially useful) chain of symbol completions at any point is almost always all valid symbols in scope.

As a result the augmentations needed to allow things which might be valid would involve checking all possible member access, which is likely to be highly non-performant and yet not likely to reduce the candidate pool of suggestions meaningfully.
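The problem described above can be illustrated with a small sketch. The names `config`, `naive_candidates`, and `in_scope_flag_names` are hypothetical, and the filtering is done on runtime values rather than the static analysis a real completer would use:

```python
import re
from types import SimpleNamespace

SETTINGS = re.IGNORECASE | re.MULTILINE
# Hypothetical object that exposes a valid flag only through member access.
config = SimpleNamespace(regex_flags=re.DOTALL)


def naive_candidates():
    """Offer only the flag constants defined in the re module itself."""
    return {
        f"re.{name}"
        for name, value in vars(re).items()
        if isinstance(value, re.RegexFlag)
    }


def in_scope_flag_names(scope):
    """Keep only the names in scope whose value is itself a flag."""
    return {name for name, value in scope.items() if isinstance(value, re.RegexFlag)}


candidates = naive_candidates()
scope_names = in_scope_flag_names({"SETTINGS": SETTINGS, "config": config})

print("re.IGNORECASE" in candidates)  # offered
print("SETTINGS" in candidates)       # missing, although it is a valid argument
print("config" in scope_names)        # filtered out, although config.regex_flags is valid
```

Restricting to `re.XYZ` drops `SETTINGS`, and even a type-based filter over names in scope drops `config`, the first step towards the valid expression `config.regex_flags`. Keeping such names would mean inspecting the members of every object in scope.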

The alternative might be that we only apply this limiting for members of re. However, the question is then whether `from re import IGNORECASE` style symbol access should be filtered too. An approach of this sort (limited based on the symbol source) would also require that the implementation be filled with special cases for each call site which needs this behaviour, which would become a maintenance burden pretty quickly and (as noted) offers only marginal value.

There definitely are places where a more limited pool of suggestions is practical, but these are relatively few and far between.

Footnote: a case where a limited pool is practical

Consider the literal keys of a `TypedDict`, which must be literals and must come from the valid keys of the `TypedDict` and thus sidestep the above issues, though note that Jedi has thus far not implemented this on the grounds of not being worth the (non-trivial) effort of doing so.
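As a rough illustration of why this case is tractable, the valid keys of a `TypedDict` form a closed set that can be read straight from the class. The `Movie` type and the `key_candidates` helper below are hypothetical, and this is not something Jedi implements:

```python
from typing import TypedDict


class Movie(TypedDict):
    """Hypothetical TypedDict used only for illustration."""
    title: str
    year: int


def key_candidates(typed_dict_cls):
    """Return the complete set of valid literal keys for a TypedDict class."""
    return sorted(typed_dict_cls.__annotations__)


keys = key_candidates(Movie)
print(keys)
```

Because a key must be a literal drawn from this list, no other symbol in scope can be a valid completion at that position.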

krassowski commented 2 years ago

Thanks for chiming in! Very good point that reliable filtering is likely out of scope for both jedi and pylsp for the reasons outlined above; however, what might be more useful would be trying to improve sorting to place the known good matches at the top. It is a non-trivial problem, and frankly easier to solve with frequency analysis of existing code (currently sold as ML), but I think we should be open to exploring our options here.

PeterJCLaw commented 2 years ago

Indeed, using this sort of information to improve sorting strikes me as a much more fruitful direction :+1:. (There remain challenges around avoiding lots of special casing though.)
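A minimal sketch of sorting instead of filtering, assuming the expected parameter type is already known. The helper `rank_candidates` and the example scope are hypothetical, and the check uses runtime values for simplicity where a real completer would use inferred types:

```python
import re


def rank_candidates(scope, expected_type):
    """Sort names so that values of the expected type come first.

    Nothing is removed: non-matching names are only pushed further down.
    """
    return sorted(
        scope,
        key=lambda name: (not isinstance(scope[name], expected_type), name),
    )


scope = {
    "SETTINGS": re.IGNORECASE | re.MULTILINE,
    "pattern": "foo",
    "count": 3,
    "DOTALL": re.DOTALL,
}
ranked = rank_candidates(scope, re.RegexFlag)
print(ranked)
```

Here `DOTALL` and `SETTINGS` are listed before `count` and `pattern`, while every candidate remains available.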

hongyi-zhao commented 2 years ago

Another related problem is the incorrect docstring returned by company-posframe.

python-lsp / python-lsp-server

Limit the completion to just those variables of the appropriate enum types. #101