scipy / scipy

SciPy library main repository
https://scipy.org
BSD 3-Clause "New" or "Revised" License
13.09k stars 5.19k forks source link

BUG: Weird __all__ manipulation prevents Pylance from working on Scipy modules #15801

Closed ldorigo closed 2 years ago

ldorigo commented 2 years ago

Describe your issue.

In some scipy modules (like scipy.stats), there's a weird line that modifies __all__:

https://github.com/scipy/scipy/blob/50e8aeacfa1e1b843100c7bdca7dfc338a8d43c4/scipy/stats/__init__.py#L473

This prevents Pylance (and probably others) from properly analyzing scipy modules, and thus doesn't offer autocomplete, intellisense and type checking.

I don't personally know why that line is even required. If __all__ is not set, Python will automatically omit _ prefixed variables when doing import *. __all__ should only be set if a different behavior is required. Perhaps this is some old cargo culting copy pasting.

Originally posted by @jakebailey in https://github.com/microsoft/pylance-release/issues/2038#issuecomment-981867152

According to pylance developers that line shouldn't be needed - is there any reason it's included? Could it be removed to improve experience for users on vscode?

Reproducing Code Example

from scipy.stats import norm

Error message

no error, but no intellisense, autocomplete etc. on affected modules

SciPy/NumPy/Python version information

1.8.0 1.22.2 sys.version_info(major=3, minor=10, micro=2, releaselevel='final', serial=0)

ldorigo commented 2 years ago

Git blame shows that those line were added by automatic 2to3 migration over 9 years ago:

https://github.com/scipy/scipy/blame/396a3592767c7477d14083053ac7b772951f125e/scipy/stats/__init__.py#L350

So it seems likely that it's just an artifact introduced by automatic migration tools?

tupui commented 2 years ago

Hi @ldorigo, thanks for reporting.

This line is not a remaining artefact. It's to prevent us from polluting namespaces. Only functions/classes which don't start with an underscore are inspected.

I don't think there is any issue with SciPy as PyCharm for instance has no problem offering any code completion, type checking, etc. Maybe you can describe a bit more the problem?

ldorigo commented 2 years ago

This line is not a remaining artefact. It's to prevent us from polluting namespaces.

Which namespace are you worried about polluting? AFAIK __all__ is only used when doing from xxx import *, and functions starting with an underscore are automatically ignored when doing from xxx import * - so I don't get what that line achieves.

Only files which don't start with an underscore are inspected.

Again unless I'm missing something, dir() is a list of symbols in the current namespace, not files?

I don't think there is any issue with SciPy as PyCharm for instance has no problem offering any code completion, type checking, etc. Maybe you can describe a bit more the problem?

The issue I linked to (https://github.com/microsoft/pylance-release/issues/2038) gives a bit more context - I know completions work with other language servers, but they don't with Pylance which is bundled with VSCode and is becoming increasingly popular - so it's a pity that Scipy doesn't work properly there.

tupui commented 2 years ago

Not files, I should have said functions/class. I corrected. Thank you for the link. As written in the issue I don't consider SciPy to have a bug here since other IDE don't have any issues with that. It's a valid thing to do even if this would be useless. Still, I will look if we can remove this. Maybe someone else knows?

ilayn commented 2 years ago

Not necessarily the namespace pollution, but enumerating items in SciPy is a huge undertaking given the number of functions and methods in it. I don't know how Pylance cannot or other langservers can however I don't understand why you are inspecting the imports anyways. This you can find out at runtime without code inspection since the exposed items are available at the import. Or I don't have enough experience why you are digging into dunders.

In the meantime, we have a policy about single underscore files being private and should not be used to import from directly such that when we shuffle them in the future no import statement would be affected. And to automatically collect all functions (again I emphasize huge in numbers) we *-import them and filter the ones out that are starting with underscores which are also many.

ilayn commented 2 years ago

The problem with the OP in your link is a different problem. We don't import all modules at the import of scipy and they have to be imported individually such that only relevant parts are loaded. In other words to be able to find norm

image

it has to be used as such. Otherwise no lang server can auto-complete because it is not imported yet to the namespace

ldorigo commented 2 years ago

I'm just a middle-man here - all I observe is that I can't do from scipy.stats import norm in vscode, which degrades user experience. I'm happy to talk more about this but if you find it important enough to address, maybe it's better that you just react directly to the issue I linked so you can communicate with Pylance devs? They're likely to have more relevant answers than me

tupui commented 2 years ago

@ldorigo the thing is that we are doing some lazy loading with modules, so this might be getting in the way.

ilayn commented 2 years ago

Ah lazy loading that's the expression yes 🤦 I'll write something in the linked issue @ldorigo

tupui commented 2 years ago

@ilayn Ok about the logic for __all__. I tried the following:

# toto/bobo/__init__.py

def _bobo_private():
    pass

def bobo_public():
    pass

__all__ = ['_bobo_private', 'bobo_public']

# toto/__init__.py
from .bobo import *

import toto;dir(toto) would have both public and private functions in it. Although doing from toto import * would not mean having the private function available.

Now, if I add in toto/__init__.py:

__all__ = [s for s in dir() if not s.startswith("_")]

nothing changes. It's not filtering out _bobo_private.

If the intent is to make sure we filter out all private from submodules that could have been declared in __all__ then I think instead we should do

__all__ = [s_mod for mod in dir() for s_mod in dir(mod) if not s_mod.startswith("_")]

Or, the current line is useful if we want to have __all__ defined. It would behave as if you would not have it if you do from xxx import *

ilayn commented 2 years ago

Yes probably you have #14360 in action.

tupui commented 2 years ago

Right then I believe we can close this.

ilayn commented 2 years ago

Fine by me if @ldorigo is OK with it.

ldorigo commented 2 years ago

Sure - it looks like recent pylance released at least partially fixed the problem, anyhow we can continue the discussion on that issue if necessary