fpgmaas / deptry

Find unused, missing and transitive dependencies in a Python project.
https://deptry.com/
MIT License

deptry does not work when installed globally #92

Open fpgmaas opened 2 years ago

fpgmaas commented 2 years ago

Describe the bug

Whenever deptry is installed globally, it does not have access to the metadata of the packages in the virtual environment, even if that virtual environment is activated.

I will see if I can either

To Reproduce

Install deptry globally with pip install deptry, outside of any virtualenv. Then activate a virtualenv and run deptry .
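The behaviour follows from how importlib.metadata resolves packages: by default it searches the sys.path of the interpreter that is running, not that of the activated virtualenv. A minimal sketch of that effect, assuming a made-up distribution name (demo_pkg) and a temporary directory standing in for a virtualenv's site-packages:

```python
import sys
import tempfile
from importlib import metadata
from pathlib import Path


def make_fake_site_packages(root: Path, name: str = "demo_pkg", version: str = "1.0") -> Path:
    """Create a directory holding a minimal .dist-info entry (illustration only)."""
    site_packages = root / "site-packages"
    dist_info = site_packages / f"{name}-{version}.dist-info"
    dist_info.mkdir(parents=True)
    (dist_info / "METADATA").write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    )
    return site_packages


def is_visible(name: str) -> bool:
    """Return True if the running interpreter can find metadata for `name`."""
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    fake_site_packages = make_fake_site_packages(Path(tmp))

    # Directory is not on sys.path: the lookup fails, as it does for a
    # globally installed tool inspecting a virtualenv.
    visible_before = is_visible("demo_pkg")

    # Directory is on sys.path: the same lookup succeeds.
    sys.path.insert(0, str(fake_site_packages))
    try:
        visible_after = is_visible("demo_pkg")
    finally:
        sys.path.remove(str(fake_site_packages))

print(visible_before, visible_after)
```

Activating a virtualenv only changes environment variables such as PATH; it does not change the sys.path of an interpreter that lives elsewhere.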

fpgmaas commented 2 years ago

Added a warning for now: https://github.com/fpgmaas/deptry/pull/93

lisphilar commented 2 years ago

Just to confirm, is it impossible to perform static code analysis with pyproject.toml, poetry.lock and .venv directory?

fpgmaas commented 2 years ago

No, it's definitely possible to scan a project with pyproject.toml, poetry.lock and .venv when deptry is added to the project with poetry add --dev deptry.

But it will not work if you install it globally with pip install deptry and then scan a Poetry project. It really needs to be within the virtual environment. (So your project covid19-sir is not affected.)

lisphilar commented 2 years ago

Yes, this issue is about the global installation of deptry. It was just a thought, but the "script" of the target pyproject.toml can be read from outside the virtual environment.

fpgmaas commented 2 years ago

On Reddit, someone offered this as a potential starting point for adding the functionality: https://stackoverflow.com/a/14792407

Not sure if we should add this kind of solution to the codebase though.

fpgmaas commented 1 year ago

@mkniewallner suggested using site.getsitepackages(), which returns the site-packages directories that hold the installed modules and should point at the venv when it is active. A good source of inspiration for that may be mypy, which has similar needs; see here and here.

fpgmaas commented 1 year ago

Tried this out, but also unsuccessful. site.getsitepackages() does not seem to return the virtual environment's site-packages directory.

To reproduce:

In my case, this returns:

['/Users/florian.maas/.pyenv/versions/3.9.11/lib/python3.9/site-packages']

And a list of warnings since deptry could not find the installed packages.

However, when running the following steps:

The output is:

['/Users/florian.maas/git/my-project/.venv/lib/python3.9/site-packages']

So then it does find the correct site-packages directory.
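The difference comes down to which interpreter evaluates the call: site.getsitepackages() reports the directories of the interpreter that is running, whatever virtualenv happens to be activated in the shell. A small sketch for inspecting this (the function name describe_interpreter is made up for illustration):

```python
import site
import sys


def describe_interpreter() -> dict:
    """Collect what the *running* interpreter knows about its environment."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # In a virtualenv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "site_packages": site.getsitepackages(),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Running this with the global interpreter and then with the virtualenv's interpreter should show the two different site-packages lists.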

kwentine commented 1 year ago

Hi 👋🏻 I have a very naïve question: from the site.getsitepackages() strategy you tried, I assume that getting the path of the active virtualenv would suffice, even if deptry is not run by the virtualenv interpreter. Could this path not be retrieved using the VIRTUAL_ENV environment variable exported by the activation script?

If I'm completely off-topic (which I fear 😅) I'll be glad to have some pointers to the codebase that might help me understand the problem better!

fpgmaas commented 1 year ago

@kwentine Thanks for the suggestion. That's no naive question, don't be afraid to ask! I am not an expert at this subject myself either.

The issue lies in this part of the code. Here, we try to get the metadata of a package using importlib-metadata, for which I believe it is necessary that the path to the virtual environment is in sys.path.

Your idea of using VIRTUAL_ENV seems pretty good. However, this points to <some_path>/example-project/.venv, whereas the packages are actually stored in <some_path>/example-project/.venv/lib/python3.10/site-packages. We could try to build a solution around this that looks for a site-packages directory recursively within VIRTUAL_ENV.

One issue I can think of with this solution: how do we detect whether it's necessary to perform this recursive search?
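A rough sketch of how the site-packages directory could be located from VIRTUAL_ENV. Instead of a full recursive search it checks the two common layouts (lib/pythonX.Y/site-packages on POSIX, Lib/site-packages on Windows). All function names are hypothetical, and the check in needs_external_lookup is only one assumed heuristic for the detection question, not something deptry does:

```python
import os
import sys
from pathlib import Path
from typing import Optional


def find_site_packages_below(root: Path) -> Optional[Path]:
    """Return the first site-packages directory under `root`, or None."""
    candidates = [
        *sorted(root.glob("lib/python*/site-packages")),  # POSIX layout
        root / "Lib" / "site-packages",  # Windows layout
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None


def site_packages_from_virtual_env() -> Optional[Path]:
    """Resolve the activated virtualenv's site-packages via VIRTUAL_ENV."""
    virtual_env = os.environ.get("VIRTUAL_ENV")
    if not virtual_env:
        return None
    return find_site_packages_below(Path(virtual_env))


def needs_external_lookup() -> bool:
    """Assumed heuristic: a virtualenv is activated, but the running
    interpreter does not belong to it."""
    virtual_env = os.environ.get("VIRTUAL_ENV")
    if not virtual_env:
        return False
    return Path(virtual_env).resolve() != Path(sys.prefix).resolve()
```

When needs_external_lookup() is False, the running interpreter already sees the right packages and no search is needed.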

kwentine commented 1 year ago

The issue lies in this part of the code.

@fpgmaas thanks for the encouragement and this enlightening entry point 🙂 I'd like to share an idea based on importlib.metadata's suggested extension mechanism.

First, suppose we have a way of reliably detecting if deptry is currently running in a virtualenv.

import sys


def running_in_virtualenv() -> bool:
    # See https://docs.python.org/3/library/sys.html?highlight=sys#sys.base_prefix for this strategy
    return sys.prefix != sys.base_prefix

Then, suppose we have a few heuristics to guess a project's virtualenv site-packages on the filesystem:

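import os
from pathlib import Path


def find_virtualenv_site_packages() -> Path | None:
    # current_project_dir() and find_site_packages_below() still need to be written.
    project_dir: Path = current_project_dir()
    virtual_env = os.environ.get("VIRTUAL_ENV")
    possible_roots = [
        Path(virtual_env) if virtual_env else None,
        project_dir / ".venv",
        Path("~/.virtualenvs").expanduser() / project_dir.name,
    ]
    # Try the candidates in order of preference, skipping VIRTUAL_ENV when it is unset.
    for root in possible_roots:
        if root is not None:
            site_packages = find_site_packages_below(root)
            if site_packages:
                return site_packages
    return None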

Then we could implement and install a sys.meta_path finder along the lines of:

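import sys
from importlib.metadata import DistributionFinder, MetadataPathFinder

class VirtualenvDistributionFinder(DistributionFinder):
    @classmethod
    def find_distributions(cls, context=DistributionFinder.Context()):
        if not running_in_virtualenv():
            site_packages = find_virtualenv_site_packages()
            if site_packages:
                path = [str(site_packages), *sys.path]
                context = DistributionFinder.Context(name=context.name, path=path)
        # DistributionFinder.find_distributions is abstract, so delegate the
        # actual search to the standard path-based finder rather than super().
        return MetadataPathFinder.find_distributions(context)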

Let me know if I need to make the idea clearer. If you think this might be a way to go, I'll work on a PR 🙂

kwentine commented 1 year ago

Well I realize that implementation would be highly inefficient since it would call find_virtualenv_site_packages every time package metadata is looked up. So let's say "a less clumsy variation of the above":

if not running_in_virtualenv():
    site_packages = find_virtualenv_site_packages(project_dir)
    sys.meta_path.insert(0, VirtualenvDistributionFinder(site_packages=site_packages))

fpgmaas commented 1 year ago

I think this is the most promising and detailed starting point until now, better than what I could think of 😄 So if you think it's worth a shot, I look forward to reviewing the PR that implements this.

edgarrmondragon commented 9 months ago

Late to the party, but it's probably worth taking a look at how pipdeptree added support for arbitrary virtualenvs: https://github.com/tox-dev/pipdeptree/blob/28bf158e98e95109a426aad8a0ac3b1ea2044d4a/src/pipdeptree/_non_host.py#L16
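A general technique for inspecting another environment is to ask that environment's own interpreter for its paths. The sketch below illustrates only that idea and is not taken from pipdeptree's source; the function name site_packages_of is made up, and the demonstration queries the current interpreter for lack of a real virtualenv path:

```python
import json
import subprocess
import sys
from typing import List


def site_packages_of(python_executable: str) -> List[str]:
    """Ask another interpreter where its site-packages directories are."""
    completed = subprocess.run(
        [
            python_executable,
            "-c",
            "import json, site; print(json.dumps(site.getsitepackages()))",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)


# Demonstration: query the interpreter that is running this script.
# In practice the argument would be the virtualenv's python executable.
print(site_packages_of(sys.executable))
```

The returned directories could then be passed to a metadata lookup as the search path.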

md384 commented 4 months ago

Setting PYTHONPATH to the site-packages directory of the external virtualenv works as a workaround. For example, PYTHONPATH=/PATH_TO_VENV/.venv/lib/python3.11/site-packages deptry . will work (at least with Python 3.11).

Seems like https://importlib-metadata.readthedocs.io/en/latest/api.html#importlib_metadata.DistributionFinder might be the way to go for implementing in deptry (you can see in the docs that the path defaults to sys.path which I think is why the above works).
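The same effect can be had without touching PYTHONPATH, because importlib.metadata.distributions() accepts an explicit path argument. A minimal sketch, assuming a throwaway directory with a made-up distribution (demo_pkg) in place of a real virtualenv; the helper name installed_versions is hypothetical:

```python
import tempfile
from importlib import metadata
from pathlib import Path
from typing import Dict, Iterable


def installed_versions(search_path: Iterable[str]) -> Dict[str, str]:
    """Map distribution names to versions found under `search_path` only."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions(path=list(search_path))
    }


# Illustration with a temporary directory standing in for a venv's site-packages.
with tempfile.TemporaryDirectory() as tmp:
    dist_info = Path(tmp) / "demo_pkg-1.0.dist-info"
    dist_info.mkdir()
    (dist_info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: demo_pkg\nVersion: 1.0\n"
    )
    found = installed_versions([tmp])

print(found)
```

With a real project, search_path would be the virtualenv's site-packages directory rather than a temporary one.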