Closed tkrabel closed 11 months ago
I should have written logic to handle this already, it may just be missing the regex rules
Here's what I found: https://github.com/python-rope/rope/commit/2182bfe06296a03b953f7cb245452d583cc57875 (utils.py)
I may have assumed all virtual environments began with .
. This is convention, not guaranteed. Thanks for catching that.
You may want to change this logic instead - IE add a check that name is not site-packages in the existing checking logic. This way its more consistent with the other filter mechanisms
Did we merge the database versioning update? I would purge and re-create all dbs after this change regardless of the approach
Honestly if I was to rewrite this again, I'd abstract the db, parsing and discovery (separate layers for package and module/file discovery) layers out of the main class - it's a bit of a monstrosity. They honestly can be made single threaded. There's also some poor logic in one of the cache generation functions I fixed in a PR I never merged.
There's also some poor logic in one of the cache generation functions I fixed in a PR I never merged.
So why not merge it then? :)
There's also some poor logic in one of the cache generation functions I fixed in a PR I never merged.
So why not merge it then? :)
I never finished the work, it was part of a much broader change that broke API compatibility. I don't see the problem in the current rope code, but my idea was to use the same function to generate the cache and have both cache generation functions call it instead of implementing the logic twice. Here's the PR: https://github.com/python-rope/rope/pull/485/ But it's not necessary, just an idea.
Does this handle cases where venv is in the project root? I am running into that on windows with the autodetect branch. Maybe this is a venv vs virtualenv thing? I think our implementation should be recursive (IE no Lib.site-packages.xyz modules) but I'm not sure how to implement that efficiently
Does this handle cases where venv is in the project root? I am running into that on windows with the autodetect branch. Maybe this is a venv vs virtualenv thing? I think our implementation should be recursive (IE no Lib.site-packages.xyz modules) but I'm not sure how to implement that efficiently
I think this PR doesn't change the way we treat virtual environments. The PR only doesn't treat site-packages
like a python package, which makes sense to me.
That said: @lieryan I saw you approved the PR. Can you merge this?
@bagel897 if you have any concerns related virtual environments, can we please address this in a separate issue/PR?
Autoimport scans for packages in the python path
In this example, we can see both ../.venv/lib64/python3.12/site-packages
and also the cwd ''
If we detect site-packages, that means a directory called site-packages in the search path - which means a failure to detect venvs. We shouldn't be searching the venv directory from the project root. The current method is to exclude directories prefixed with a ., which is insufficient.
Here's another failure example:
https://github.com/python-rope/rope/actions/runs/6739504275/job/18321101941?pr=516
Since it searches the venv, it indexes its contents twice - under the Lib
package. We need to exclude all instances of site-packages anywhere in the package or module path. Maybe the solution is to exclude it both at the package and module level?
@bagel897 @lieryan can we discuss this in a follow up issue and merge this PR? IIUC, my Pr provides an incremental improvement, while not causing any regressions, so I'd not want to delay this improvement unnecessarily.
Hi @tkrabel-db, thanks for creating this PR. I'm currently travelling on vacation, so my availability to look into rope issues and review PRs will be limited until December.
So don't worry, but I hadn't forgotten about this 🙂, but I won't be able to action the PR or make a new release until December.
@bagel897 coming back to your comment:
We need to exclude all instances of site-packages anywhere in the package or module path
I tested my code creating a virtual environment using venv
and I am fairly confident that my PR does exactly that. It may help looking at my problem statement again.
There, you can see we have duplications of the form (and only of the form)
Name(name=<import_name>, modname=<mod_name>, package=<package>, ...)
Name(name=<import_name>, modname="site-packages."<mod_name>, package='site-packages', ...)
Let's use an example to facilitate the conversation.
Name(name='BaseName', modname='jedi.api.classes', package='jedi', source=<Source.SITE_PACKAGE: 4>, name_type=<NameType.Class: 7>)
Name(name='BaseName', modname='site-packages.jedi.api.classes', package='site-packages', source=<Source.SITE_PACKAGE: 4>, name_type=<NameType.Class: 7>)
Here, you see that we attribute the BaseName
object to jedi
in the first entry - the correct match. In the second entry, however, we attribute the same object to a non-existing package we call site-packages
.
The change I propose (with your help of pointing me to the correct util; thank you!) captures all cases in which we attribute any object/module to be of root (i.e. the "package") site-packages
. This means it also captures the failure case you have eluded to earlier, namely ('from Lib.site-packages.pytoolconfig.sources import pytool', 'pytool')
.
Again, I tested my code and found that it also captures the case you presented.
If you think my PR still has some gaps, pls show them in a reproducible example based on this PR. And thank you for engaging in this PR! It's truly helpful having the original author around :)
Thanks for fixing this issue @tkrabel
Description
Remove
site-packages
from the python package search tree to avoid duplicate entries in the autoimport database. This also speeds up generating the modules cache by a factor of ~2.Fixes #722
Checklist (delete if not relevant):