airspeed-velocity / asv

Airspeed Velocity: A simple Python benchmarking tool with web-based reporting
https://asv.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
866 stars 181 forks source link

BUG: Remove prepended entry from sys.path #1346

Closed ngnpope closed 8 months ago

ngnpope commented 12 months ago

Python prepends the directory of the script being executed to sys.path and this can cause other packages to be shadowed, e.g. the builtin statistics module being shadowed by asv.statistics because the path of the asv package is prepended when the benchmark runner script is executed. There was a prior fix for this but it was removed as part of refactoring and splitting into asv_runner.

Python 3.11+ has a mechanism to avoid this via use of -P on the command line or setting the PYTHONSAFEPATH environment variable. This doesn't really help though as support for older versions of Python is required. This patch does, however, check sys.flags.safe_path to avoid fixing up the path if this feature is being used.

This further highlights that the original approach is not particularly robust as it removed the first item unconditionally and sometimes tried to add it back in. Instead this patch checks for and removes the path only if it matches the directory that the runner script resides in.

Other alternatives could be to consider running with the -I command line flag to force isolated mode (Python 3.4+), although that might cause other unexpected breakage as it implies -E (which ignores all PYTHON* environment variables), etc.

See the following links for details:

Regression in 5ab2d29d039a080c3a3f48ecac65526842faa7f1.

ngnpope commented 12 months ago

The specific areas in the commit that removed this in error are here and here.

ngnpope commented 11 months ago

@HaoZeke Would it be possible to release v0.6.2 with this bug fix?

ngnpope commented 11 months ago

I don't think so. And it's not just asv.statistics -- that's just the one that first became an issue for me -- it's any module/package under asv that could be used as a top-level module/package in someone's project.

HaoZeke commented 11 months ago

I don't think so. And it's not just asv.statistics -- that's just the one that first became an issue for me -- it's any module/package under asv that could be used as a top-level module/package in someone's project.

It would still sound like a problem of (possible) relative paths? If all the paths under asv were explicit and with a clear API (e.g. from asv._bar import foo instead of from . import foo I think path manipulation should not be required. I will wait for some more inputs, perhaps from @mattip or others who may have taken part in the earlier design decision (@mdboom @pv @datapythonista @fangchenli ?)

ngnpope commented 11 months ago

It has nothing to do with relative paths or private packages. It's because the benchmark runner script is executed directly with something like python …/asv/benchmark.py … and Python automatically prepends the the directory the script is in to sys.path such that that directory is treated as a top-level location from which to import from.

Python 3.11+ supports disabling this behaviour as mentioned above, and it's sort of possible with other existing command line flags in earlier versions, but with extra restrictions that may be undesirable. So mangling the path is, unfortunately, the most compatible option.

Having said that, another potential solution I can think of is to move the benchmark script from …/asv/benchmark.py to …/asv_benchmark.py so it gets installed directly in site packages… That would likely still prepend tosys.path, but at least it'd be a sensible location 🤔

jorisvandenbossche commented 9 months ago

I am also running into this issue with the benchmarks of geopandas (where we import the stdlib statistics module). I agree with @ngnpope's analysis that's it's potentially any top-level file or module in asv that can shadow any other import, just because the directory of the installed asv package is added to the sys.path.

I don't think there is any good reason why you would want to have the asv package directory to be added to sys.path, so therefore I think this PR is a sensible solution (and also just essentially restoring what was done in older versions of asv, which has been working fine for years).

HaoZeke commented 8 months ago

Thanks for the fix @ngnpope. Going over this again I think it makes sense to restore the older behavior.