Open calizarr opened 3 months ago
After a chat with with @calizarr here is the suggestion of a way to go:
I want neither https://github.com/aboutcode-org/python-inspector/pull/170 from @heliocastro that requires installing pip nor https://github.com/aboutcode-org/python-inspector/pull/181 from @calizarr that requires installaing pip and further spawns a subprocess
Instead the approach could be:
add environment variable to the command line such that it is easy to configure index URLs without changing how python-inspector is called (which helps ORT), using https://click.palletsprojects.com/en/8.1.x/arguments/#environment-variables . Note that we only support --index-url
for now and not --extra-index-url
and it needs some testing to see how we would handle multiple URLs with a single environment variable. The var could be named "PYTHON_INSPECTOR_INDEX_URL" or the shorter "PYINSP_INDEX_URL" already used in the test env vars.
we could use the pip.conf or pip.ini configuration files. For this we would need to: 1. add CLI options to either use the default locations from pip as detailed in https://pip.pypa.io/en/stable/topics/configuration/#location or use a specific pip.conf file as an option. we will need to be clear about precedence of the options and the environment variable above. To read the config files, we can use the approach of @heliocastro from https://github.com/aboutcode-org/python-inspector/pull/170 but not using pip as a library. Instead we could vendor/extract the few files we need, ideally keeping keep history with git-filter-repo. The files would be:
pip._internal.utils.appdirs
, pip._internal.utils.compat.WINDOWS
(one liner to copy), pip._internal.utils.misc.ensure_dir
and , pip._internal.utils.misc.enum
HI guys
At this draft pr, I started to try to solve this one. Please feel free to go over it and tell me if it is somehow in the right direction.
I migrated initial global settings to a global pydantic_settings object, which allows to expects variables with PYTHON_INSPECTOR prefix. Example:
PYTHON_INSPECTOR_INDEX_URL = <something>
I choose PYTHON_INSPECTOR because the clarity.
This move allows to have a .env
local file with the variables, have it as an environment variable, and or use the usual command line.
To match the current pip spec, I changed a little the behavior on how index_url was treated. By default, the pip.conf can contains only one single url, and was always overrides by the default one, others are summarily ignored. So now, only the defined index_url is considered, if none is defined, pypi one is the default.
The additional EXTRA_INDEX_URLS and respective command is now the multiple option where any additional indexes could be added.
@pombredanne Little comment, i was able to pass the tests only with regen.
Btw, if this approach is ok, I will add the parsing from pip.conf to fit this model as the one of the resources in chain. Would then be something like this order of priority overrides top to bottom:
.env
fileIf none of the options are set, the pypi will always be the main.
I've made this PR to add environment variables using Click's built-in mechanism for the index-url flag.
https://github.com/aboutcode-org/python-inspector/pull/187
This doesn't solve everything but it's a relatively easy built-in start.
That approach works too, but does not tackle all the entries where index is used, and it will cause another extra point to look at in the entire codebase. What i'm tried above, besides be more complex, is simply reduce the variable usage for not only the inspector indec, but for all code, in one single place.
Oh I understand, however, given that Click is being used for CLI commands, I think it's best to start there and then go from your draft PR.
I still think we need to have an option to use pip.conf and not use the PyPi simple index. My PR is just a start.
I don't see a reason to not take advantage of Click's features.
Agreed with that, but again, solving a single point, and again, what i want to avoid is exactly this, we solve one small place, then "later" we will continue. "Later" never comes, someone else will have a similar issue in different place of code, different developer comes and solve in his way, and again, we end up with code that not easy to understand, not consistent.
What i want is try to solve it right, starting to cleanup the code, not just add another plaster over the code that will not be easy to understand for anyone beyond this post.
In other way, your idea of have option to ignore pip.conf is good, and i thinking in doing the same for .python-version files.
But exemplifying on how a global settings object make sense, if we set this disabling options in the settings object, it will be easily accessible through the code, and not need to be passing click contexts or even work, global imported variables always hard to track if something is not working.
We can certainly attempt to make sure that later is now.
One can still gather all the settings in one place and then pass them along. In this case, Click is still handling flags and env vars from the CLI. However, the settings object can still import those and then use them along the way. They are not mutually exclusive.
This isn't quite another plaster over the code rather than adding one extra feature of Click's built-in functionality. If reading the code for resolve_cli.py then an understanding of Click and/or reading their documentation would be the best way to understand how the CLI flags are working and by extension the envvars.
The ability to ignore configuration files (as it does now) is good and should remain the default as that is what is expected at this point. However, a flag/envvar to enable configuration file parsing in addition to what is in your PR makes total sense to me.
I attempted to keep my PR as simple as possible without rewriting as much code. In the case of the resolve_dependencies
function, Click parses either the CLI flags (-/--
) or envvars (PYINSP_INDEX_URL
) and then passes them as arguments to that function, so there isn't a global variable hanging around. With a global settings object, instead of directly going to resolving dependencies, the function can then receive the arguments and place them in the global settings object and then go from there.
An example pip.conf with sensitive variables changed to look like BASH variables:
These are actually set using the
pip config set global.extra-index-url
so they are presumably following the correct convention.The file is located at
$HOME/.config/pip/pip.conf
following the XDG_CONFIG_HOME even when it is located in the legacy location$HOME/.pip/pip.conf
it is not respected either.The pip documentation in general.
EDIT: It also does not respect the relevant environment variables either