Open avelin opened 1 week ago
Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey. Please be sure to review our Code of Conduct. Also, check out some of our community resources including:
There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar. If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!
Description
Hi,
First of all, thanks for writing and maintaining SaltStack!
The problem with Python package names in general
Historically, there have always been some issues regarding the several different types of names for Python packages:
The main issue revolves around the need for at least the last kind of name - the one used for files and directories on the filesystem - to only contain characters valid for a Python identifier, so that an
import
statement will be able to handle it. The only "special" characters allowed there are the dot and the underscore, with even the dot posing problems in some OS environments. Over the years, the various package installers (e.g. pip) and package repositories (e.g. PyPI, devpi, etc.) have allowed the package metadata and the package repositories to specify names with additional characters, most notably dashes, and sometimes provide (more or less automatically) some kind of correspondence or a mapping algorithm to "normalize" these names to ones only containing alphanumeric characters, underscores, and possibly dots.PEP 503 - the recommended normalization algorithm
Back in 2015, a Python Enhancement Proposal - PEP 503 - was accepted that defines and strongly recommends what has become more or less the canonical algorithm for the normalization of Python package names. That algorithm is succintly explained in the Python Packaging User Guide which also provides the sample Python implementation from the PEP and recommends that this normalization be performed on any package names (specified by the user, fetched from a package index, extracted from a package metadata, or listed as installed by a local package manager) before any comparisons are made and any decisions are taken whether to install or upgrade any of them.
The problem with Python package names in SaltStack
The
pip.installed
state allows Python modules to be installed from a variety of sources, including PyPI. If apip.installed
state specifies in its list of packages a name that may be found in the source, but is not the same as the name from the package metadata, then the state will download the package, invoke thepip
tool to install it, but will then be unable to verify the installation: a warning will be returned saying that the package could not be found after the installation. Even worse, in later runs thepip.installed
state will not be able to find the package among the ones installed, so it will needlessly download and install it again and also report its own status as "changed", so later states may also perform unnecessary work.The disconnect comes from the fact that the user-supplied package name is not the same as the one obtained from asking the
pip
tool for the list of installed packages.An example of that may be seen with the following Salt configuration:
The proposed solution - normalize package names in several places
One possible solution to this problem is to modify both the
pip
Salt module and thepip.installed
Salt state as follows:normalize
function to thepip
module that converts the supplied package name string to the canonical form as stated in PEP 503 and in the Python Packaging User Guidepip.list
always return normalized package names (this is a change to the current behavior!) and also normalize the names before comparing them if a prefix is specifiedpip.installed
state normalize the supplied package names and look for their canonical forms in thepip.list
output (that is already normalized) so that the packages will be found if already installed and will also be found after they are installed or upgraded as necessaryA minor drawback of this proposed solution is that any modules or states that use the list of package names returned by
pip.list
may now see slightly different output. However, I believe that this should not cause any real problems, at least not any more than in the current situation - right now, the handling of Python modules with discrepancies between the different kinds of names is already suboptimal.Setup
Steps to Reproduce the behavior
Comment: There was no error installing package 'requests_ntlm' although it does not show when calling 'pip.freeze'.
Expected behavior
The package is installed with no warnings.
Versions Report
salt --versions-report
```yaml Salt Version: Salt: 3006.9 Python Version: Python: 3.10.14 (main, Jun 26 2024, 11:44:37) [GCC 11.2.0] Dependency Versions: cffi: 1.14.6 cherrypy: unknown cryptography: 42.0.5 dateutil: 2.8.1 docker-py: Not Installed gitdb: Not Installed gitpython: Not Installed Jinja2: 3.1.4 libgit2: Not Installed looseversion: 1.0.2 M2Crypto: Not Installed Mako: Not Installed msgpack: 1.0.2 msgpack-pure: Not Installed mysql-python: Not Installed packaging: 22.0 pycparser: 2.21 pycrypto: Not Installed pycryptodome: 3.19.1 pygit2: Not Installed python-gnupg: 0.4.8 PyYAML: 6.0.1 PyZMQ: 23.2.0 relenv: 0.17.0 smmap: Not Installed timelib: 0.2.4 Tornado: 4.5.3 ZMQ: 4.3.4 System Versions: dist: debian 12 bookworm locale: utf-8 machine: x86_64 release: 6.1.0-23-amd64 system: Linux version: Debian GNU/Linux 12 bookworm ```