peterbe / hashin

Helping you write hashed entries for packages in your requirements.txt
https://www.peterbe.com/plog/hashin
MIT License

Hashin support for not distinguishing between underscored/hyphenated #116

Closed caphrim007 closed 4 years ago

caphrim007 commented 4 years ago

PyPI appears to follow the guidelines outlined in PEP 8 around package names, preferring to hyphenate them rather than underscore them.

https://www.python.org/dev/peps/pep-0008/#package-and-module-names

""" Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged. """

Note there is a distinction between a module and a package. hashin is operating on packages.

Regardless of the naming choices, PyPI appears to support both underscores and hyphens in package names. This is likely done so as to prevent major breaking changes across existing software.

Consider, for example, if pypi drew a hard line on this, switched all package names to hyphens, and disallowed underscores; a tidal wave of broken software would exist based on this decision.

So it looks like PyPI supports both and (may?) support both forever. However, when you use hashin to get the package of choice, you're always returned the "correct" one; i.e., the hyphenated version.
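For reference, PEP 503 describes a name normalization under which runs of `-`, `_`, and `.` are treated as equivalent, which would explain why both spellings resolve to the same project. A minimal sketch of that rule (the `normalize` helper name is only for illustration and is not taken from hashin):

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    # Collapse any run of "-", "_" or "." into a single hyphen, then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


# Both spellings refer to the same project once normalized.
assert normalize("python_memcached") == normalize("python-memcached")
```

Comparing normalized names only says the two spellings are the same project; it does not say which spelling PyPI lists.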

There exists a case where a user may have an existing (underscored) package definition, and hashin should behave to correct this to use the hyphenated version.

For instance, if the user's requirements file has the following (hashes truncated):

python_memcached==1.59 \
    --hash=sha256:4dac64916871bd35502 \
    --hash=sha256:a2e28637be13ee0bf1a8

Then upon running either of the following

The expected change to the requirements file would be

python-memcached==1.60 \
    --hash=sha256:x00c64916871bd35502 \
    --hash=sha256:x02e28637be13ee0bf1a

Today, however, the behavior is that hashin will do the following.

This issue proposes that hashin normalize this behavior to prefer (in all cases) the hyphenated name. Therefore,

caphrim007 commented 4 years ago

@peterbe lemme know if this captures what we were talking about over here https://github.com/renovatebot/renovate/issues/2444

peterbe commented 4 years ago

I know about python-memcached, but are there any packages on PyPI that are listed there with an underscore? E.g. if there was a https://pypi.org/project/some_lib/ and our users ran hashin some-lib.

Perhaps, judging from your description, that does not exist.
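One way to check how a given project is listed would be to ask PyPI's JSON API for its `info.name`. The sketch below assumes the endpoint `https://pypi.org/pypi/<project>/json`; the `listed_name` function and its `fetch` parameter are hypothetical names introduced here so the lookup can be tried without network access.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download and decode a JSON document (performs a real HTTP request)."""
    with urlopen(url) as response:
        return json.load(response)


def listed_name(project, fetch=fetch_json):
    """Return the project name as reported by PyPI's JSON API.

    `fetch` is an assumption of this sketch: any callable that takes a URL
    and returns the decoded JSON, so a fake can be swapped in for testing.
    """
    data = fetch("https://pypi.org/pypi/{}/json".format(project))
    return data["info"]["name"]
```

Calling `listed_name("some-lib")` and comparing the result with what the user typed would show whether the listed spelling differs.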

peterbe commented 4 years ago

I'm not convinced we should change the names of the packages if the version isn't different. It feels rather disruptive. E.g. you had python_memcached==1.59 ... in your requirements.txt; what should happen if you run hashin python-memcached? (Because as of today, version 1.59 is already the latest version.)

I'm thinking we only correct (the name of the package in) requirements.txt IF the version changes. That's going to cause fewer git diffs for people.

However, and I think we're in agreement, if it turns out that there's a new version of python-memcached, version 1.60, then after running hashin -u -i or hashin python-memcached or hashin python_memcached the new requirements.txt should become:

-python_memcached==1.59 \
-    --hash=sha256:y10c64916871bd3550a \
-    --hash=sha256:y12e28637be13ee0bf1b
+python-memcached==1.60 \
+    --hash=sha256:x00c64916871bd35502 \
+    --hash=sha256:x02e28637be13ee0bf1a

caphrim007 commented 4 years ago

ahh. good point. I agree. if the version changes, then we would upgrade the name. otherwise, leaving it alone would be sufficient.
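The rule agreed on here can be sketched as a small decision function. This is only an illustration of the rule, not hashin's code; `name_to_write` and its parameters are hypothetical names.

```python
import re


def normalize(name):
    """PEP 503 style normalization: "-", "_" and "." runs become one hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()


def name_to_write(existing_name, existing_version, pypi_name, new_version):
    """Pick the name to write back to requirements.txt.

    Keep the user's spelling when the version is unchanged (no noisy diff);
    switch to the name PyPI reports only when the version changes.
    """
    if normalize(existing_name) != normalize(pypi_name):
        raise ValueError("names do not refer to the same project")
    if existing_version == new_version:
        return existing_name
    return pypi_name
```

With `python_memcached==1.59` already pinned, a run that still resolves to 1.59 would keep `python_memcached`, while a run that resolves to 1.60 would write `python-memcached`.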

caphrim007 commented 4 years ago

Some other packages which are like this are readme_renderer and websocket_client.

peterbe commented 4 years ago

@caphrim007 Do you have the intention to take a stab at writing a pull request for this?

caphrim007 commented 4 years ago

@peterbe I do.

I spent some time looking at it the other day and think I can provide something for review within the next day or two. Happy to adjust it per your review.

At first look, my guess is that the majority of the change is within the amend_requirements_content function, since it is called only after hashin has determined that something has changed.
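As a rough idea of what such a change might involve, here is a sketch that replaces an existing entry while matching either spelling of the name. It is not the body of amend_requirements_content; `replace_entry` is a hypothetical helper, and it assumes the caller has already decided that the version changed and that entries look like `name==version \` followed by indented `--hash=` lines.

```python
import re


def normalize(name):
    """PEP 503 style normalization: "-", "_" and "." runs become one hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()


def replace_entry(content, pypi_name, new_entry):
    """Replace the pinned entry for pypi_name, matching either spelling."""
    out = []
    skipping = False  # True while dropping hash lines of the old entry
    for line in content.splitlines():
        if skipping and line.lstrip().startswith("--hash="):
            continue
        skipping = False
        match = re.match(r"^([A-Za-z0-9._-]+)==", line)
        if match and normalize(match.group(1)) == normalize(pypi_name):
            # Write the new entry in place of the old one.
            out.append(new_entry.rstrip("\n"))
            skipping = True
            continue
        out.append(line)
    return "\n".join(out) + "\n"
```

Entries for other packages are passed through untouched, so only the lines belonging to the matched package would show up in a diff.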

caphrim007 commented 4 years ago

@peterbe ready for review

caphrim007 commented 4 years ago

@peterbe ping

peterbe commented 4 years ago

@peterbe ping

Back from 2 days of vacation, and Monday is my meetings day.