snowballstem / snowball

Snowball compiler and stemming algorithms
https://snowballstem.org/
BSD 3-Clause "New" or "Revised" License
748 stars 173 forks source link

Added python_requires for Python package; Removed Python 2.6 #192

Closed andy-maier closed 7 months ago

andy-maier commented 8 months ago

For details, see the commit message. This addresses issue #191.

I did not know where to update the change history, as the NEWS file seems to have only the changes for released versions, but not for those in development. I think something like this should be added to the NEWS file once released:

Python
-------

* Specified supported Python versions via `python_requires` so that they can be
  recognized by Python installers.

* Removed Python 2.6 from the versions documented in the Trove classifiers.
ojwb commented 8 months ago

Note there was previous discussion of this in #158 where we concluded adding this probably wasn't actually useful, but I'm happy to revisit.

3.3 is meant to be supported, see doc/libstemmer_python_README:

Python 3 (>= 3.3) is supported. We no longer actively support Python 2 as the Python developers stopped supporting it at the start of 2020. Snowball 2.1.0 was the last release to officially support Python 2.

Looks like 3.3 is missing from the trove classifiers - seems it was just missed in #95 and nobody has noticed until now. Maybe that PR just went off what we were testing in CI at the time.

FWIW I believe the only reason it doesn't work with earlier Python 3 is that we generate a u prefix on Unicode string literals to produce code that works with both Python 2 and Python 3, but Python 3 didn't allow that until 3.3.

The status of Python 2 support is that we're no longer actively supporting it (in particular because the CI testing infrastructure no longer supports an Ubuntu version which has a Python 2 package), but we would likely still accept clean patches, and as far as we know it still works so I think it's appropriate to list it here. Snowball is a project that's going to be at the bottom of dependency chains so we should be conservative about dropping support for older versions.

I don't really see a good reason to omit 2.6 given it should still work. The probability it's still being used is likely low and declining, but there's no good reason to deliberately prevent it working by specifying a tighter dependency than reality assuming >= 2.6 works.

I generally update NEWS before each release based on the git log entries since the last release tag, so writing good commit messages should be enough for that.

ojwb commented 7 months ago

Looks like 3.3 is missing from the trove classifiers

I've now fixed that.

ojwb commented 7 months ago

@mitya57 What are your thoughts?

You raised adding python_requires before in #158 with the rationale:

this will cause pip to install older versions of snowball on older versions of Python.

As noted in the old ticket, I believe the only restriction is due to generating string literals with a u prefix e.g. u"foo" which Python 3 didn't support until Python 3.3.0, and all versions of Snowball with Python support actually support the same set of Python versions. We seem to have concluded before not to bother but it does seem cleaner to have this explicitly specified.

It seems the accurate specification would be this (since actually any Python 2 should work):

python_requires='!=3.0.*, !=3.1.*, !=3.2.*'

As things are now if someone did try to install snowball using pip on e.g. Python 3.2 they would get the latest version installed but it will fail to work. AIUI with this change they would instead get a version of snowball from before we made this change which would still fail to work. This seems a bit of an odd outcome but realistically I don't think it's a case we need to worry greatly about at this point (3.2.x is approaching 8 years past EOL).

mitya57 commented 7 months ago

I don't think anyone cares about Python 2.6 in 2024, so it should be absolutely fine to merge this.

We can also go one step further and declare dependency on >= 3.4. People who still use Python 2.7 can probably stick to snowball version that they currently have installed.

Or we can go two steps further and do what I do in my projects: support only those Python versions that can be tested automatically, i.e. in GitHub Actions or another CI system.

ojwb commented 7 months ago

As I said above already, I don't see a reason to deliberately prevent installation on older versions just because they're old. Better to document what we understand the requirements to actually be rather than try to enforce something stricter.

Sure there's no hard guarantee that new versions of Snowball will generate code that works with Python 2 (similarly for Python 3 versions we can no longer easily test in CI) but Snowball's generated code is pretty straightforward Python that isn't likely to stop working with older versions without a deliberate decision to change the generation to make that happen - e.g. we could change to generate literal strings as "foo" instead of u"foo" and/or slightly simplify the code generated for integer division (which none of the shipped algorithms uses anyway), but neither change has any tangible benefit.

I'm going to infer from your reply that you're happy with the slightly odd "installs an older non-working version" side-effect, since all your suggested options have that too.

I also think the change as proposed isn't quite right as it just seems to set a variable whereas the documentation says this should be a named parameter in the setup(...) call.

I'll push a change to specify python_requires='!=3.0.*, !=3.1.*, !=3.2.*' as a named parameter.