ICRAR / crc32c

A python package implementing the crc32c algorithm in hardware and software
GNU Lesser General Public License v2.1
41 stars 25 forks source link

Add a hashlib-esque wrapper class + slight general cleanup which was necessary #50

Closed jonded94 closed 3 months ago

jonded94 commented 3 months ago

Now that this project is implemented as a python package (instead of a single-file distribution), I thought about adding some nice pythonic sugar on to it to make using it a bit nicer.

Internally, we're using this project to calculate hashes of large files and wrap it in something which already basically conforms to the interfaces given by the stdlib library hashlib.

This means that there are some wrapper classes with helper functions such as digest, hexdigest, update, .. More info can be found here:

Instead of implementing this purely on our side, I asked myself why not to bring it upstream. The tests already are using modern pytest syntax and I also included additional type information in the stub file.

Let me know what you think about it.

Summary by Sourcery

Introduce a Crc32cHash class with a hashlib-like interface for CRC32C hashing, enhance type annotations, and add comprehensive tests for the new class.

New Features:

Enhancements:

Tests:

sourcery-ai[bot] commented 3 months ago

Reviewer's Guide by Sourcery

This pull request introduces a new Crc32cHash class that wraps the crc32c functionality with a hashlib-like interface. The class includes methods for updating data, generating digests, and copying instances. Corresponding type annotations have been added. Additionally, comprehensive tests have been implemented to ensure the correctness of the new class.

File-Level Changes

Files Changes
src/crc32c/__init__.py
src/crc32c/__init__.pyi
Introduced a new Crc32cHash class with methods and properties to wrap the crc32c functionality, and added corresponding type annotations.
test/test_crc32c.py Added comprehensive tests for the new Crc32cHash class, including basic property checks, copy functionality, and specific value tests.

Tips - Trigger a new Sourcery review by commenting `@sourcery-ai review` on the pull request. - Continue your discussion with Sourcery by replying directly to review comments. - You can change your review settings at any time by accessing your [dashboard](https://app.sourcery.ai): - Enable or disable the Sourcery-generated pull request summary or reviewer's guide; - Change the review language; - You can always [contact us](mailto:support@sourcery.ai) if you have any questions or feedback.
jonded94 commented 3 months ago

Okay, so, as you can see, a bit more commits flew in.

I took the "bold" decision to really drop support for Python <3.7 because it enables quite a bunch of much cleaner implementation structures.

Note that Python 3.6 is EOL since 2021-12-23 (and Python 3.7 is EOL since 2023-06-27, but supporting it is not that hard), so I think this really is not a too bad thing to do. Rocking a Python version this old is bad even just from a security perspective.

This enabled me to use stub files only for the shared library part which simply has no other means of type hinting. All the other Python code is type hinted directly.

The collections.abc.Buffer thing is solved by using typing_extensions, but behind a if typing.TYPE_CHECKING block. So no runtime dependency was introduced here.

You will notice that there are some other changes that are more of a general cleanup. I'd really like to see them included and not again be moved out. Separating every little change in separate PRs is just of no value here (IMHO), compared to just do one, slightly larger cleanup in one go.

EDIT: Notice that mypy, with the changes I introduced, is now also able to test the entire src and test directory instead of a single file.

jonded94 commented 3 months ago

I'm a bit unsure about this .travis.yml file. AFAIK, it's not used anywhere. I touched it in this PR, but can it just be removed entirely?

rtobar commented 3 months ago

@jonded94 thanks for the further updates.

Indeed I would ideally have requested some of changes to come in separate PRs -- at least the general black + cleanup, and the Python 3.7 update. Unlike you, I do see value in doing these separately, as it allows discussions and changesets to be more focused, and helps with better bookkeeping. You are moving fast though, are very eager, and I'm currently on parental leave, so I can't act as fast as I'd like to. I thus prefer to be practical and don't hinder progress.

I see nothing wrong with the further changes in principle. The push for 3.7+ is fine, and indeed I did ponder about doing it when dropping 2.7 the other day, but seeing how we still only had the C extension to annotate there was no further gain from restricting 3.x versions. I'll again take the liberty to force-push some small things (again, just trying to be practical). For instance, now that there's formatting, CI should enforce it, otherwise there's no point. I'll also add some text to the CHANGELOG and README, fix some minor things around docs, things like that. I'll also delete the .travis.yaml file, it hasn't been used since we moved from Travis to GHA.

rtobar commented 3 months ago

OK, I've addressed the points I mentioned above and force pushed, see diff in https://github.com/jonded94/crc32c/compare/e8451c4..add-hashlib-esque-wrapper-class. I re-arranged commits slightly so they are self-contained, and in a way that changes are introduced in the correct order (declare 3.7+ compatibility first, then use 3.7+ features). Other than what I mentioned before, I also removed the __all__ declaration in __init__.py, and opted for using the from X import Y as Y that makes mypy happy regarding names being exported, while removing the ability to do from crc32c import *.

Let me know if you're happy with the changes, or if there's something else you think is missing. If happy, I'll merge and release.

jonded94 commented 3 months ago

Hi @rtobar, thank you very much for your very fast responses and much effort that goes into these PRs!

I re-arranged commits slightly so they

Sorry that you had to do this, I usually push a lot of WIP commits, then let the review happen, implement necessary changes and then squash/split up into self-contained commits. Thank you for cleaning up my work!

This all looks great, very happy to have this in. Thank you for the great, productive time doing this!

Hope you can concentrate on your probably well deserved parental leave time.