monperrus / crawler-user-agents

Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. Pull requests welcome :star:

Add and use `requirements.txt` #357

Closed. ericcornelissen closed this 5 months ago

ericcornelissen commented 5 months ago

NOTE: I'm not very experienced with Python dependency management and pip. My goal here is similar to #351 but for pip.

This adds a `requirements.txt` file that pins the project's Python dependencies, both direct and transitive. It was generated in a clean Docker image by first running `pip3 freeze` to record the pre-installed (irrelevant) packages, then running the existing `pip3 install jsonschema pytest` command from CI, and finally running `pip3 freeze` again and omitting everything from the first list.
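For reference, that procedure corresponds roughly to the following shell session; this is a minimal sketch, assuming a stock `python:3` image (the comment above only says "a clean Docker image"):

```sh
# Start from a clean Python image (the image tag is an assumption).
docker run -it --rm python:3 bash

# 1. Record the packages that ship with the image (irrelevant to this project).
pip3 freeze > /tmp/baseline.txt

# 2. Install the project's dependencies the same way CI previously did.
pip3 install jsonschema pytest

# 3. Freeze again and drop the baseline packages, leaving only this
#    project's direct and transitive dependencies.
pip3 freeze | grep -vxFf /tmp/baseline.txt > requirements.txt
```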

Both CI workflows have been updated to install dependencies from the `requirements.txt` file. Besides improving reproducibility, this also avoids duplicating the dependency list across workflows; the install step reduces to a single command, as sketched below.
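In each workflow, the previous `pip3 install jsonschema pytest` step presumably becomes something like:

```sh
# Install the pinned direct and transitive dependencies in one step.
pip3 install -r requirements.txt
```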

The benefit is that the same versions of the Python dependencies will always be used for this project (assuming pip respects the pins). If the registry is trusted, you can also be sure the same source code is always run; however, since the file records no local checksums, that guarantee does not hold against an untrusted registry.
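That gap could be closed with pip's hash-checking mode, which verifies every downloaded package against a recorded digest. A minimal sketch, assuming hashes were added to `requirements.txt` (the entry below is hypothetical; `<version>` and `<digest>` are placeholders, not real values):

```sh
# Hypothetical requirements.txt entry carrying a checksum:
#   jsonschema==<version> --hash=sha256:<digest>
#
# With --require-hashes, pip refuses to install any package whose
# digest is missing or does not match, regardless of what the
# registry serves.
pip3 install --require-hashes -r requirements.txt
```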

monperrus commented 5 months ago

thanks a lot @ericcornelissen