scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.
https://scrapy.org
BSD 3-Clause "New" or "Revised" License

Make the build reproducible #5019

Closed lamby closed 2 weeks ago

lamby commented 3 years ago

Whilst working on the Reproducible Builds effort, I noticed that scrapy could not be built reproducibly.

This is due to the documentation embedding the current build year in the generated files, which makes the build output vary depending on when it is built. The fix is to use SOURCE_DATE_EPOCH if it is exported in the surrounding environment.
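As a rough illustration of that idea (not necessarily the exact change in this PR), a Sphinx `conf.py` could derive the year from SOURCE_DATE_EPOCH, which holds seconds since the Unix epoch, and fall back to the current time only when the variable is unset. The names `build_year` and `copyright_line` are made up for this sketch.

```python
import os
import time
from datetime import datetime, timezone


def build_year(environ=os.environ):
    """Return the year to embed in generated files.

    Hypothetical helper: prefers SOURCE_DATE_EPOCH (seconds since the
    Unix epoch) so that rebuilding the same source gives the same year.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    # Fall back to the wall clock only when the variable is not exported.
    timestamp = int(epoch) if epoch is not None else time.time()
    # Use UTC so the result does not depend on the builder's time zone.
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).year


# In a real Sphinx conf.py this string would be assigned to `copyright`.
copyright_line = f"2008–{build_year()}, Scrapy developers"
```

With `SOURCE_DATE_EPOCH=1609459200` exported, `build_year()` yields 2021 regardless of when the build actually runs.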

I originally filed this in Debian as bug #983852.

bmwiedemann commented 3 years ago

Auto-updating the copyright year at build time is bad practice anyway: building today's software next year would produce "copyright 2008–2022, Scrapy developers" without any developer having done anything in 2022.

So the better fix could be to state "copyright 2008" or "copyright 2008-2021".

Gallaecio commented 3 years ago

I agree that it’s a bad practice, only better than not updating it at all.

The best I can think of at the moment is to go for a pre-commit hook that updates the year if it is out of date. But then you need to get contributors to install pre-commit.

Alternatively, we could use https://github.com/c4urself/bump2version/issues/133 once implemented to make sure we update the year before every release.
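For what it's worth, the pre-commit idea could be sketched roughly as below. This is only an assumption about how such a hook might work: `update_year` and the regular expression are invented for illustration, and a real hook would still need to read the file, write it back, and exit non-zero when something changed.

```python
import re
from datetime import date

# Matches "2008–YYYY" or "2008-YYYY" (en dash or plain hyphen).
YEAR_RANGE = re.compile(r"(2008[–-])(\d{4})")


def update_year(text, current_year=None):
    """Return `text` with the end year of the copyright range refreshed.

    Hypothetical helper: `current_year` defaults to today's year, which
    is fine in a commit hook because it runs at commit time, not at
    build time.
    """
    year = current_year if current_year is not None else date.today().year
    return YEAR_RANGE.sub(lambda m: f"{m.group(1)}{year}", text)


line = "copyright 2008–2021, Scrapy developers"
updated = update_year(line, current_year=2022)
# A hook would report a failure here if `updated != line`.
```

The year then changes only when someone commits, so the built documentation no longer depends on the build date.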

Gallaecio commented 3 years ago

Closing and reopening to re-run the tests with the latest changes from master.

codecov[bot] commented 3 years ago

Codecov Report

Merging #5019 (c74703c) into master (f95ebd8) will decrease coverage by 0.35%. The diff coverage is 90.31%.

:exclamation: Current head c74703c differs from pull request most recent head 34ea7b8. Consider uploading reports for the commit 34ea7b8 to get more accurate results

@@            Coverage Diff             @@
##           master    #5019      +/-   ##
==========================================
- Coverage   88.01%   87.65%   -0.36%     
==========================================
  Files         158      162       +4     
  Lines        9726    10301     +575     
  Branches     1433     1501      +68     
==========================================
+ Hits         8560     9029     +469     
- Misses        911     1000      +89     
- Partials      255      272      +17     
| Impacted Files | Coverage Δ |
| --- | --- |
| scrapy/utils/log.py | 89.24% <ø> (ø) |
| scrapy/pipelines/images.py | 90.35% <80.00%> (-1.47%) :arrow_down: |
| scrapy/core/http2/protocol.py | 83.41% <83.41%> (ø) |
| scrapy/core/downloader/contextfactory.py | 87.03% <84.61%> (-2.97%) :arrow_down: |
| scrapy/core/http2/stream.py | 91.37% <91.37%> (ø) |
| scrapy/core/downloader/handlers/http11.py | 93.68% <95.00%> (+1.22%) :arrow_up: |
| scrapy/core/http2/agent.py | 96.38% <96.38%> (ø) |
| scrapy/core/downloader/handlers/http2.py | 100.00% <100.00%> (ø) |
| scrapy/selector/unified.py | 100.00% <100.00%> (ø) |
| scrapy/signals.py | 100.00% <100.00%> (ø) |

... and 16 more