Open s-t-e-v-e-n-k opened 1 year ago
I wonder whether https://github.com/openSUSE/python-rpm-macros/pull/151 is involved somehow.
@Vogtinator, @bnavigator, and @bmwiedemann might be interested as well.
This should support existing spec files since calling fdupes a second time should result in no changes
There could be some edge cases like %fdupes -s converting hardlinks to symlinks, but I recommend against %fdupes -s anyway...
I think no Python spec files use %fdupes -s anymore.
Just for the record (none of these are very relevant to anything):
milic~/b/spec_factory$ rg -l 'fdupes -s' python-*
python-youtube-dl.spec
python-sip6.spec
python-pytest-expect.spec
python-pymediainfo.spec
python-OWSLib.spec
python-mkdocs.spec
python-livereload.spec
python-gphoto2.spec
python-ghp-import.spec
python-cangjie.spec
python-apipkg.spec
milic~/b/spec_factory$
However, this is really not very relevant to this discussion (delegated to Trello).
python-sip6.spec
That has %fdupes -s doc and is irrelevant for Python.
In recent years, timestamps are mostly not a problem for .pyc files, because rpmbuild normalizes mtime via %clamp_mtime_to_source_date_epoch Y and .pyc headers usually default to checksum mode with SOURCE_DATE_EPOCH set.
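For anyone who wants to check the checksum-mode claim on their own interpreter, here is a minimal sketch. It assumes Python 3.7 or newer (PEP 552 header layout) and uses a throwaway module named `mod.py`, which is purely illustrative. It compiles the same file with and without SOURCE_DATE_EPOCH in the environment and returns the flags word from the .pyc header.

```python
import os
import py_compile
import struct
import tempfile


def pyc_flags(source_date_epoch=None):
    """Compile a throwaway module and return the flags word of its .pyc header."""
    saved = os.environ.pop("SOURCE_DATE_EPOCH", None)
    try:
        if source_date_epoch is not None:
            os.environ["SOURCE_DATE_EPOCH"] = str(source_date_epoch)
        with tempfile.TemporaryDirectory() as tmp:
            src = os.path.join(tmp, "mod.py")  # hypothetical module name
            with open(src, "w") as f:
                f.write("x = 1\n")
            # No invalidation_mode given, so py_compile picks its default,
            # which depends on whether SOURCE_DATE_EPOCH is set.
            pyc = py_compile.compile(src, cfile=src + "c", doraise=True)
            with open(pyc, "rb") as f:
                header = f.read(16)
    finally:
        # Restore the caller's environment.
        os.environ.pop("SOURCE_DATE_EPOCH", None)
        if saved is not None:
            os.environ["SOURCE_DATE_EPOCH"] = saved
    # Header layout (PEP 552): 4 bytes magic, 4 bytes flags, 8 bytes payload.
    (flags,) = struct.unpack("<I", header[4:8])
    return flags


if __name__ == "__main__":
    print("without SOURCE_DATE_EPOCH:", pyc_flags())
    print("with SOURCE_DATE_EPOCH=1: ", pyc_flags(1))
```

A flags value of 0 means the header stores the source mtime and size; a value with bit 0 set means it stores a source hash instead, so the file's mtime no longer ends up in the .pyc.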
What is a problem are variations (e.g. from ASLR) that go into .pyc files. Because .pyc files are memory-dumps of internal python state, they are hard to make properly reproducible.
e.g.
> cat test.py
lext = {'png', 'gif', 'jpg', 'pcx', 'pnm', 'tif', 'xpm'}
> for i in $(seq 10) ; do setarch -R python3.10 -m py_compile test.py ; md5sum __pycache__/test.cpython-310.pyc
done|sort -u|wc -l
10
This is actually not directly about reproducible builds but about fixing a packaging issue: https://bugzilla.suse.com/show_bug.cgi?id=1207805
Bug is non-public
In recent years, timestamps are mostly not a problem for .pyc files, because rpmbuild normalizes mtime via %clamp_mtime_to_source_date_epoch Y and .pyc headers usually default to checksum mode with SOURCE_DATE_EPOCH set.
@bmwiedemann, is that also true for SLE/Leap? The py39 custom repos for 15.X builds always throw rpmlint errors because rpmlint doesn't like the mtimes of the .pyc files. I ignore them.
Bug is non-public
I am sorry about that, I tried to add you to the bug so it should be visible at least to you. Does it work?
I can read it now.
Taking @bmwiedemann's input into account, I suspect this is a SLE/Leap only issue for Python 3.6 generated .pyc files.
%#FLAVOR#_compile is already part of %#FLAVOR#_pyproject_install. No harm in adding %fdupes before that call, so that duplicate source files get the same mtime. Legacy %python_install would need to get a bit more attention.
Alternative suggestion: Add -n to the %fdupes calls. I think 99% of the cases where identical .pyc files get deduplicated are empty __init__.py files.
No, 99% of cases are identical files for opt-0 and opt-1 (which make rpmlint unhappy, as there are usually a lot of them).
But those always have the same mtime, and deduplicating them has never been a problem for reproducible builds.
Yeah, the issue is only about *.py files getting replaced by hardlinks. Replacing .pyc files with hardlinks is fine, their mtime is not used.
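To make the hardlink problem concrete, here is a minimal sketch under a few assumptions: timestamp-mode .pyc files (forced explicitly here), two identical empty sources with hypothetical names `a.py` and `b.py`, and a plain `os.link` standing in for what fdupes does when it replaces a duplicate with a hardlink. It checks whether the mtime recorded in the .pyc header still matches the source file before and after the replacement.

```python
import os
import py_compile
import struct
import tempfile


def recorded_mtime(pyc_path):
    """Return the source mtime stored in a timestamp-mode .pyc header."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    flags, mtime = struct.unpack("<II", header[4:12])
    assert flags == 0, "not a timestamp-based .pyc"
    return mtime


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.py")  # hypothetical duplicate sources
        b = os.path.join(tmp, "b.py")
        for path, mtime in ((a, 1_000_000_000), (b, 1_000_000_100)):
            with open(path, "w") as f:
                f.write("")
            os.utime(path, (mtime, mtime))

        mode = py_compile.PycInvalidationMode.TIMESTAMP
        pyc_b = py_compile.compile(b, cfile=b + "c", doraise=True,
                                   invalidation_mode=mode)
        fresh_before = recorded_mtime(pyc_b) == int(os.stat(b).st_mtime)

        # Stand-in for deduplication: b.py becomes a hardlink to a.py,
        # so it now shares a.py's inode and therefore a.py's mtime.
        os.remove(b)
        os.link(a, b)
        fresh_after = recorded_mtime(pyc_b) == int(os.stat(b).st_mtime)
        return fresh_before, fresh_after


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the header matches before the link and no longer matches afterwards, which is the inconsistency between the .pyc and the filesystem timestamp discussed here. Hardlinking the .pyc files themselves does not change anything in this check, since only the source mtime is compared.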
Trying to come up with a proof of concept here, but the problem is that if you run %fdupes and then use compileall, the hardlinks get replaced. Still trying to think of a good solution.
Really interested in your proof here, because your last comment directly contradicts the concept in your initial post (https://github.com/openSUSE/python-rpm-macros/issues/156#issue-1600814151).
Barely any package calls %fdupes before the Python compile step, and where one does, it is only fixable in the package spec file, not in the Python macros.
That was an evil idea, not a fait accompli -- I'm still trying things out, but I welcome your input.
We are currently bad at reproducible builds, since we run install, then run fdupes, and that results in inconsistent timestamps in the pyc files versus the filesystem timestamp. My evil idea:
This should support existing spec files since calling fdupes a second time should result in no changes, and we can drop them when we get around to them.