vmware-archive / salt-pack

Salt Package Builder
Apache License 2.0

salt repo broken due to hash sum mismatch #594

Closed ericrasmussen closed 4 years ago

ericrasmussen commented 5 years ago

http://repo.saltstack.com/py3/ubuntu/16.04/amd64/2018.3 is broken and I can no longer install salt-* packages.

Here is the error:

"Failed to fetch http://repo.saltstack.com/py3/ubuntu/16.04/amd64/2018.3/pool/main/p/python-tornado/python3-tornado_4.2.1-2~ds+1_amd64.deb Hash Sum mismatch"

ericrasmussen commented 5 years ago

Hi, I spent some more time looking into this and the mismatch was for the file usr/lib/python3/dist-packages/tornado/speedups.cpython-35m-x86_64-linux-gnu.so

It looks like you recompiled it and then repackaged everything under the same name and version. This caused a mismatch with the cached version I had in apt-cacher-ng.

Can you let me know how often you recompile or change files in packages while releasing them under the same name and version? This breaks caching for anyone that uses it.
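For anyone trying to confirm the same symptom locally, here is a minimal sketch of the comparison behind a "Hash Sum mismatch": hash the cached .deb and compare it with the SHA256 value listed for that file in the repository index. The function names are hypothetical, and this is not apt's own code path.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large .deb files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_index(path, expected_sha256):
    """True when the file on disk has the hash listed in the index."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

If `matches_index` returns False for a cached file whose name and version have not changed, the file was replaced upstream after it was cached.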

dmurphy18 commented 5 years ago

@ericrasmussen The packages are built via states/execution modules for various platforms. The builds attempt to reuse pre-existing packages, both for nightly builds and tagged release builds, so that previously built packages' hashes are unchanged, but there is no guarantee that they will not be rebuilt. For example, SaltStack is currently in the middle of moving from one infrastructure to another to build/test/package/test PRs/etc. Hence package hashes can change.

Note that rebuilding is expected to change hashes, since the tools used to build may have been updated due to CVEs, performance improvements, etc. Just think of new versions of gcc, libc, Python, etc.

The issue of package caching and changing hashes has been brought up before, but in order for that to be achievable, it would imply a highly static build environment, but even that would be affected by CVEs.

Long story short, SaltStack tries to utilize the same built packages between releases and keep changes to a minimum, but it is not guaranteed, best effort.

The tools for building the packages are open-sourced in the following GitHub repositories. https://github.com/saltstack/salt-pack https://github.com/saltstack/salt-pack-py3 https://github.com/saltstack/salt-auto-pack

Improvements and PRs gladly accepted.

ericrasmussen commented 5 years ago

Thanks for getting back to me.

This only affects caching when the contents of a package change but its name and version stay exactly the same. This doesn't tend to be an issue for ubuntu PPAs because in the case of CVEs or other fixes, they update the version.

If these packages are going to be regenerated regularly with different contents, it would be good to add a suffix to the version to indicate a new build (e.g. package foo is version 2.1 so you have 2.1-0 for the first build, 2.1-1 for the second, etc).
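A minimal sketch of the suffix scheme suggested above, assuming versions of the simple form `<software version>-<build number>`. The function name is hypothetical, and real Debian revisions such as `4.2.1-2~ds+1` are more involved than this handles.

```python
def next_build_version(version):
    """Increment the numeric build suffix after the last hyphen.

    Assumes the simple "<software version>-<build number>" form,
    for example "2.1-0". Anything else is rejected rather than guessed at.
    """
    software, sep, build = version.rpartition("-")
    if not sep or not build.isdigit():
        raise ValueError(f"no numeric build suffix in {version!r}")
    return f"{software}-{int(build) + 1}"
```

With this scheme the software version (2.1) stays fixed while every rebuild gets a new file name, so a cache never sees two different files under one name.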

But it sounds like this is mostly an unusual case caused by moving to a new infrastructure?

dmurphy18 commented 5 years ago

Actually the packages may get regenerated, but not with different contents: they are just built on a different machine, so the source has not changed and there is no need to change the version. The machine is instantiated in the cloud (using locked-down images from SaltStack for the platform to try to ensure consistency). Note that SaltStack tries to only build changes, unless specifically doing a clean build for everything (an extreme rarity - an example would be for a new branch or platform).

Changing the version just because the software is built on a different machine is not sustainable, nor a correct reflection of the contents of the package.

There are projects for building packages such that they are bit for bit the same independent of the machine, but they are in the development phase and are not ready for production use, indicating this is not just an issue for SaltStack.

Preference again is to try to utilize previously built packages in releases (best effort) but no guarantee.

ericrasmussen commented 5 years ago

If an ubuntu package is recompiled for any reason (e.g. security issue) and the source doesn't change, they still change the package name. They have suffixes to indicate different builds.

It's important to note that in this case the contents of python3-tornado did change because the speedups extension was recompiled. I realize you would not change the version of python3-tornado itself, but there still needs to be some indication that the contents of the package changed.

Again, this is not a problem with ubuntu packages because they do update the names even if the software version doesn't change.

Does that make sense? There needs to be some way to distinguish between software version and build version for cases like this.

I'm just not sure what you mean by not sustainable or correct. It's absolutely correct to provide some indication that the contents are no longer the same, even if the source hasn't changed. This is standard.

mswart commented 5 years ago

Once a binary package is published, it is never changed (hash changes), only eventually removed. This simple assumption is fundamental to deb repositories as used by Ubuntu and Debian. Violating this assumption might not break apt itself, but it breaks much of the software written around apt repositories.

If the repository is accessed via an apt repository cache like apt-cacher-ng, it will always deliver the old version without checking whether it was changed. This lets apt abort due to hash-mismatch errors.
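The failure mode can be illustrated with a toy model. This is a sketch of a cache keyed on the file path that never revalidates, not apt-cacher-ng's actual implementation; the class, the function names and the package path are all hypothetical.

```python
import hashlib


class PathKeyedCache:
    """Toy cache: keeps the first body seen for a path and never revalidates."""

    def __init__(self, upstream):
        self.upstream = upstream  # dict of path -> bytes, stands in for the repo
        self.store = {}

    def fetch(self, path):
        if path not in self.store:
            self.store[path] = self.upstream[path]
        return self.store[path]


def sha256(data):
    return hashlib.sha256(data).hexdigest()


def simulate():
    """Return True if the cached file still matches the refreshed index."""
    path = "pool/main/e/example/example_1.0-1_amd64.deb"  # hypothetical path
    upstream = {path: b"first build"}
    cache = PathKeyedCache(upstream)
    cache.fetch(path)  # a client installs the package; the cache keeps a copy

    # Upstream rebuilds and republishes under the same name and version.
    upstream[path] = b"rebuilt, same name and version"
    index_hash = sha256(upstream[path])      # hash the refreshed index would list
    served_hash = sha256(cache.fetch(path))  # hash of what the cache still serves
    return index_hash == served_hash
```

Here `simulate()` returns False: the index describes the rebuilt file while the cache keeps serving the first one, which is the mismatch apt reports.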

Repository mirroring and management tools like reprepro and aptly fail to publish such upstream changes, as this would require replacing an already published file.

Debian build tools like sbuild include shortcuts to build an unchanged source package but produce binary packages with an increased version (--append-to-version, --make-binNMU).

These binary rebuilds without a version change forced me into frequent workarounds, like temporarily removing all salt packages or only importing the salt packages themselves (in most cases only dependent packages are rebuilt without change, and they are often already part of the official repositories). The fact that the built packages have the same version for all distributions also prevents creating repositories that deliver salt packages for multiple distributions.

If the packaging infrastructure is not capable of preventing rebuilds directly, the builds could be suffixed with a unique identifier like a build id or timestamp.

dmurphy18 commented 5 years ago

@ericrasmussen @mswart Yes, I understand and stand corrected. Once the infrastructure move is completed I shall adjust the build framework; however, I cannot provide a time frame as to when the work will be completed. The build framework utilizes SaltStack's highstate, states and execution modules, so I have to work out how to achieve the desired outcome within it.

mswart commented 5 years ago

@dmurphy18 I am happy that this is treated as an open issue and will be fixed eventually.

dmurphy18 commented 5 years ago

@mswart Yes, thank you and @ericrasmussen for informing me as to correct operation.

The fix is in two parts:

  1. Prevent rebuilds from occurring when an already built package is available (using the published repo rather than the internal nightly build repo) - easy.

  2. Automate the version increment when a package is rebuilt - harder, but I had a project to update the automation for the Debian/Ubuntu family and I should be able to tie this into it.
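The two-part fix could be sketched roughly as below. This is an illustration under stated assumptions, not the actual salt-pack automation: `plan_build`, `published` and `force_rebuild` are hypothetical names, and versions are assumed to end in a plain integer revision such as `0.2.4-1`.

```python
def plan_build(name, version, published, force_rebuild=False):
    """Decide what to do for one package.

    `published` is a set of (name, version) pairs already in the public repo.
    Returns ("reuse", version) or ("build", version_to_build).
    """
    if (name, version) not in published:
        # Never published under this version: build it as-is.
        return ("build", version)
    if not force_rebuild:
        # Part 1: an identical version is already published, so reuse it.
        return ("reuse", version)
    # Part 2: a rebuild is required, so pick the next unused revision.
    candidate = version
    while (name, candidate) in published:
        upstream, _, revision = candidate.rpartition("-")
        candidate = f"{upstream}-{int(revision) + 1}"
    return ("build", candidate)
```

For example, with `published = {("example", "0.2.4-1")}`, calling `plan_build("example", "0.2.4-1", published)` reuses the published package, while passing `force_rebuild=True` plans a build of `0.2.4-2`. A published file is never replaced under either branch.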

dmurphy18 commented 5 years ago

Running into a number of issues which are preventing a solution currently with automation as it currently stands. Issue list is as follows:

  1. Copying from repo.saltstack.com/<apt|yum|py3>//version/arch/latest appears to work, but I am running into an issue with picking up and rebuilding for minor release versions. For example: timelib_0.2..4-1.dsc and timelib_0.2.4-2.dsc with the current debbuild.py. There is also the issue that timelib.0.2.4.orig.tar.gz will trip on timelib_0.2..4-1.dsc, when it is for timelib_0.2..4-2.dsc.

  2. There is a problem of copying older packages from repo.saltstack.com/<apt|yum|py3>//version/arch/latest which are no longer required. For example, on Debian 8, python-croniter_0.3.4-1.dsc was produced in older point releases, but is going to be removed for Debian 8.7 and above support since the OS already provides it. I need to adjust the automation to only copy packages present in versions/ from repo.saltstack.com so that older packages are not copied over. This still leaves the issues in item 1 to deal with.

dmurphy18 commented 5 years ago

This should be fixed with the next point releases where Debian and Ubuntu packages / build products should remain unchanged if no changes have occurred to the dependencies.

Moved to blocked, to test when the next point release occurs.

dmurphy18 commented 4 years ago

@ericrasmussen Can you check that this is fixed? It should have been with Salt 2019.2.2 in Oct 2019. I want to close this old issue.

ericrasmussen commented 4 years ago

The original issue was based on the salt PPAs serving the same filename with changed contents, so I'm not sure how I would verify it. As long as the build process was updated to avoid this scenario then I think we're all good. I haven't seen the problem again and I am good with the issue being closed.