hauler-dev / hauler

Airgap Container Swiss Army Knife
https://hauler.dev
Apache License 2.0

[feature] Ability to merge Airgapped Hauler registries #237

Open PeeBee66 opened 4 months ago

PeeBee66 commented 4 months ago

Is this RFE related to an Existing Problem? If so, please describe: This feature request addresses the need to merge Hauler registries. For example, if I download version 1.0 of an application consisting of 20 containers, along with 10 core containers such as Redis, ELK, and RabbitMQ, and deploy them in an air-gapped environment using Hauler, I want the ability to seamlessly integrate version 1.1 containers with the existing registry. This avoids the need to transfer over 30 GB of data each time I update, improving efficiency and reducing data transfer overheads.

Describe Proposed Solution(s): The proposed solution is to enable registry merging in Hauler. This feature would allow users to integrate updated container versions with existing registry stores effortlessly. Hauler would need to identify and reconcile differences between versions, ensuring that only new or modified containers are transferred and integrated, thereby minimizing data transfer requirements.

Describe Possible Alternatives: Alternatively, a differential update mechanism could be implemented. This approach would involve transmitting and applying only the changes between versions to the existing registry. Additionally, supporting incremental updates or patching of containers within the registry could offer a more granular and efficient solution.

Additional Context: Enabling registry merging capabilities in Hauler would significantly reduce data transfer overheads and simplify the process of updating containerized applications in air-gapped environments. This enhancement would improve the usability and efficiency of Hauler for managing container registries in isolated or bandwidth-constrained environments.

zackbradys commented 3 months ago

hey @PeeBee66, thanks for taking the time to submit this RFE. we have had the idea of implementing and improving deltas in hauler for a while now, but it is a significant level of effort, so it has not been worked on yet. we will keep this RFE open until we are able to provide additional information and an implementation timeline!

dweomer commented 3 months ago

As things stand, merges are achievable, provided one can tolerate that tags follow a "last one in wins" regime (unless your target registry is write-once per tag, in which case you've got a potential data problem).
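
To make the "last one in wins" behavior concrete, here is a minimal sketch of merging two tag-to-digest maps. It is an illustration only, not Hauler's code: the function name `merge_tags` and the tag and digest strings are hypothetical placeholders.

```python
def merge_tags(existing, incoming):
    """Merge two tag -> digest maps; incoming entries overwrite existing ones.

    Returns the merged map plus a record of every tag whose digest changed,
    so a caller can flag overwrites (relevant for write-once registries).
    """
    merged = dict(existing)
    overwritten = {}
    for tag, digest in incoming.items():
        old = merged.get(tag)
        if old is not None and old != digest:
            # Same tag, different content: the incoming digest wins.
            overwritten[tag] = (old, digest)
        merged[tag] = digest
    return merged, overwritten


# Hypothetical placeholder data, not real digests.
existing = {"app:1.0": "sha256:aaa", "redis:7": "sha256:bbb"}
incoming = {"app:1.1": "sha256:ccc", "redis:7": "sha256:ddd"}
merged, overwritten = merge_tags(existing, incoming)
```

Here `redis:7` ends up pointing at the incoming digest, and `overwritten` records the old and new values. Against a registry that is write-once per tag, those are the entries that would be rejected rather than replaced.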

We have a setup where I manually test a fork of rancher/rancher, across different versions, in an "airgapped" environment (it is bridged by a NAT-ed "proxy" where I can stage content, i.e. pull/sync and then push/copy to the only accessible registry in the private network). When I have to work with a new version of Rancher, I use Hauler to sync down all containers for the target version and then copy them to a private Harbor installation. For content that already exists in Harbor, the push/copy process behaves correctly: that content is not pushed again, because each resource is first checked for on the server via the expected HTTP HEAD/GET methods. I can tell this is happening because I see messages for digests that are skipped once hauler/cosign verifies that the content is already present on the server.
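
The existence check described above can be sketched roughly as follows. This is not the hauler or cosign implementation. It only uses the blob-existence endpoint from the OCI Distribution spec (`HEAD /v2/<name>/blobs/<digest>`, which returns 200 if the blob is present and 404 if not). Authentication is left out, so against a real Harbor instance the request would likely fail with a 401, which this sketch simply re-raises. The registry host and digests are hypothetical.

```python
import urllib.error
import urllib.request


def blob_url(registry, repository, digest):
    """Build the OCI Distribution blob URL for a digest."""
    return f"https://{registry}/v2/{repository}/blobs/{digest}"


def blob_exists(registry, repository, digest):
    """HEAD the blob; True on 200, False on 404, re-raise anything else.

    Assumption: no auth is required (real registries usually need a token).
    """
    req = urllib.request.Request(blob_url(registry, repository, digest), method="HEAD")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def plan_push(digests, exists):
    """Split digests into those to push and those already on the server."""
    to_push, skipped = [], []
    for digest in digests:
        (skipped if exists(digest) else to_push).append(digest)
    return to_push, skipped


# Offline demo with a stand-in for the server-side check.
already_there = {"sha256:aaa"}
to_push, skipped = plan_push(["sha256:aaa", "sha256:bbb"], lambda d: d in already_there)
```

In real use the `exists` argument would wrap `blob_exists` for a given registry and repository. The demo passes a lambda so the sketch runs without network access.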


Re-reading your proposal, it seems as if what you are really asking for is a solution to avoid shipping any content over the diode that already exists at target location(s). As in, I think we are talking deltas here, yeah? With "deltas" encapsulating the concept of differences from a known starting point.
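
As a rough sketch of "differences from a known starting point": if the sending side keeps an inventory of the digests already delivered to the target (an assumption, since a one-way diode cannot be queried), the delta is a set difference. The function name `compute_delta` and the digest strings are hypothetical placeholders, and nothing here reflects an existing Hauler feature.

```python
def compute_delta(known_at_target, wanted):
    """Return the digests in `wanted` that are missing from `known_at_target`.

    The result is sorted so the output is deterministic.
    """
    return sorted(set(wanted) - set(known_at_target))


# Hypothetical inventories: v1.1 reuses the core containers from v1.0.
v1_0 = {"sha256:app-1.0", "sha256:redis", "sha256:elk", "sha256:rabbitmq"}
v1_1 = {"sha256:app-1.1", "sha256:redis", "sha256:elk", "sha256:rabbitmq"}
delta = compute_delta(v1_0, v1_1)
```

Only the new application digest lands in `delta`. The unchanged core containers would not need to cross the diode again.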