getporter / porter

Porter enables you to package your application artifact, client tools, configuration and deployment logic together as an installer that you can distribute, and install with a single command.
https://porter.sh
Apache License 2.0
1.22k stars 202 forks source link

Porter publish rebuild the image instead of just pushing it #2391

Closed LvffY closed 1 year ago

LvffY commented 1 year ago

Describe the bug

My porter bundle is correctly building. But for the build to pass correctly, I need to pass some --build-arg (in particular a GIT_TOKEN).

During my buid everything runs fine. Then during the publish, for no reason, the porter publish command tries to rebuid my image. Because I can't have --build-arg at this step, my build fails.

I'm thinking of a bug here, because just set some retries on my task make it pass and runs correctly (i.e just push the image).

To Reproduce

Steps to reproduce the behavior:

  1. Run
    curl -L https://cdn.porter.sh/latest/install-linux.sh | bash
    export PATH=~/.porter
    porter mixins install --url https://github.com/LvffY/ansible-mixin/releases/download ansible
    porter mixins install --feed-url https://mchorfa.github.io/porter-helm3/atom.xml helm3
    porter mixins install kubernetes
    porter build --build-arg GIT_TOKEN="This is a secret for build time !" --build-arg BUNDLE_USER=porter
  2. Use this porter.yaml & Dockerfile.tmpl
Dockerfile.tmpl

```Dockerfile FROM ubuntu:20.04 ## Porter init steps ## See https://porter.sh/bundle/custom-dockerfile/#porter_init # PORTER_INIT ## Git token to use to clone ARG GIT_TOKEN ## Mozilla SOPS version ARG SOPS_VERSION=3.7.3 ## Git script to generate password non interactively ## See https://git-scm.com/docs/gitcredentials#_requesting_credentials ENV GIT_ASKPASS=$BUNDLE_DIR/git_askpass.sh \ ## Ensure path contains python executable for the porter user PATH=$PATH:/home/${BUNDLE_USER}/.local/bin \ ## Avoid noisy depreciation warnings in python libraries not under our responsabilities PYTHONWARNINGS="ignore" \ ## Ensure Ansible will find installed collections ANSIBLE_COLLECTIONS_PATH="$BUNDLE_DIR/collections" RUN apt-get update && \ ## Install all necessary tools apt-get install -y --no-install-recommends wget gnupg2 python3 python3-dev python3-pip python3-venv git libpq-dev build-essential apt-transport-https ca-certificates && \ ## Install SOPS wget https://github.com/mozilla/sops/releases/download/v$SOPS_VERSION/sops_${SOPS_VERSION}_amd64.deb && \ dpkg -i sops_${SOPS_VERSION}_amd64.deb && \ ## Ensure linux packages are up-to-date \ apt-get dist-upgrade -y && \ ## Because helm mixins install the executable under /usr/local/bin/helm3, we create a symbolic link to avoid too many configurations in our ansible playbooks. ## During the installation process, this link is broken. ## When helm mixins is installed this link will work. ln -sf /usr/local/bin/helm3 /usr/local/bin/helm && \ ## Clean up image apt-get autoremove && \ apt-get autoclean && \ rm -rf /var/lib/apt/lists/* sops_${SOPS_VERSION}_amd64.deb # Use the BUNDLE_DIR build argument to copy files into the bundle COPY . $BUNDLE_DIR ## Clone playbook repositories RUN echo "${GIT_TOKEN:?GIT_TOKEN is not set, please set it.}" ## Porter mixins install ## See https://porter.sh/bundle/custom-dockerfile/#porter_mixins # PORTER_MIXINS ```

porter.yaml

```yaml name: porter/reproduce-porter-error version: 0.0.9 description: "Bundle to deploy our demo apps" registry: placeholder dockerfile: Dockerfile.tmpl schemaVersion: 1.0.0-alpha.1 mixins: - exec - ansible: otherPipDependencies: - virtualenv - psycopg2 - openshift ## No explicit use of these mixins but we use this config to don't have to know how to install kubectl & helm - kubernetes - helm3: clientVersion: v3.10.0 parameters: - name: git-token description: Git token to use to clone ArgoCD repository type: string env: GIT_TOKEN sensitive: true install: - exec: description: "Description of the command" command: echo # The command to run, must be on the PATH arguments: # arguments to pass to the command - "Hello install !" upgrade: - exec: description: "Description of the command" command: echo # The command to run, must be on the PATH arguments: # arguments to pass to the command - "Hello upgrade !" uninstall: - exec: description: "Description of the command" command: echo # The command to run, must be on the PATH arguments: # arguments to pass to the command - "Hello uninstall !" ```

  1. Run this porter command ...
porter publish --registry="my_acr.azurecr.io" --force ## Yes I'm using an ACR, may be the error comes from here ?
  1. See error

Expected behavior

The porter publish shoud just push my image and never rebuild it.

Porter Command and Output

  1. porter build command (went well)

porter-build.txt

  1. porter publish (went wrong then push my previously built image)

porter-publish.txt

Version

Copy the output of porter version below

porter v1.0.0 (89ad7ed6)

Other hypothesis

My image is kind of big (2GB+). May be the issue could some kind of caches that takes a long time to update, so the publish tries to rebuild an unfound image (in its cache) then finally found it after some retries ?

Azure pipelines

I don't think this a workstation issue, because I'm runnning my builds in Azure pipeline in some Microsoft hosted agents (i.e kind of very general VMs and that could be considered clean every time).

For information, here are the relevant steps on my pipeline :

azure-pipelines.yaml

```yaml stages: - stage: snapshot displayName: Build snapshot porter app pool: vmImage: ubuntu-latest jobs: - job: snapshot displayName: Build snapshot porter app steps: - checkout: self - task: Docker@2 displayName: Login to ACR inputs: command: login containerRegistry: my_acr - task: Bash@3 displayName: Install porter inputs: targetType: inline script: | set -e curl -L https://cdn.porter.sh/latest/install-linux.sh | bash echo "##vso[task.setvariable variable=PATH]${PATH}:~/.porter" - task: Bash@3 displayName: Porter build inputs: targetType: inline script: | set -e porter mixins install --url https://github.com/LvffY/ansible-mixin/releases/download ansible porter mixins install --feed-url https://mchorfa.github.io/porter-helm3/atom.xml helm3 porter mixins install kubernetes porter build --build-arg GIT_TOKEN=$GIT_TOKEN --build-arg BUNDLE_USER=porter env: GIT_TOKEN: "This is a secret for build time !" - task: Bash@3 displayName: Porter publish retryCountOnTaskFailure: 10 inputs: targetType: inline script: | set -e porter publish --registry="my_acr.azurecr.io" --force ```

carolynvs commented 1 year ago

Porter detects if the bundle definition has changed since the last build and we do a JIT build before publish. Seems like we have at least one bug around --build-arg that is causing it to always detect that it's out of date, but I suspect that there are more ways to trigger a build when we don't truly need it.

Unpredictably rebuilding (incorrectly to boot because it's missing the build args), isn't acceptable, sorry about the bug...

Couple thoughts:

  1. We could introduce a setting to turn off the rebuild check. But honestly if we have such little faith that it's correct, I'd rather remove the behavior (or make it opt-in) though I'm unsure about how to go about changing behavior after 1.0 without causing further problems.
  2. If we keep the behavior, the publish command needs to expose all the same flags as build which... just makes me want to rip out the feature even more. 😂
  3. Rebuild isn't terribly smart, as it only detects a small subset of changes that indicate the build is stale. We have an open request to support detecting changes to any files included in the docker context (#1818).

@squillace @vdice Do you have any thoughts on this?

LvffY commented 1 year ago

Unpredictably rebuilding (incorrectly to boot because it's missing the build args), isn't acceptable, sorry about the bug...

@carolynvs no problem. I just had to search quite a long time to isolate the bug and found that a simple retry make it pass x).

TBH I'm more fan of publish to expose the same as build. In my mind, in my pipelines at least, I would just have to run porter publish to build & publish my bundle.

carolynvs commented 1 year ago

Here's what I think would work to make sure autobuild doesn't have false positives (rebuilds when nothing has changed) and doesn't ever helpfully rebuild with the wrong settings:

Opt-out

See #2531 for the tracking issue Add a configuration setting that disables the autobuild entirely, letting people opt out when it isn't the desired behavior. The setting must default to false and be worded such that the grammar works out. So for example, we can't do --autobuild and have it default to true, but something like --disable-autobuild would be fine. The setting will be a flag on porter publish and other commands, and also available as an environment variable and configuration setting.

Remove false positive

Fix the autobuild check to more accurately detect when the bundle has changed so that it doesn't trigger a build when nothing is different (#2503). Right now something is causing porter to think a rebuild is necessary and the users disagree. We don't have to perfect rebuild in all cases, but we shouldn't rebuild when nothing is different.

Mimic Docker better

This is not strictly needed but would be a welcome improvement to autobuild.

Improve the autobuild check to be consistent with docker build, i.e. take into account all the files in the docker context that could have changed (and trigger an autobuild) instead of only porter.yaml. That way modifying helpers.sh or a terraform module would also be detected and a build triggered.

Right now we store a hash of porter.yaml in the generated bundle.json and use that to tell when an autobuild is required We could store additional information about the build context (e.g. last build time, hash of files, were custom build flags used) in the bundle directory and exclude them from the docker context so that we can use that to determine if we should build without adding unnecessary info to the bundle.

Disable autobuild when custom build flags are used

When the user built a bundle with custom flags, do not autobuild from other commands such as explain, install, inspect, etc. Instead porter should halt and instruct the user to repeat the build command with the desired custom flags specified.

Alternatives that were considered:

I'll let this suggestion sit for a bit to gather feedback and then split these up into separate issues.

LvffY commented 1 year ago

I agree with everything. Actually, just remove false positive would be enough to me, but the mimic docker better is a great idea.

tamirkamara commented 1 year ago

I think this is a good plan. I imagine that we'll turn off autobuild for the near future.

carolynvs commented 1 year ago

I have created a new issue that hopefully explains the opt-out config flag in enough detail that if anyone is interested in contributing, it should be ready for someone to implement. If not, please feel free to ask questions on the issue or open a draft PR and ask there.

https://github.com/getporter/porter/issues/2531

schristoff commented 1 year ago

Hey! It looks like this issue was fixed by #2733 - I'm going to close this issue for now. Thank you for your contributions!