antonbabenko / pre-commit-terraform

pre-commit git hooks to take care of Terraform configurations 🇺🇦
MIT License
3.23k stars 541 forks source link

Unclear why need to apk install tar in github action #730

Open togakangaroo opened 3 weeks ago

togakangaroo commented 3 weeks ago

In the github action example there is the following somewhat quixotic line

apk --no-cache add tar

From what I can tell, without this, the post cache pre-commit step fails as it is trying to run

/bin/tar --posix -cf cache.tgz --exclude cache.tgz -P -C /__w/PaaPy-TF-ecr/PaaPy-TF-ecr --files-from manifest.txt -z
/bin/tar: unrecognized option: posix
BusyBox v1.3[5](https://github.com/.../actions/runs/11693148026/job/32564059333#step:10:5).0 (2022-11-19 10:13:10 UTC) multi-call binary.

which is happening because the version of tar included with BusyBox doesn't support the --posix option and therefore it pulls in the GNU one. That's confusing, but such is Linux.

What I don't get is why this needs to be done externally. As this seems to be necessary, couldn't that line just be moved into the Dockerfile and make things simpler for everyone using this? Is there some sort of downside I'm not seeing?

yermulnik commented 3 weeks ago

@MaxymVlasov probably knows for sure, while my guess is to reduce Dockerfile (and consequently container image size) by unneeded dependency when running outside GH Actions (e.g. locally).

MaxymVlasov commented 2 weeks ago

First of all, I think it requires to dockerize the full step, not just first part of it:

   - name: fix tar dependency in alpine container image
        run: |
          apk --no-cache add tar
          # check python modules installed versions
          python -m pip freeze --local

make things simpler for everyone using this

AFAIK, you're a second user that uses it in this way (or at least second one who complains about it :)

Regarding size. tar tooks just 0.6MB, python command add extra 9MB, so it's not a big deal from size perspective

REPOSITORY         vv                         TAG                   SIZE
python                                       3.12.0-alpine3.17      51.1MB
python-with-tar                              latest                 60.7MB

ghcr.io/antonbabenko/pre-commit-terraform    v1.96.2                1.22GB
our-image-with-tar                           latest                 1.22GB

Their installation in GH should be less than 5s, but it still time which can b avoided.

 => [2/3] RUN apk --no-cache add tar                       3.8s
 => [3/3] RUN python -m pip freeze --local                 1.3s

What more important in all of these - its security reasons. Each deps potentially provides bunch of CVEs, so it's good to have as least as possible of them

At the same time, I can't imagine how it can be a security issue for local installations, and looks like vital for CI.

Feel free to send PR which will change that logic

togakangaroo commented 2 weeks ago

@MaxymVlasov wouldn't everyone who uses this in github actions with cache as recommended run into the issue? I assume everyone is just copy-pasting the "example workflow" in the docs into their repos without asking questions.

The issue about CVEs is real, but tar is already in the container, this just switches out the busybox implementation for the gnu one does it not? So you might get different CVEs but they would also replace other ones. Also this is tar we're talking about, the last cve was >5 years ago and there's two total since it started being tracked. And even then, these don't seem like the sort of CVE that would be exploitable in this scenario. For the one 5 years ago for example someone would have had to craft malformed extended headers - but in this case the tar file is being created by the caching step, it's not user-supplied.

On another note, isn't the python -m pip freeze --local bit just some debugging info that's printed to stdout? I'm not sure why it is in the example at all. I've removed it in our version and its just fine.

MaxymVlasov commented 2 weeks ago

Actually, I never tried to use it this way, as my repos have much more than just terraform checks (or have anything except terraform, like https://github.com/antonbabenko/pre-commit-terraform/blob/master/.github/workflows/pre-commit.yaml itself), plus not all hooks needed, plus I want to be able to track and update version on my own by Renovate.

But maybe it is worth to spend some time to build image on top of pre-commit-terraform and use it in CI, I will figure it out when will have enough time for that and if it will make sense for bussiness.

Anyway, that's how I use it currently:

name: Common issues check

on: [pull_request]

env:
  # Prevent GH API rate-limit issue
  GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}

jobs:
  pre-commit:
    runs-on: large
    container: python:3.13-slim@sha256:751d8bece269ba9e672b3f2226050e7e6fb3f3da3408b5dcb5d415a054fcb061
    steps:
      - name: Install container pre-requirements
        run: |
          apt update
          apt install -y \
              git \
              curl \
              unzip \
              jq \
              shellcheck \
              nodejs # Needed for Terraform installation
          curl -L https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 > /usr/bin/yq &&\
            chmod +x /usr/bin/yq
      - name: Checkout
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          ref: ${{ github.base_ref }}

      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
      - run: |
          git config --global --add safe.directory /__w/infrastructure/infrastructure
          git fetch --no-tags --prune --depth=1 origin +refs/heads/*:refs/remotes/origin/*

      - name: Get changed files
        id: file_changes
        run: |
          export DIFF=$(git diff --name-only origin/${{ github.base_ref }} ${{ github.sha }})
          echo "Diff between ${{ github.base_ref }} and ${{ github.sha }}"
          echo "files=$( echo "$DIFF" | xargs echo )" >> $GITHUB_OUTPUT

      - name: TFLint cache plugin dir
        uses: actions/cache@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2
        with:
          path: ~/.tflint.d/plugins
          key: ubuntu-latest-tflint-${{ hashFiles('.tflint.hcl') }}

      - name: Setup TFLint
        uses: terraform-linters/setup-tflint@19a52fbac37dacb22a09518e4ef6ee234f2d4987 # v4.0.0
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}

      - name: Init TFLint
        run: tflint --init

      - name: Setup Terraform docs
        env:
        # renovate: datasource=github-releases depName=terraform-docs lookupName=terraform-docs/terraform-docs
          TERRAFORM_DOCS_VERSION: 0.19.0
        run: |
          curl -L https://github.com/terraform-docs/terraform-docs/releases/download/v${TERRAFORM_DOCS_VERSION}/terraform-docs-v${TERRAFORM_DOCS_VERSION}-linux-amd64.tar.gz > terraform-docs.tgz \
            && tar -xzf terraform-docs.tgz terraform-docs && rm terraform-docs.tgz \
            && chmod +x terraform-docs && mv terraform-docs /usr/bin/

      - name: Setup tfupdate
        env:
        # renovate: datasource=github-releases depName=tfupdate lookupName=minamijoyo/tfupdate
          TFUPDATE_VERSION: 0.8.5
        run: |
          curl -L https://github.com/minamijoyo/tfupdate/releases/download/v${TFUPDATE_VERSION}/tfupdate_${TFUPDATE_VERSION}_linux_amd64.tar.gz > tfupdate.tgz \
            && tar -xzf tfupdate.tgz tfupdate && rm tfupdate.tgz \
            && chmod +x tfupdate && mv tfupdate /usr/bin/

      - name: Install shfmt
        env:
        # renovate: datasource=github-releases depName=shfmt lookupName=mvdan/sh
          SHFMT_VERSION: 3.10.0
        run: |
          curl -L https://github.com/mvdan/sh/releases/download/v${SHFMT_VERSION}/shfmt_v${SHFMT_VERSION}_linux_amd64 > shfmt \
          && chmod +x shfmt && mv shfmt /usr/bin/

      - uses: hashicorp/setup-terraform@b9cd54a3c349d3f38e8881555d616ced269862dd # v3.1.2
        with:
          terraform_version: 1.5.7

      - name: Setup OPA
        uses: open-policy-agent/setup-opa@34a30e8a924d1b03ce2cf7abe97250bbb1f332b5 # v2.2.0
        with:
          version: latest

    # Need to success pre-commit fix push
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          fetch-depth: 0
          ref: ${{ github.event.pull_request.head.ref }}
        # Need to trigger pre-commit workflow on autofix commit
        # Guide: https://web.archive.org/web/20210731173012/https://github.community/t/required-check-is-expected-after-automated-push/187545/
          ssh-key: ${{ secrets.GHA_AUTOFIX_COMMIT_KEY }}

      - name: Execute pre-commit
        uses: pre-commit/action@2c7b3805fd2a0fd8c1884dcaebf91fc102a13ecd # v3.0.1
        timeout-minutes: 45
        env:
          SKIP: no-commit-to-branch,manual-apply-warning
        with:
          extra_args: --color=always --show-diff-on-failure --files ${{ steps.file_changes.outputs.files }}

      # Need to trigger pre-commit workflow on autofix commit.
      - name: Push fixes
        if: failure()
        uses: EndBug/add-and-commit@a94899bca583c204427a224a7af87c02f9b325d5 # v9.1.4
        with:
        # Determines the way the action fills missing author name and email. Three options are available:
        # - github_actor -> UserName <UserName@users.noreply.github.com>
        # - user_info -> Your Display Name <your-actual@email.com>
        # - github_actions -> github-actions <email associated with the github logo>
        # Default: github_actor
          default_author: github_actor
        # The message for the commit.
        # Default: 'Commit from GitHub Actions (name of the workflow)'
          message: '[pre-commit] Autofix violations'

Regarding

On another note, isn't the python -m pip freeze --local bit just some debugging info that's printed to stdout? I'm not sure why it is in the example at all. I've removed it in our version and its just fine.

If it works without it, let's skip it addition then