CDCgov / prime-reportstream

ReportStream is a public intermediary tool for delivery of data between different parts of the healthcare ecosystem.
https://reportstream.cdc.gov
Creative Commons Zero v1.0 Universal

Importing JosiahSiegel GHA: checksum-validate-action@v1.5 #16157

Open emvaldes opened 1 month ago

emvaldes commented 1 month ago

Profile: JosiahSiegel
Objective: Determine whether a test string's checksum is valid or invalid.

  1. Generate a checksum from either a string or shell command (use command substitution: $()).
  2. Validate if checksum is identical to input (even across multiple jobs), using a key to link the validation attempt with the correct generated checksum.
    a. Validation is possible across jobs since the checksum is uploaded as a workflow artifact.
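As a rough illustration of step 1, the shell sketch below hashes a literal string and the output of a command substitution with sha256sum. The helper name checksum_of is hypothetical and not taken from the action. Whether the action includes a trailing newline in the hashed bytes is also an assumption; this sketch does not.

```shell
#!/usr/bin/env bash
# Sketch only: checksum of a literal string vs. checksum of a command's output.
# checksum_of is a hypothetical helper, not part of the action.
checksum_of() {
  # printf '%s' avoids appending a trailing newline to the hashed bytes.
  printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# Input given as a plain string.
string_sum="$(checksum_of "hello world")"

# Input produced by a command, passed in via command substitution: $().
command_sum="$(checksum_of "$(printf 'hello %s' world)")"

echo "string:  $string_sum"
echo "command: $command_sum"
```

Both values are identical here because the command prints exactly the same bytes as the literal string.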

Target: checksum-validate-action@v1.5 : ebdf8c1
Latest: checksum-validate-action (806ce2fa215d520071c6d4faf8d2588a65e23749)

Note: Upstream development has continued past v1.5; those later changes are not referenced or used in this project.

The checksum-validate-action is a GitHub Action designed to generate and validate checksums from strings or command outputs within your workflows. This functionality is particularly useful for ensuring data integrity and consistency across different stages of your pipeline.

Key Features:

Inputs:

Outputs:

Technical Evaluation:

The action is implemented as a composite action, executing a series of shell commands to perform checksum operations. The workflow includes the following steps:

  1. Checksum Generation: Utilizes the sha256sum command to generate a SHA-256 checksum from the provided input.
  2. Artifact Management:
    • Upload: If not in validation mode, the generated checksum is saved to a file and uploaded as a workflow artifact.
    • Download: In validation mode, the action downloads the previously uploaded checksum artifact for comparison.
  3. Validation: Compares the newly generated checksum with the downloaded artifact to determine consistency.
  4. Output and Failure Handling: Sets the valid output based on the comparison result and optionally fails the step if validation fails and fail-invalid is set to true.
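The steps above can be sketched in plain shell. This is only an approximation under stated assumptions: a local directory stands in for the workflow artifact upload and download, and the function names (checksum_of, checksum_file, generate_checksum, validate_checksum) are hypothetical rather than taken from the action's source.

```shell
#!/usr/bin/env bash
# Sketch of the generate / store / validate flow.
# Assumption: a local directory replaces workflow artifacts.
STORE_DIR="${STORE_DIR:-$(mktemp -d)}"

# Step 1: SHA-256 checksum of the input (hex digest only).
checksum_of() {
  printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# Map a key to a file path; spaces become underscores so "test string" works.
checksum_file() {
  printf '%s/%s.sha256' "$STORE_DIR" "$(printf '%s' "$1" | tr ' ' '_')"
}

# Step 2 (upload side): save the checksum under its key.
generate_checksum() {
  local key="$1" input="$2"
  checksum_of "$input" > "$(checksum_file "$key")"
}

# Steps 2 (download side), 3 and 4: load the stored checksum, compare it
# with a fresh one, report valid=true|false, optionally fail.
validate_checksum() {
  local key="$1" input="$2" fail_invalid="${3:-false}"
  local expected actual valid=false
  expected="$(cat "$(checksum_file "$key")" 2>/dev/null || true)"
  actual="$(checksum_of "$input")"
  if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
    valid=true
  fi
  # Print the result and, when running inside GitHub Actions, also
  # append it to the step output file.
  echo "valid=$valid" | tee -a "${GITHUB_OUTPUT:-/dev/null}"
  if [ "$valid" = false ] && [ "$fail_invalid" = true ]; then
    return 1
  fi
  return 0
}

generate_checksum "test string" "hello world"
validate_checksum "test string" "hello world" true    # prints valid=true
validate_checksum "test string" "hello w0rld" false   # prints valid=false
```

The key only selects which stored checksum to compare against, which is why the same key must be used in both the generate and validate steps.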

Usage Example:

jobs:
  generate-checksums:
    name: Generate checksum
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4.1.1
      - name: Generate checksum of string
        uses: JosiahSiegel/checksum-validate-action@v1
        with:
          key: test string
          input: hello world
      - name: Generate checksum of command output
        uses: JosiahSiegel/checksum-validate-action@v1
        with:
          key: test command
          input: $(cat action.yml)

  validate-checksums:
    name: Validate checksum
    needs: generate-checksums
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4.1.1
      - name: Validate checksum of valid string
        id: valid-string
        uses: JosiahSiegel/checksum-validate-action@v1
        with:
          key: test string
          validate: true
          fail-invalid: true
          input: hello world
      - name: Validate checksum of valid command output
        id: valid-command
        uses: JosiahSiegel/checksum-validate-action@v1
        with:
          key: test command
          validate: true
          fail-invalid: true
          input: $(cat action.yml)
      - name: Get outputs
        run: |
          echo ${{ steps.valid-string.outputs.valid }}
          echo ${{ steps.valid-command.outputs.valid }}

Relevance to Your Pipeline:

If your pipeline needs to verify data integrity, for example by ensuring that files or outputs remain unchanged across stages or jobs, this action provides a straightforward way to implement such checks and to detect unintended modifications. If the pipeline does not require such checks, or equivalent validations already exist, the action is non-essential.

Conclusion:

The checksum-validate-action offers a practical solution for generating and validating checksums within GitHub workflows, enhancing data integrity and consistency. Its utility depends on your pipeline's specific requirements for data verification. Assessing your current processes for ensuring data integrity will help determine the action's relevance to your workflows.

emvaldes commented 1 month ago

This external repo has now been added to the file structure at .github/actions/checksum-validate-action in the importing-gha branch.

emvaldes commented 1 month ago

$GITHUB_STEP_SUMMARY
$GITHUB_OUTPUT
env.sha
github.sha
inputs.input
inputs.key
matrix.os
steps.input_sha.outputs.sha
steps.valid-command.outputs.valid
steps.valid-string.outputs.valid
steps.validate_checksum.outputs.valid

emvaldes commented 1 month ago

This GitHub Action (targeted for import as a remote/external action) is no longer under consideration until we can further evaluate whether it is worth the effort to import at a later stage.

Warning: I have placed it into the "IceBox" stage as it is out of scope for now.