microsoft / PSRule

Validate infrastructure as code (IaC) and objects using PowerShell rules.
https://microsoft.github.io/PSRule/v2/
MIT License
390 stars 51 forks source link

Add support for processing files based on git diff #688

Closed BernieWhite closed 1 year ago

BernieWhite commented 3 years ago

When a pull request is opened it may be desirable to limit checks to files included in the PR.

This would be desirable when externally new rules have been introduced that now cause existing files already in the repository to fail.

michaeltlombardi commented 2 years ago

As a temporary workaround until this feature can be implemented, it might be useful to document filtering files by the git diff information in CI or by using the output from a prior step, like this changed-files action.

If that sounds like a good idea, I'd be happy to take a poke at a PR to add that example.

BernieWhite commented 2 years ago

@michaeltlombardi Absolutely. That would be appreciated. If you want to start a PR, I'd suggest to add a new section to: docs/creating-your-pipeline.md after configuration.

BernieWhite commented 2 years ago

Another thing to consider is how we handling complex relationships with changed files. While this feature may not resolve this entirely be itself.

For example:

With the changes files of:

modules/storage/v1/main.bicep

This file may be changed, however is excluded via input.pathIgnore. The file modules/storage/v1/.tests/main.tests.bicep is not excluded and references modules/storage/v1/main.bicep.

Really we want to process the whole module directory modules/storage/v1/ if any files are changed.

michaeltlombardi commented 2 years ago

That's a great point. What is probably best practice is to use something like the changed-files action or parsing git diff --name-status <compare_target_ref> to see file change types and process those for your specific use case.

Sorry for dropping the ball on this, I cleared the notification at some point and never remembered to circle back.

For this specific use case (caveat: I have never used bicep, I'm gesturing at correctness here because I lack context), I think something like this would work:

name: Analyze templates
on:
  push:
    branches:
    - main
  pull_request:
    branches:
    - main
jobs:
  analyze_arm:
    name: Analyze templates
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v3
      with:
        fetch-depth: 2 # Retrieve current and preceding commit
    # Check for file changes in modules/storage/v1
    - name: Analyze File Changes
      id: changed-files-modules-storage
      uses: tj-actions/changed-files@v23.2
      with:
        files: modules/storage/v1
    # Analyze Azure resources using PSRule for Azure only if any files changed
    - name: Analyze Azure template files
      uses: microsoft/ps-rule@v2.1.0
      if: steps.changed-files-modules-storage.outputs.any_changed == 'true'
      with:
        modules: 'PSRule.Rules.Azure'

Another example is a hypothetical "All commands in a module must have appropriate reference documentation" requirement. For this repo, that might look like:

name: Analyze Command Documentation
on:
  push:
    branches:
    - main
  pull_request:
    branches:
    - main
jobs:
  analyze_cmdlet_docs:
    name: Analyze Command Documentation
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v3
      with:
        fetch-depth: 2 # Retrieve current and preceding commit
    # Check for command source and docs changes
    - name: Analyze File Changes
      id: changed-files-commands
      uses: tj-actions/changed-files@v23.2
      with:
        files: |
          src/**/Commands
          docs/commands
    # Analyze documentation to see if all commands have full documentation
    # that matches their implementation details.
    - name: Analyze Azure template files
      uses: microsoft/ps-rule@v2.1.0
      if: steps.changed-files-commands.outputs.any_changed == 'true'
      with:
        modules: 'PSRule.Rules.PSCmdletDocs' # Hypothetical

I don't see a way around having to define steps for everything you want to control the changed-file behavior for. Combining these two rules would look like:

name: Analyze
on:
  push:
    branches:
    - main
  pull_request:
    branches:
    - main
jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v3
      with:
        fetch-depth: 2 # Retrieve current and preceding commit
    # Check for command source and docs changes
    - name: Analyze Command File Changes
      id: changed-files-commands
      uses: tj-actions/changed-files@v23.2
      with:
        files: |
          src/**/Commands
          docs/commands
    # Check for file changes in modules/storage/v1
    - name: Analyze Azure File Changes
      id: changed-files-modules-storage
      uses: tj-actions/changed-files@v23.2
      with:
        files: modules/storage/v1
    # Analyze documentation to see if all commands have full documentation
    # that matches their implementation details.
    - name: Analyze Azure template files
      uses: microsoft/ps-rule@v2.1.0
      if: steps.changed-files-commands.outputs.any_changed == 'true'
      with:
        modules: 'PSRule.Rules.PSCmdletDocs' # Hypothetical
    # Analyze Azure resources using PSRule for Azure only if any files changed
    - name: Analyze Azure template files
      uses: microsoft/ps-rule@v2.1.0
      if: steps.changed-files-modules-storage.outputs.any_changed == 'true'
      with:
        modules: 'PSRule.Rules.Azure' 

More likely, it would be better practice to define multiple jobs or workflows (especially in this case as command docs and the template files aren't related).

Given this though, I think I see a path towards defining "files to check for changes" in a rule definition as an optional key but it would likely require implementing the files-changed check logic in PowerShell.

BernieWhite commented 2 years ago

@michaeltlombardi Some of these might work with the PSRule.Data.RepositoryInfo object that is emitted, that can be used to check for existence of files within the repository.

https://github.com/microsoft/PSRule.Rules.MSFT.OSS/blob/0fd68783b9ff0c241374a00f8c15d5233ea88e8c/src/PSRule.Rules.MSFT.OSS/MSFT.OSS.Rule.ps1#L4-L14

BernieWhite commented 2 years ago

Cross referencing #1095

BernieWhite commented 2 years ago

@michaeltlombardi Based on your suggestion.

  1. Provide an option for PSRule to limit input files to changes files. Such as by calling git diff --diff-filter=d --no-renames HEAD^ HEAD. Ideally the filter would be configurable.
  2. Provide a method, via a PSRule convention to expand the list. Once change files are known, it should be fairly easy to include files within a path. Including documentation or bicep code.

A custom convention might look like this:

Export-PSRuleConvention 'AddModuleFiles' -Initialize {
    foreach ($inputFile in $PSRule.Repository.GetChangedFiles()) {
        if ($inputFile.Extension -eq '.bicep') {
            # Get module path
            $modulePath = $inputFile.AsFileInfo().Directory;
            while (!$modulePath.Name.StartsWith('v')) {
                $modulePath = $modulePath.Parent;
            }
            $moduleVersion = $modulePath.Name;
            $moduleName = $modulePath.Parent.Name;

            # Add whole module path to input files
            $PSRule.Input.Add($modulePath.FullName);

            # Add matching docs
            $PSRule.Input.Add("docs/modules/$moduleName-$moduleVersion/");
        }
    }
}