terraform-compliance / cli

a lightweight, security-focused BDD test framework against terraform.
https://terraform-compliance.com
MIT License

- Silent output with added Feature Description and feature path, for context and failure troubleshooting #388

Open mdesmarest opened 4 years ago

mdesmarest commented 4 years ago

Feature Request

Feature description: We would greatly benefit from a slightly more robust output than what the -S parameter allows for, an output like:

Converting terraform plan file.
Failure: tcp/22 port is defined within 0.0.0.0/0 network in aws_security_group_rule.ssh_yolo.
Failure: tcp/3389 port is defined within 0.0.0.0/0 network in aws_security_group_rule.RDP_YOLO.
Failure: aws_instance.missing_rootblock (aws_instance) does not have root_block_device property.
Failure: Resource aws_instance.nonencrypted_Root_ebs_too does not have its encrypted property enabled (its encrypted=[]).
Failure: Resource aws_instance.missing_rootblock does not have its encrypted property enabled (its encrypted=[]).
Failure: Resource aws_instance.nonencrypted_Root_ebs_too does not have its encrypted property enabled (its encrypted=[]).

2 features (0 passed, 2 failed)
5 scenarios (0 passed, 5 failed)
7 steps (2 passed, 5 failed)

and the full output:

Feature: Ensure ingress security group rules do not contain unfiltered access on vulnerable ports # /Users/mdesmarest/repos/terraform-compliance-features/features/security_group_rule_ingress_all_0s.feature

Background: Check for presence of security_group_rule(s) and type = ingress
    Given I have aws_security_group_rule defined
    When its type is ingress

Scenario: Check tcp port 22 (ssh) for unfiltered ingress access
    Given I have aws_security_group_rule defined
    When its type is ingress
    Failure: tcp/22 port is defined within 0.0.0.0/0 network in aws_security_group_rule.ssh_yolo.
    Then it must not have tcp protocol and port 22 for 0.0.0.0/0
      Failure: 

Scenario: Check tcp port 3389 (rdp) for unfiltered ingress access
    Given I have aws_security_group_rule defined
    When its type is ingress
    Failure: tcp/3389 port is defined within 0.0.0.0/0 network in aws_security_group_rule.RDP_YOLO.
    Then it must not have tcp protocol and port 3389 for 0.0.0.0/0
      Failure: 

Feature: Ensure all aws instances aside from bastion hosts, root/ebs block volumes are configured and encrypted at rest upon launch # /Users/mdesmarest/repos/terraform-compliance-features/features/unencrypted_block_device.feature

Background: Check to make sure EC2 instances are in the configuration and not named bastion
    Given I have aws_instance defined
    When its name is not bastion

Scenario: Ensure root_block_device is configured
    Given I have aws_instance defined
    When its name is not bastion
    Failure: aws_instance.missing_rootblock (aws_instance) does not have root_block_device property.
    Then it must have root_block_device
      Failure: 

Scenario: Encryption flag defined and set to true for all root_block_device(s) on launch
    Given I have aws_instance defined
    When its name is not bastion
    When it has root_block_device
    Failure: Resource aws_instance.nonencrypted_Root_ebs_too does not have its encrypted property enabled (its encrypted=[]).
    Then its encrypted must be enabled
      Failure: 

Scenario: Encryption flag defined and set to true for all ebs_block_device(s) on launch
    Given I have aws_instance defined
    When its name is not bastion
    When it has ebs_block_device
    Failure: Resource aws_instance.missing_rootblock does not have its encrypted property enabled (its encrypted=[]).
    Failure: Resource aws_instance.nonencrypted_Root_ebs_too does not have its encrypted property enabled (its encrypted=[]).
    Then its encrypted must be enabled
      Failure: 

2 features (0 passed, 2 failed)
5 scenarios (0 passed, 5 failed)
7 steps (2 passed, 5 failed)
Run 1602783865 finished within a moment

Suggested solution description (if you have any): We are currently running terraform-compliance with Atlantis and piggybacking on the Atlantis gitbot output. We would like to keep output to a minimum, but a bit more verbose than just the failure lines.

We are using this within the PR and Atlantis-check terraform plan output to guide engineers in correcting compliance errors. We would like to be able to point them, via a link, to the specific feature they are failing, and also guide them to fixes.

The ideal option would be to add another parameter, a bit less spartan than -S, with output like:

Feature: Ensure ingress security group rules do not contain unfiltered access on vulnerable ports # /Users/mdesmarest/repos/terraform-compliance-features/features/security_group_rule_ingress_all_0s.feature

Failure: tcp/22 port is defined within 0.0.0.0/0 network in aws_security_group_rule.ssh_yolo.
Failure: tcp/3389 port is defined within 0.0.0.0/0 network in aws_security_group_rule.RDP_YOLO.

Possibly without the feature pass/fail box below:

2 features (0 passed, 2 failed)
5 scenarios (0 passed, 5 failed)
7 steps (2 passed, 5 failed)

Alternatively, instead of just the failing message, a true "quiet mode" that only prints the features and scenarios that failed, along with what flagged them (as -S does), would be totally acceptable as well.

eerkunt commented 4 years ago

You are right, it's problematic.

I will have a look at this one.

Kudbettin commented 4 years ago

Hi @mdesmarest

I noticed this issue after opening #398, which will omit passing/skipping scenarios from the output but still keep the following format for failing scenarios:

Feature: Feature for issue 343  # /path/to/features/

    Scenario: Failing Scenario 1
        Given I have resource that supports tags defined
        Then it must have tags
        And its value must not be null
          Failure: tags property in aws_s3_bucket.c is considered to be Null. It is set to {}.

    Scenario: Failing Scenario 2
        Given I have resource that supports tags defined
        Then it must have tags
        And its value must be bad
          Failure: tags property in aws_s3_bucket.a resource does not match with ^bad$ case insensitive regex. It is set to none.
          Failure: tags property in aws_s3_bucket.a resource does not match with ^bad$ case insensitive regex. It is set to ok.
          Failure: tags property in aws_s3_bucket.b resource does not match with ^bad$ case insensitive regex. It is set to .
          Failure: tags property in aws_s3_bucket.b resource does not match with ^bad$ case insensitive regex. It is set to ok.

2 features (0 passed, 2 skipped)
8 scenarios (2 passed, 4 failed, 2 skipped)
20 steps (14 passed, 4 failed, 2 skipped)
Run 1603841902 finished within a moment

This might still be a bit too verbose for your use case, as it's going to be printing the entire scenario if it fails.

Would something along the lines of

Feature: Feature for issue 343  # /path/to/features/
    Scenario: Failing Scenario 1
          Failure: tags property in aws_s3_bucket.c is considered to be Null. It is set to {}.

    Scenario: Failing Scenario 2
          Failure: tags property in aws_s3_bucket.a resource does not match with ^bad$ case insensitive regex. It is set to none.
          Failure:  tags property in aws_s3_bucket.a resource does not match with ^bad$ case insensitive regex. It is set to ok.
          Failure:  tags property in aws_s3_bucket.b resource does not match with ^bad$ case insensitive regex. It is set to .
          Failure:  tags property in aws_s3_bucket.b resource does not match with ^bad$ case insensitive regex. It is set to ok.

2 features (0 passed, 2 skipped)
8 scenarios (2 passed, 4 failed, 2 skipped)
20 steps (14 passed, 4 failed, 2 skipped)
Run 1603841902 finished within a moment

be more useful?

Alternatives could be:

Feature: Feature for issue 343  # /path/to/features/
    Scenario: Failing Scenario 1
    Scenario: Failing Scenario 2

2 features (0 passed, 2 skipped)
8 scenarios (2 passed, 4 failed, 2 skipped)
20 steps (14 passed, 4 failed, 2 skipped)
Run 1603841902 finished within a moment

Or

2 features (0 passed, 2 skipped)
8 scenarios (2 passed, 4 failed, 2 skipped)
20 steps (14 passed, 4 failed, 2 skipped)
Run 1603841902 finished within a moment

Customizing the -S flag could also be an option.

mdesmarest commented 4 years ago

@Kudbettin Thanks so much for the reply,

This type of output is a huge improvement: showing the failing feature and its description (which lets us guide users to the correct compliance feature), the scenario it failed on, and the failures for each scenario, while withholding the steps.

Our preferred output option among the choices (it would be perfect to filter out the feature path and the features/scenarios/steps pass/fail summary at the bottom, while leaving in the verbose feature description):

Feature: Feature for issue 343  # /path/to/features/
    Scenario: Failing Scenario 1
          Failure: tags property in aws_s3_bucket.c is considered to be Null. It is set to {}.

    Scenario: Failing Scenario 2
          Failure: tags property in aws_s3_bucket.a resource does not match with ^bad$ case insensitive regex. It is set to none.
          Failure:  tags property in aws_s3_bucket.a resource does not match with ^bad$ case insensitive regex. It is set to ok.
          Failure:  tags property in aws_s3_bucket.b resource does not match with ^bad$ case insensitive regex. It is set to .
          Failure:  tags property in aws_s3_bucket.b resource does not match with ^bad$ case insensitive regex. It is set to ok.

2 features (0 passed, 2 skipped)
8 scenarios (2 passed, 4 failed, 2 skipped)
20 steps (14 passed, 4 failed, 2 skipped)
Run 1603841902 finished within a moment

The option to output failures that include just the feature they fail on, or the feature/scenario (minus the steps) plus the failure, may also prove fruitful depending on context. Having an output flag for each would be ideal. We run the terraform-compliance instructions via the Atlantis YAML and a one-liner on the plan step. Since this project is in a beta state, we have not yet decided whether to emphasize one scenario per feature (with a larger number of feature files) or several scenarios per feature (which makes the feature output a bit less clear and relies on the scenario description to be more explicit). Some of the teams we have using terraform need a bit more contextual instruction than others. None of these options are worth considering without better output filtering.

Stylistically, we are trying to find a sweet spot for output, since in our implementation we write the terraform-compliance output into the same GitHub pull request message that the terraform plan is output to from Atlantis. We have been filtering output via awk to see where the best clarity lies, but the current output requires some complex conditional filtering, since our triggers for filtering out feature and scenario lines hinge on what fails.

For us the feature path doesn't provide a clear guide and adds extra noise, since the repo is cloned via GitHub. We tend to filter based on # as the end-of-line marker, and thus lose the context of a multi-line feature description. (This is why we asked whether the whole feature path could be filtered out.)
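A minimal sketch of the kind of awk post-processing described above, assuming the plain-text output format shown in this thread (the sample input below is hand-written for illustration, not captured tool output): keep only Feature and Failure lines, and strip the trailing "# /path" feature path suffix.

```shell
# Hypothetical awk filter over terraform-compliance-style output:
# keep "Feature:" and "Failure:" lines, drop the "# /path" suffix.
# The sample input is illustrative, not real captured output.
cat <<'EOF' > /tmp/tc_output.txt
Feature: aws_s3_bucket resources must be encrypted  # /path/to/features/s3.feature

    Scenario: sse_algorithm must be set to AES256
        Given I have aws_s3_bucket defined
        Then its sse_algorithm must be AES256
          Failure: sse_algorithm property in aws_s3_bucket.a resource does not match with ^AES256$ case insensitive regex. It is set to aws:kms.

2 features (0 passed, 2 failed)
EOF

awk '
  /^Feature:/     { sub(/ *#.*$/, ""); print; next }  # drop "# path" suffix
  /^ *Failure:/   { print }                           # keep failure lines
' /tmp/tc_output.txt > /tmp/tc_filtered.txt

cat /tmp/tc_filtered.txt
```

This drops the Scenario/Given/Then steps and the pass/fail summary box entirely; keeping the Scenario line too would just need one more pattern in the awk program.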

We are at a DevOps maturity level where we are trying to guide users toward our modules for standard configurations, in this case S3 buckets. The issue is that we can't flag for not using a module, or flag resources created without one, in a way that won't also flag existing resources for changes that show up in a plan, like adding tags. We do not want to fail on a feature whose only fix is to use a module on a bucket that was already created without it; this is for new configs. Thus we use a good chunk of the feature line to guide and advise use of our module, since we can't distinguish resources being created from resources being updated.

How we write features and scenarios is directly influenced by the lines we need to output. The desire is to be as clear as possible with as few lines of output as possible. Currently we run in silent mode and try to make the failures as verbose as possible. The output is very clear to someone familiar with terraform, but not so much to a more junior dev.

For example:

This option reduces our total lines of output but provides less clarity as to which encryption option to choose, aws:kms or AES256. It may cost a user two failed passes, since they may choose the wrong algorithm and end up letting the failing plans walk them to the correct sse_algorithm option.

Feature: aws_s3_bucket resources must be encrypted Please use CORP/tf-module-s3-bucket module for s3 buckets  
        Failure: sse_algorithm property in aws_s3_bucket.sse_algorithm_awskms resource does not match with ^AES256$ case insensitive regex. It is set to aws:kms.
        Failure: aws_s3_bucket.missing_Server_side_encryption (aws_s3_bucket) does not have server_side_encryption_configuration property.

This example lets us cheat a bit and guide someone through more verbose output, with the scenario giving more context for things that are unconventional to drill down to.

Feature: aws_s3_bucket resources must be encrypted Please use CORP/tf-module-s3-bucket module for s3 buckets  
    Scenario: sse_algorithm must be set to AES256 Use CORP/tf-module-s3-bucket module for s3 buckets
        Failure: sse_algorithm property in aws_s3_bucket.sse_algorithm_awskms resource does not match with ^AES256$ case insensitive regex. It is set to aws:kms.
    Scenario: s3 bucket encryption must be enabled and sse_algorithm = "AES256"
        Failure: aws_s3_bucket.missing_Server_side_encryption (aws_s3_bucket) does not have server_side_encryption_configuration property.

This third option represents the feature/failure style in a different manner; the goal is to guide a user to the correct configuration, but we would have to create more features to account for where resources fail:

Feature: s3 bucket must have server_side_encryption_configuration Please use CORP/tf-module-s3-bucket module for s3 buckets  
        Failure: aws_s3_bucket.missing_Server_side_encryption (aws_s3_bucket) does not have server_side_encryption_configuration property.
Feature: s3 bucket encryption set to AES256 Please use CORP/tf-module-s3-bucket module for s3 buckets  
        Failure: sse_algorithm property in aws_s3_bucket.sse_algorithm_awskms resource does not match with ^AES256$ case insensitive regex. It is set to aws:kms.
Feature: s3 buckets must have encryption and sse_algorithm=AES256 Please use CORP/tf-module-s3-bucket module for s3 buckets
        Failure: aws_s3_bucket.missing_Server_side_encryption (aws_s3_bucket) does not have sse_algorithm property.

We are unconcerned with showing any context other than failures, so feature/failure and feature/scenario/failure modes alone would give us the flexibility to design our ideal deploy pattern.

We are essentially using terraform-compliance as a compliance linter rather than a BDD tool; passing all features should be almost invisible to our users.

As stated above, we are implementing a series of modules to standardize configuration, and in the case of AWS S3 buckets more than one bucket may be created, so we can't leverage a static "address" value in a when step, and when its address is "value" can't take a regex (cool if it could) to account for different module blocks changing the address field. Using referenced is quite cool but poses some difficulty going forward, and without more verbose output the current -S mode does not provide good context.

Feature: Ensure S3 buckets are created using the tf-module-s3-bucket module
    (Module Ref: https://github.com/xxx/tf-module-s3-bucket)
    (Ref: https://github.com/xxx/terraform-compliance-features/blob/master/features/s3_module_test.feature)

  Scenario: ensure all s3 buckets are created using a module
    Given I have aws_s3_bucket defined
    # This flags when the address path for an s3 bucket does not start with "module."
    # resources created without a module lead with their resource type first
    # Please see above  in feature for 2U provided s3 bucket module and use this for configuration
    Then it must have "module." referenced

The failure message on its own:

"Failure: module. is not referenced within aws_s3_bucket.all_good." is not very contextual as to why it failed, and gets even more confusing without the feature, its stated purpose, and the scenario it failed on.
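For illustration, the address-shape distinction the feature above checks (module-created resources have addresses beginning with "module.", standalone resources lead with their resource type) can be sketched as a plain filter outside terraform-compliance; the sample addresses below are made up:

```shell
# Hypothetical illustration: resources created via a module have addresses
# starting with "module."; standalone resources lead with the resource type.
# Sample addresses are invented for this sketch.
cat <<'EOF' > /tmp/addresses.txt
module.logs_bucket.aws_s3_bucket.this
module.assets_bucket.aws_s3_bucket.this
aws_s3_bucket.created_without_module
EOF

# Flag anything NOT created through a module
grep -v '^module\.' /tmp/addresses.txt > /tmp/non_module.txt
cat /tmp/non_module.txt
```

A pattern match like this is essentially what the requested regex support for `when its address is "value"` would allow inside a feature file.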

mdesmarest commented 4 years ago

@Kudbettin The new silent mode works great, thanks!

eerkunt commented 3 years ago

Can we close this issue ? 🤔