bridgecrewio / checkov

Prevent cloud misconfigurations and find vulnerabilities during build-time in infrastructure as code, container images and open source packages with Checkov by Bridgecrew.
https://www.checkov.io/
Apache License 2.0

Making checkov between 25% and 70% faster 🚀 #6740

Open tpvasconcelos opened 1 month ago

tpvasconcelos commented 1 month ago

Description

checkov always eagerly loads all runners and checks on every CLI invocation, regardless of whether they are needed. I guess this was not always an issue, but it currently adds quite a bit of overhead. For instance, a simple `checkov --version` or `checkov --help` on a 4-core/8GB-memory Gitpod instance takes just over 2 seconds. Most of this time is spent importing the hundreds of Python modules and checks.
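To see where that startup time goes, Python's built-in import profiler is handy: it logs (to stderr) one line per imported module, with its self and cumulative import times in microseconds. A rough sketch:

```shell
# Log per-module import times for a checkov invocation (Python 3.7+)
python -X importtime -m checkov.main --version 2> import_times.log

# Sort by the cumulative-time column (field 2 of the pipe-separated output)
# to surface the heaviest import subtrees
sort -t '|' -k 2 -rn import_times.log | head -n 20
```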

If you agree that this is a welcome improvement, I've done some digging into how it could be addressed and am proposing an incremental solution in this pull request. My results show meaningfully reduced runtimes across all benchmarked commands (see the Benchmarking section below).

The current changes in the pull request already pass all existing unit tests, but if this is a desired feature, I'll need to go over them more carefully to make sure the PR is ready for a full review.

Motivation

Firstly, it would be nice not to have to wait over 2 seconds for the output of `checkov --help`.

More seriously, when running multiple checkov checks in a CI/CD pipeline, the time checkov takes to load starts to add up, especially when using checkov with pre-commit. Consider the following pre-commit config, for example:

```yaml
  - repo: https://github.com/bridgecrewio/checkov
    rev: 3.2.255
    hooks:
      - id: checkov_diff
        name: "checkov (ansible)"
        entry: checkov --framework ansible
        files: ^infra/tf/data/ansible/.+\.ya?ml$
      - id: checkov_diff
        name: "checkov (dockerfile)"
        entry: checkov --framework dockerfile
        files: (\.Dockerfile|Dockerfile)$
      - id: checkov_diff
        name: "checkov (github_actions)"
        entry: checkov --framework github_actions
        files: ^\.github/.+\.ya?ml$
      - id: checkov_diff
        name: "checkov (json, yaml)"
        entry: checkov --framework json,yaml
        files: \.(json|ya?ml)$
      - id: checkov_diff
        name: "checkov (terraform)"
        entry: checkov --framework terraform --var-file=infra/tf/terraform.tfvars
        files: ^infra/tf/.+\.(tf|tfvars)$
      - id: checkov_diff
        name: "checkov (k8s)"
        entry: checkov --framework kubernetes
        files: ^infra/k8s/.+\.ya?ml$
```

Benchmarking

The following benchmarks were run both on my local machine (an M3 Mac) and on a 4-core/8GB-memory Gitpod instance. The results can vary quite a bit across Gitpod workspace instances, even when requesting the same resources, so all the benchmarks below were run on the same workspace instance.

I used hyperfine to gather the benchmark statistics and pandas to correlate the results and render them as HTML.

Requirements

(expandable section)
After booting the Gitpod instance, I ran the following commands:

```shell
# Setup Python env
pyenv install 3.8.20
pyenv global 3.8.20
python3.8 -m pip install pipenv
pipenv --python 3.8
pipenv install

# Install hyperfine
wget https://github.com/sharkdp/hyperfine/releases/download/v1.16.1/hyperfine_1.16.1_amd64.deb
sudo dpkg -i hyperfine_1.16.1_amd64.deb

# and pandas...
pipenv install pandas
```

On my local machine (macOS), I also had to install hyperfine and pandas with:

```shell
brew install hyperfine
pipenv install pandas
```

Isolating the effect of the change

(expandable section)
The changes proposed in this pull request should not have any impact on the actual execution of the checks and checkov _Runners_. The effects are only present **before** the runs are triggered. You can verify this yourself by running some local tests. To properly isolate the behaviour changed in this PR and remove any extra sources of noise, I patched `BaseRunner.run()` to simply return an empty `Report` right away. This can be achieved by creating a new entry point under `checkov/main_patched.py` with the following code:

```python
from __future__ import annotations

from typing import Generic, TypeVar

import checkov.common.runners.base_runner
from checkov.common.output.report import Report

_Context = TypeVar("_Context", bound="dict[Any, Any]|None")
_Definitions = TypeVar("_Definitions", bound="dict[Any, Any]|None")
_GraphManager = TypeVar("_GraphManager", bound="GraphManager[Any, Any]|None")


class PatchedBaseRunner(
    checkov.common.runners.base_runner.BaseRunner,
    Generic[_Definitions, _Context, _GraphManager],
):
    def __new__(cls, *args, **kwargs):
        # Replace run() with a no-op that returns an empty Report right away
        def noop_run(self, *args, **kwargs):
            return Report(check_type=self.check_type)

        cls.run = noop_run
        return super().__new__(cls, *args, **kwargs)


checkov.common.runners.base_runner.BaseRunner = PatchedBaseRunner


if __name__ == "__main__":
    import sys

    from checkov.main import Checkov

    ckv = Checkov()
    sys.exit(ckv.run())
```

Test that it works by running:

```shell
pipenv run python -m checkov.main_patched --version
```

Running the benchmark

(expandable section)
I executed the following script to generate the results displayed in the next section:

```shell
#!/usr/bin/env bash
set -o errexit   # Exit on error
set -o pipefail  # Use last non-zero exit code in a pipeline
set -o nounset   # Disallow expansion of unset variables

cmd='python -m checkov.main_patched'
common_checkov_flags='--quiet --compact --skip-results-upload --skip-download'

bench() {
  local branch="${1}"
  local csv_file="bench_checkov_${branch}.csv"
  git checkout "${branch}"
  echo "Benchmarking ${branch}..."
  hyperfine \
    --warmup 3 \
    --min-runs 10 \
    --max-runs 20 \
    --time-unit second \
    --shell=none \
    --export-csv "${csv_file}" \
    "${cmd} ${common_checkov_flags} --version" \
    "${cmd} ${common_checkov_flags} --list" \
    "${cmd} ${common_checkov_flags} --framework=openapi -d tests/openapi/" \
    "${cmd} ${common_checkov_flags} --framework=ansible -d tests/ansible/examples/" \
    "${cmd} ${common_checkov_flags} --framework=terraform -d tests/terraform/checks/data" \
    "${cmd} ${common_checkov_flags} -d tests/"
  sed -i "s|${cmd} ${common_checkov_flags} ||g" "${csv_file}"
  sed -i 's|command|argv|g' "${csv_file}"
}

bench "main"
bench "lazy-cherry"

python - <<EOF
# …
# Break up long lines
df.index = df.index.str.replace(" -d ", "\n-d ")
print(df.to_markdown(tablefmt="github"))
df.to_html("bench_checkov.html", border=0, escape=False, justify="left")
EOF

# Fix table's text-alignment
sed -i 's|||g' bench_checkov.html
sed -i 's|||g' bench_checkov.html
echo "HTML table was saved to ./bench_checkov.html"
```
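Parts of the inline pandas script were garbled in the original post. The gist is to join hyperfine's two CSV exports, compute the relative change in mean runtime, and render the table. Below is a minimal sketch of that post-processing, with small inlined sample data standing in for the two CSV files (`bench_checkov_main.csv` and `bench_checkov_lazy-cherry.csv`); the exact aggregation in the original script may have differed:

```python
import io

import pandas as pd

# Stand-ins for the two hyperfine CSV exports; after the sed rewrites above,
# each file has one row per benchmarked command, indexed by "argv".
eager_csv = io.StringIO("argv,mean,stddev\n--version,0.999,0.034\n--list,2.156,0.068\n")
lazy_csv = io.StringIO("argv,mean,stddev\n--version,0.552,0.013\n--list,1.878,0.057\n")

eager = pd.read_csv(eager_csv).set_index("argv")
lazy = pd.read_csv(lazy_csv).set_index("argv")

# Two-level columns: (Eager|Lazy) x (mean|stddev), plus a percentage delta
df = pd.concat({"Eager": eager, "Lazy": lazy}, axis=1)
pct = df[("Lazy", "mean")] / df[("Eager", "mean")] - 1
df[("pct", "mean")] = pct.map(lambda x: f"🔻{x:.0%}")

print(df.to_string())
df.to_html("bench_checkov.html", border=0, escape=False, justify="left")
```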

Results

M3 - macOS

| argv | Eager mean (s) | Eager stddev | Lazy mean (s) | Lazy stddev | pct |
|---|---|---|---|---|---|
| `--version` | 0.999 | 0.034 | 0.552 | 0.013 | 🔻-45% |
| `--list` | 2.156 | 0.068 | 1.878 | 0.057 | 🔻-13% |
| `--framework=openapi -d tests/openapi/` | 0.940 | 0.022 | 0.597 | 0.030 | 🔻-36% |
| `--framework=ansible -d tests/ansible/examples/` | 0.941 | 0.037 | 0.593 | 0.018 | 🔻-37% |
| `--framework=terraform -d tests/terraform/checks/data` | 0.953 | 0.032 | 0.767 | 0.015 | 🔻-20% |
| `-d tests/` | 0.989 | 0.008 | 0.929 | 0.022 | 🔻-6% |

4-core/8GB-memory Gitpod instance

| argv | Eager mean (s) | Eager stddev | Lazy mean (s) | Lazy stddev | pct |
|---|---|---|---|---|---|
| `--version` | 2.103 | 0.022 | 1.242 | 0.040 | 🔻-41% |
| `--list` | 3.643 | 0.106 | 3.342 | 0.051 | 🔻-8% |
| `--framework=openapi -d tests/openapi/` | 2.177 | 0.065 | 1.273 | 0.026 | 🔻-42% |
| `--framework=ansible -d tests/ansible/examples/` | 2.166 | 0.059 | 1.245 | 0.037 | 🔻-43% |
| `--framework=terraform -d tests/terraform/checks/data` | 2.113 | 0.060 | 1.759 | 0.037 | 🔻-17% |
| `-d tests/` | 2.242 | 0.035 | 2.112 | 0.042 | 🔻-6% |

tpvasconcelos commented 1 month ago

If you feel the examples I used for the benchmarks are not very representative, let me know and I can run a few more.

tpvasconcelos commented 1 month ago

I have also started working on a similar pull request. See https://github.com/tpvasconcelos/checkov/pull/3 for more details.

Updating the results table above with this new improvement shows the following results on my M3 Mac:

| argv | Eager mean (s) | Eager stddev | Lazy mean (s) | Lazy stddev | pct |
|---|---|---|---|---|---|
| `--version` | 0.999 | 0.034 | 0.287 | 0.002 | 🔻-71% |
| `--list` | 2.156 | 0.068 | 1.664 | 0.010 | 🔻-23% |
| `--framework=openapi -d tests/openapi/` | 0.940 | 0.022 | 0.303 | 0.013 | 🔻-68% |
| `--framework=ansible -d tests/ansible/examples/` | 0.941 | 0.037 | 0.299 | 0.002 | 🔻-68% |
| `--framework=terraform -d tests/terraform/checks/data` | 0.953 | 0.032 | 0.543 | 0.010 | 🔻-43% |
| `-d tests/` | 0.989 | 0.008 | 0.755 | 0.005 | 🔻-24% |

tpvasconcelos commented 1 month ago

If I run the pre-commit hooks defined in the Motivation section above in a private repository of mine, they take only 1.38 seconds when running from this branch, versus 2.77 seconds when running against the latest v3.2.255 release, i.e. a 50% improvement.

tpvasconcelos commented 1 month ago

Hey @gruebel @tsmithv11! Apologies for pinging you directly here but I wanted to check whether this is something you'd be interested in for the checkov project.

In short, the patch I'm suggesting makes checkov up to 70% faster when executing simple commands like `checkov --help`, and still as much as 25% faster when running the much more expensive `checkov -d .` command.

If this is something that interests your team, I'd be happy to provide more context where needed, run additional benchmarks, and make the PR ready for review.

Thanks in advance! 🚀

bo156 commented 1 week ago

@tpvasconcelos First of all, thank you for this amazing suggestion 💯 I completely agree with you that commands like `checkov --help` shouldn't be affected by the many imports we do to load all the Python modules.

However, I'm hesitant about your specific PR, as it introduces a lazy-loading mechanism that relies on Python internals (such as references to stack frames). The main drawbacks of using those mechanisms, from our side, are:

  1. It adds extra complexity to the project, requiring a specific understanding of the import mechanism.
  2. In my experience, intercepting imports this way can cause unintended side effects. For example, I believe it might make debugging checkov harder, as the code that "computes" the lazy imports will show up in the stack frames.

Instead, I think you raised an important point: we don't need to import all of the checks' code for many checkov usages. What I suggest is removing the imports of all the checks from the relevant `__init__` files and testing the performance in that case. It might also require following the imports in other locations to make sure there are no other references to them.
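One way to drop those eager imports from the `__init__` files while keeping existing `from ... import SomeCheck` statements working would be a module-level `__getattr__` (PEP 562, Python 3.7+). A minimal sketch with hypothetical module and check names, not checkov's actual layout:

```python
# Hypothetical checks package __init__.py: submodules are imported only when
# one of their names is first accessed, instead of eagerly at package import.
import importlib

# Hypothetical mapping of exported check names to their defining submodules
_LAZY_CHECKS = {
    "S3Encryption": ".s3_encryption",
    "IAMWildcard": ".iam_wildcard",
}


def __getattr__(name):
    # Called only for names not already in the module's namespace (PEP 562)
    try:
        module = importlib.import_module(_LAZY_CHECKS[name], __name__)
    except KeyError:
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}") from None
    return getattr(module, name)
```

With this, `from the_package import S3Encryption` still works, but `s3_encryption` is only imported on first access, so commands that never touch the checks pay none of their import cost.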

@tpvasconcelos Is that something you are willing to consider/try to implement?