Prevent cloud misconfigurations and find vulnerabilities during build-time in infrastructure as code, container images and open source packages with Checkov by Bridgecrew.
Description
checkov always eagerly loads all runners and checks on every CLI invocation, regardless of whether they are needed. I guess this was not always an issue, but it currently adds quite a bit of overhead: a simple `checkov --version` or `checkov --help` on a 4-core/8Gb-memory Gitpod instance takes just over 2 seconds, and most of that time is spent importing the hundreds of Python modules and checks.
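For reference, CPython's `-X importtime` flag makes it easy to see where that startup time goes; a minimal sketch (standard CPython tooling, not part of this PR, and the log file name is just an example):
```shell
# Write per-module import timings (self and cumulative, in microseconds) to stderr
python -X importtime -c "import checkov.main" 2> import_times.log
# Show the slowest imports by cumulative time (column 2 of the "a | b | c" output)
sort -t'|' -k2 -rn import_times.log | head -n 20
```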
If you agree that this is a welcome improvement, I've done some digging into how it could be addressed and am proposing an incremental solution in this pull request. My results show runtime reductions of:
- around -45% for simple CLI calls like `checkov --version`
- around -40% for single-framework runs (e.g. `checkov --framework=ansible -d .`)
- a small improvement (around -6%) even for invocations that trigger all checks, which indicates that the lazy-loading logic itself adds no noticeable overhead

The changes in this pull request already pass all existing unit tests, but if this is a desired feature, I'll still need to go over them more carefully to make sure they are ready for a full review.
Motivation
Firstly, it would be nice not to have to wait over 2 seconds for the output of `checkov --help`.
More seriously, when running multiple checkov checks in a CI/CD pipeline, the time checkov takes to load starts to add up, especially when using checkov with pre-commit, where every hook invocation pays the full startup cost before a single file is scanned.
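For illustration, a pre-commit setup that runs checkov once per framework effectively executes something like this on every commit (the frameworks and directories here are hypothetical examples, not an actual config):
```shell
# Three hooks mean three separate checkov processes, each re-importing every
# runner and check before scanning a single file:
checkov --framework=terraform --quiet --compact -d terraform/
checkov --framework=ansible   --quiet --compact -d playbooks/
checkov --framework=openapi   --quiet --compact -d api-specs/
```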
Benchmarking
The following benchmarks were run both on my local machine (an M3 Mac) and on a 4-core/8Gb-memory Gitpod instance. The results can vary quite a bit across different Gitpod workspace instances, even when requesting the same resources, so all the benchmarks below were run on the same workspace instance.
I used hyperfine to help me gather the benchmark statistics and pandas to correlate and render them in HTML.
Requirements
After booting the Gitpod instance, I ran the following commands:
```shell
# Setup Python env
pyenv install 3.8.20
pyenv global 3.8.20
python3.8 -m pip install pipenv
pipenv --python 3.8
pipenv install
# Install hyperfine
wget https://github.com/sharkdp/hyperfine/releases/download/v1.16.1/hyperfine_1.16.1_amd64.deb
sudo dpkg -i hyperfine_1.16.1_amd64.deb
# and pandas...
pipenv install pandas
```
On my local machine (macOS) I also had to install hyperfine and pandas with:
```shell
brew install hyperfine
pipenv install pandas
```
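Before benchmarking, a quick sanity check that the tools above ended up on the expected versions (nothing checkov-specific here):
```shell
pipenv run python --version    # should report Python 3.8.20
pipenv run python -c "import pandas; print(pandas.__version__)"
hyperfine --version
```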
Isolating the effect of the change
The changes proposed in this pull request should not have any impact on the actual execution of the checks and checkov _Runners_. The effects are only present **before** the runs are triggered. You can verify this yourself by running some local tests.
To properly isolate the behaviour changed in this PR and remove any extra sources of noise, I patched `BaseRunner.run()` to simply return an empty `Report` right away.
This can be achieved by creating a new entry point under `checkov/main_patched.py` with the following code:
```python
from __future__ import annotations

from typing import Generic, TypeVar

import checkov.common.runners.base_runner
from checkov.common.output.report import Report

_Context = TypeVar("_Context", bound="dict[Any, Any]|None")
_Definitions = TypeVar("_Definitions", bound="dict[Any, Any]|None")
_GraphManager = TypeVar("_GraphManager", bound="GraphManager[Any, Any]|None")


class PatchedBaseRunner(
    checkov.common.runners.base_runner.BaseRunner,
    Generic[_Definitions, _Context, _GraphManager],
):
    def __new__(cls, *args, **kwargs):
        # Swap run() for a no-op that returns an empty report, so only the
        # start-up/import behaviour is left to measure.
        def noop_run(self, *args, **kwargs):
            return Report(check_type=self.check_type)

        cls.run = noop_run
        return super().__new__(cls, *args, **kwargs)


# Replace BaseRunner before checkov.main (and the runner modules) are imported,
# so that every runner subclasses the patched base class.
checkov.common.runners.base_runner.BaseRunner = PatchedBaseRunner

if __name__ == "__main__":
    import sys

    from checkov.main import Checkov

    ckv = Checkov()
    sys.exit(ckv.run())
```
Test that it works by running:
```shell
pipenv run python -m checkov.main_patched --version
```
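To further convince yourself that the patched runners never execute any checks, you can also point the patched entry point at one of the repo's test fixture directories and confirm that it comes back with no findings (the directory below is just one of the paths used in the benchmark script):
```shell
pipenv run python -m checkov.main_patched --quiet --compact --skip-results-upload --skip-download -d tests/terraform/checks/data
```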
Running the benchmark
I executed the following script to generate the results displayed in the next section:
```shell
#!/usr/bin/env bash
set -o errexit # Exit on error
set -o pipefail # Use last non-zero exit code in a pipeline
set -o nounset # Disallow expansion of unset variables
cmd='python -m checkov.main_patched'
common_checkov_flags='--quiet --compact --skip-results-upload --skip-download'
bench(){
local branch="${1}"
local csv_file="bench_checkov_${branch}.csv"
git checkout "${branch}"
echo "Benchmarking ${branch}..."
hyperfine \
--warmup 3 \
--min-runs 10 \
--max-runs 20 \
--time-unit second \
--shell=none \
--export-csv "${csv_file}" \
"${cmd} ${common_checkov_flags} --version" \
"${cmd} ${common_checkov_flags} --list" \
"${cmd} ${common_checkov_flags} --framework=openapi -d tests/openapi/" \
"${cmd} ${common_checkov_flags} --framework=ansible -d tests/ansible/examples/" \
"${cmd} ${common_checkov_flags} --framework=terraform -d tests/terraform/checks/data" \
"${cmd} ${common_checkov_flags} -d tests/"
sed -i "s|${cmd} ${common_checkov_flags} ||g" "${csv_file}"
sed -i 's|command|argv|g' "${csv_file}"
}
bench "main"
bench "lazy-cherry"
python <<'EOF'
import pandas as pd

def load(branch):
    # The CSVs were post-processed above: the common command prefix was stripped
    # and the "command" column was renamed to "argv".
    df = pd.read_csv(f"bench_checkov_{branch}.csv", usecols=["argv", "mean", "stddev"])
    return df.set_index("argv")

eager = load("main")
lazy = load("lazy-cherry")
df = pd.concat({"Eager": eager, "Lazy": lazy}, axis=1)
pct = (df[("Lazy", "mean")] - df[("Eager", "mean")]) / df[("Eager", "mean")]
df[("pct", "mean")] = pct.map(lambda x: f"{'🔻' if x < 0 else '🔺'}{x:.0%}")
# Break up long lines
df.index = df.index.str.replace(" -d ", "<br>-d ")
print(df.to_markdown(tablefmt="github"))
df.to_html("bench_checkov.html", border=0, escape=False, justify="left")
EOF
# Fix table's text-alignment
sed -i 's|<td>|<td style="text-align: left">|g' bench_checkov.html
sed -i 's|<th>|<th style="text-align: left">|g' bench_checkov.html
echo "HTML table was saved to ./bench_checkov.html"
```
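The tables in the next section were generated from the CSVs exported above; if you want to look at the raw hyperfine measurements directly, something like this works (`column` is only used here for pretty-printing):
```shell
column -s, -t < bench_checkov_main.csv
```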
Results
M3 - macOS

| argv | Eager mean (s) | Eager stddev (s) | Lazy mean (s) | Lazy stddev (s) | pct |
|---|---|---|---|---|---|
| `--version` | 0.999 | 0.034 | 0.552 | 0.013 | 🔻-45% |
| `--list` | 2.156 | 0.068 | 1.878 | 0.057 | 🔻-13% |
| `--framework=openapi -d tests/openapi/` | 0.940 | 0.022 | 0.597 | 0.030 | 🔻-36% |
| `--framework=ansible -d tests/ansible/examples/` | 0.941 | 0.037 | 0.593 | 0.018 | 🔻-37% |
| `--framework=terraform -d tests/terraform/checks/data` | 0.953 | 0.032 | 0.767 | 0.015 | 🔻-20% |
| `-d tests/` | 0.989 | 0.008 | 0.929 | 0.022 | 🔻-6% |
4-core/8Gb-memory Gitpod instance

| argv | Eager mean (s) | Eager stddev (s) | Lazy mean (s) | Lazy stddev (s) | pct |
|---|---|---|---|---|---|
| `--version` | 2.103 | 0.022 | 1.242 | 0.040 | 🔻-41% |
| `--list` | 3.643 | 0.106 | 3.342 | 0.051 | 🔻-8% |
| `--framework=openapi -d tests/openapi/` | 2.177 | 0.065 | 1.273 | 0.026 | 🔻-42% |
| `--framework=ansible -d tests/ansible/examples/` | 2.166 | 0.059 | 1.245 | 0.037 | 🔻-43% |
| `--framework=terraform -d tests/terraform/checks/data` | 2.113 | 0.060 | 1.759 | 0.037 | 🔻-17% |
| `-d tests/` | 2.242 | 0.035 | 2.112 | 0.042 | 🔻-6% |
If you feel like the examples I used for the benchmarks are not very representative, let me know and I can run a few more.

I have also started working on a similar pull request; see https://github.com/tpvasconcelos/checkov/pull/3 for more details. Updating the results table above with this new improvement shows the following results on my M3 Mac:

| argv | Eager mean (s) | Eager stddev (s) | Lazy mean (s) | Lazy stddev (s) | pct |
|---|---|---|---|---|---|
| `--version` | 0.999 | 0.034 | 0.287 | 0.002 | 🔻-71% |
| `--list` | 2.156 | 0.068 | 1.664 | 0.010 | 🔻-23% |
| `--framework=openapi -d tests/openapi/` | 0.940 | 0.022 | 0.303 | 0.013 | 🔻-68% |
| `--framework=ansible -d tests/ansible/examples/` | 0.941 | 0.037 | 0.299 | 0.002 | 🔻-68% |
| `--framework=terraform -d tests/terraform/checks/data` | 0.953 | 0.032 | 0.543 | 0.010 | 🔻-43% |
| `-d tests/` | 0.989 | 0.008 | 0.755 | 0.005 | 🔻-24% |
If I run the pre-commit setup described in the Motivation section above in a private repository of mine, it takes only 1.38 seconds when running from this branch, compared to 2.77 seconds against the latest `v3.2.255` release, i.e. a -50% improvement.

Hey @gruebel @tsmithv11! Apologies for pinging you directly here, but I wanted to check whether this is something you'd be interested in for the checkov project.

In short, the patch I'm suggesting makes checkov up to 70% faster for simple commands like `checkov --help`, and as much as 25% faster when running the much more expensive `checkov -d .` command.

If this is something that interests your team, I'd be happy to provide more context where needed, run additional benchmarks, and make the PR ready for review.

Thanks in advance! 🚀

@tpvasconcelos First of all, thank you for this amazing suggestion 💯

I completely agree with you that commands like `checkov --help` shouldn't be affected by the many imports we need to load all Python modules.

However, I'm hesitant about your specific PR, as it introduces a specific lazy-loading mechanism that relies on Python internals (such as references to the stack frames). The main drawbacks, from our side, of using those mechanisms are:

- It adds extra complexity to the project, which requires a specific understanding of the import mechanism.
- From my experience, affecting the imports in such a way might cause unintended side effects. For example, I believe this might make debugging checkov harder, as it will add the code for "computing" the lazy imports to the stack frames.

Instead, I think you raised an important point: we don't need to import all of the checks' code for many checkov usages. What I suggest is trying to remove the imports of all the `checks` from the relevant `__init__` files and testing the performance in that case. It might require following the imports in other locations as well, to make sure there are no other references to them.

@tpvasconcelos Is that something you are willing to consider/try to implement?