Component tests - Githubissues

amirmalka commented 7 months ago

Type

enhancement, tests

Description

Added comprehensive component tests for node-agent, including scenarios for alert generation, resource usage monitoring, and malicious behavior simulation.
Implemented a simulated malicious application in Go for testing various security rules.
Developed utility functions for Kubernetes and Prometheus to facilitate component testing.
Updated the crypto mining domains list in the rule engine for accuracy.
Removed unnecessary debug logs from the malware manager for cleaner code.
Introduced new Kubernetes ConfigMap templates for configuring node-agent via Helm charts.
Added a locust file for performing load testing on the node-agent.

Changes walkthrough

Relevant files

Tests

component_test.go `Add Comprehensive Component Tests for Node-Agent` tests/component_test.go Added comprehensive component tests for node-agent functionalities. Tests include scenarios like basic alert tests, all alerts from malicious app, basic load activities, memory leak tests, and process kill in the middle of execution. Utilizes utility functions from 'testutils' for creating namespaces, workloads, and handling Prometheus metrics.	+399/-0
malicious.go `Implement Malicious App Simulation for Testing` tests/images/malicious-app/malicious.go Implemented a malicious application behavior simulation for testing. Includes various malicious activities like unexpected process launch, file access, system calls, and network behaviors. Supports configurable behaviors through environment variables.	+305/-0
k8s.go `Add Kubernetes Utilities for Component Tests` tests/testutils/k8s.go Added utility functions for Kubernetes operations in tests. Functions include creating test workloads, executing commands in pods, and handling application profiles.	+257/-0
prometheus.go `Add Prometheus Interaction Utilities for Tests` tests/testutils/prometheus.go Added utility functions for interacting with Prometheus in tests. Functions for fetching and plotting CPU and memory usage metrics.	+242/-0
locustfile.py `Add Locust File for Load Testing` tests/images/locustimage/locustfile.py Added a locust file for load testing with HTTP requests to a target URL.	+10/-0

Enhancement

r1008_crypto_mining_domain.go `Update Crypto Mining Domains List` pkg/ruleengine/v1/r1008_crypto_mining_domain.go Updated the list of crypto mining domains with correct domain names ending with a period.	+105/-105
malware_manager.go `Clean up Debug Logs in Malware Manager` pkg/malwaremanager/v1/malware_manager.go Removed debug log statements from various functions in the malware manager.	+4/-13

Configuration changes

configmap.yaml

Add Node-Agent ConfigMap Template

tests/chart/templates/node-agent/configmap.yaml

Created a new ConfigMap for node-agent with various configurations
based on values from Helm chart.

+40/-0

✨ PR-Agent usage: Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

codiumai-pr-agent-free[bot] commented 7 months ago

PR Description updated to latest commit (https://github.com/kubescape/node-agent/commit/645cbf135799a682ab03beb7472f0955a052756d)

codiumai-pr-agent-free[bot] commented 7 months ago

PR Review

⏱️ Estimated effort to review [1-5]	4, due to the extensive changes across multiple files including new features, tests, and configurations. The PR introduces a comprehensive component test framework, updates to the chart configurations, and additional resources for testing. Reviewing this PR requires a thorough understanding of the new test framework, the impact of the changes on existing functionalities, and ensuring that the new configurations and resources are correctly implemented.
🧪 Relevant tests	Yes
🔍 Possible issues	Possible Bug: The script `tests/scripts/setup-component-test-cluster.sh` assumes the presence of certain tools (kubectl, helm) and environments (kind, minikube) without checking or handling the absence of these prerequisites. This could lead to failures in environments where these tools are not installed or configured.
🔍 Possible issues	Performance Concern: The resource limits and requests specified in `tests/chart/values.yaml` for various components (e.g., nodeAgent, storage) might not be optimal for all testing environments. It's important to validate these values to prevent potential resource exhaustion or performance bottlenecks.
🔒 Security concerns	No

✨ Review tool usage guide:

**Overview:** The `review` tool scans the PR code changes, and generates a PR review which includes several types of feedbacks, such as possible PR issues, security threats and relevant test in the PR. More feedbacks can be [added](https://pr-agent-docs.codium.ai/tools/review/#general-configurations) by configuring the tool. The tool can be triggered [automatically](https://pr-agent-docs.codium.ai/usage-guide/automations_and_usage/#github-app-automatic-tools-when-a-new-pr-is-opened) every time a new PR is opened, or can be invoked manually by commenting on any PR. - When commenting, to edit [configurations](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/configuration.toml#L23) related to the review tool (`pr_reviewer` section), use the following template: ``` /review --pr_reviewer.some_config1=... --pr_reviewer.some_config2=... ``` - With a [configuration file](https://pr-agent-docs.codium.ai/usage-guide/configuration_options/), use the following template: ``` [pr_reviewer] some_config1=... some_config2=... ``` See the review [usage page](https://pr-agent-docs.codium.ai/tools/review/) for a comprehensive guide on using this tool.

codiumai-pr-agent-free[bot] commented 7 months ago

PR Code Suggestions

Category	Suggestions
Enhancement	Improve error handling by raising an exception instead of just printing an error message. ___ It's recommended to handle the case where `get_active_alerts` returns `None` more gracefully in the `get_alerts` method. Instead of just printing an error message and returning `1`, consider raising a custom exception that can be caught and handled by the caller. This approach provides more control over error handling and makes the code more robust. [tests/component-tests/run.py [69-72]](https://github.com/kubescape/node-agent/pull/245/files#diff-20b60ceaa69a2e0ff7634345b4c64f62017b6d45c4cb1737c7bda2ae34a2fe8bR69-R72) ```diff alerts = get_active_alerts(alert_manager_url) -if not alerts: - print("Could not get alerts") - return 1 +if alerts is None: + raise ValueError("Failed to retrieve active alerts") ```
	Replace fixed sleep time with a dynamic wait mechanism for better test efficiency and reliability. ___ Instead of using a fixed sleep time to wait for garbage collection or other asynchronous operations, consider implementing a more dynamic wait mechanism. For example, you can poll a specific condition at regular intervals until it's met or a timeout is reached. This approach can make tests run faster in cases where the condition is met sooner than the fixed wait time and more reliable in cases where more time is needed. [tests/component-tests/load_10k_alerts_no_memory.py [46-47]](https://github.com/kubescape/node-agent/pull/245/files#diff-785423cbbbd3c23a592a9c54c34cfbd78a7c5f8df23b7c87263453b75c2e0f6cR46-R47) ```diff -print("Waiting 300 seconds to have a baseline memory usage") -time.sleep(300) +timeout = 300 +start_time = time.time() +while time.time() - start_time < timeout: + # Check condition here (e.g., memory usage stabilization) + if condition_met: + break + time.sleep(10) # Check every 10 seconds ```
	Use `subprocess.run` for better subprocess management. ___ Instead of using `subprocess.call` and `subprocess.check_call` directly, consider using `subprocess.run` for better clarity and control over subprocess execution. This also allows capturing output and errors more effectively. [tests/component-tests/kubernetes_wrappers.py [14-15]](https://github.com/kubescape/node-agent/pull/245/files#diff-6d6d4539fc2bb63c97c4ad914db829999aa68ccff05c6ce63bb8517be57ede90R14-R15) ```diff -if subprocess.call(["kubectl", "get", "namespace", self.ns_name]) != 0: - subprocess.check_call(["kubectl", "create", "namespace", self.ns_name]) +proc = subprocess.run(["kubectl", "get", "namespace", self.ns_name], capture_output=True) +if proc.returncode != 0: + subprocess.run(["kubectl", "create", "namespace", self.ns_name], check=True) ```
Best practice	Close matplotlib figures after saving to prevent memory leaks. ___ When plotting with matplotlib, it's a good practice to close the figure after saving it to free up resources. This is especially important in scripts that generate many figures, as it can prevent memory leaks and ensure that the backend resources are properly managed. [tests/component-tests/promtopic.py [73-74]](https://github.com/kubescape/node-agent/pull/245/files#diff-b3f961c69e8cf2dbd847f70baf64e4bdb4cd0345e2e9cc5b12e5b4e53a31c606R73-R74) ```diff plt.savefig('%s.png'%filename) plt.clf() +plt.close() ```
	Add a timeout to subprocess calls to avoid indefinite hanging. ___ The subprocess calls in the test functions use both `check_call` and `check_output` methods without specifying a timeout. It's a good practice to specify a timeout for these calls to avoid hanging indefinitely if the command does not complete. This makes the tests more robust and prevents them from stalling due to unforeseen issues. [tests/component-tests/kill_in_the_middle.py [18]](https://github.com/kubescape/node-agent/pull/245/files#diff-93d8650b6caf90839ed06aee168bbe3cae8a26f2ffce0aaaa1d0d2d475213c50R18-R18) ```diff -subprocess.check_call(["kubectl", "-n", namespace , "apply", "-f", "dev/nginx/nginx-deployment.yaml"]) +subprocess.check_call(["kubectl", "-n", namespace , "apply", "-f", "dev/nginx/nginx-deployment.yaml"], timeout=300) ```
	Use a context manager for subprocess calls to prevent resource leaks. ___ Consider using a context manager (Python's `with` statement) for subprocess calls where `capture_output=True` is used. This ensures that subprocesses are properly cleaned up after execution, which can help prevent resource leaks. [tests/component-tests/creation_app_profile_memory_leak.py [37-39]](https://github.com/kubescape/node-agent/pull/245/files#diff-ec3fec66c497ceba2fbf47e9d28011e163a043b2e23cbf28121bff66a0e32111R37-R39) ```diff -get_proc = subprocess.run(["kubectl", "-n", os.environ.get("STORE_NAMESPACE"), "get", "applicationprofiles", f"pod-{nginx_pod_name}-test-namespace", "-oyaml"], capture_output=True) +with subprocess.Popen(["kubectl", "-n", os.environ.get("STORE_NAMESPACE"), "get", "applicationprofiles", f"pod-{nginx_pod_name}-test-namespace", "-oyaml"], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc: + stdout, stderr = proc.communicate() + get_proc = subprocess.CompletedProcess(args=proc.args, returncode=proc.returncode, stdout=stdout, stderr=stderr) ```
	Replace hard-coded IP addresses with DNS names. ___ It is recommended to avoid using hard-coded IP addresses in Kubernetes resource definitions. Instead, use DNS names that can be resolved within the cluster or externally, depending on the requirement. This makes the configuration more flexible and less prone to errors if the IP addresses change. [tests/component-tests/resources/malicious-job-app-profile-namespaced.yaml [34-45]](https://github.com/kubescape/node-agent/pull/245/files#diff-6e13638f8573ff8ee5b6c254910c1a2a20ef8d53c006ccbdddc64d8067ba5213R34-R45) ```diff -- dstEndpoint: 10.96.0.10 -- dstEndpoint: 34.107.204.206 -- dstEndpoint: 10.96.0.1 -- dstEndpoint: 142.251.37.68 +- dstEndpoint: +- dstEndpoint: +- dstEndpoint: +- dstEndpoint: ```
	Define resource limits and requests for containers. ___ Consider specifying resource limits and requests for containers to ensure that the application does not consume more resources than expected. This practice helps in avoiding resource contention and ensures better stability and performance of the application. [tests/component-tests/resources/malicious-job-app-profile-namespaced.yaml [10-11]](https://github.com/kubescape/node-agent/pull/245/files#diff-6e13638f8573ff8ee5b6c254910c1a2a20ef8d53c006ccbdddc64d8067ba5213R10-R11) ```diff containers: -- dns: +- resources: + limits: + cpu: "1" + memory: "500Mi" + requests: + cpu: "0.5" + memory: "200Mi" + dns: ```
	Replace hardcoded IP addresses with DNS names. ___ It's recommended to avoid using hardcoded IP addresses in Kubernetes configurations. Instead, consider using DNS names that can be resolved within the cluster or externally, depending on the use case. This approach enhances maintainability and flexibility of the configuration. [tests/component-tests/resources/malicious-job-app-profile.yaml [32-43]](https://github.com/kubescape/node-agent/pull/245/files#diff-61b7b40772aa82e0edee0726dfd2ca9ae923477492009ffb6e6738497a20a105R32-R43) ```diff -- dstEndpoint: 10.96.0.10 -- dstEndpoint: 34.107.204.206 -- dstEndpoint: 10.96.0.1 -- dstEndpoint: 142.251.37.68 +- dstEndpoint: ..svc.cluster.local +- dstEndpoint: +- dstEndpoint: ..svc.cluster.local +- dstEndpoint: ```
	Use `O_RDONLY` for files that don't require write access. ___ Avoid using `O_RDWR` flag for files that do not require write access. For example, configuration files or certificates should typically be opened with `O_RDONLY` to minimize the risk of accidental or malicious modifications. [tests/component-tests/resources/malicious-job-app-profile.yaml [86-87]](https://github.com/kubescape/node-agent/pull/245/files#diff-61b7b40772aa82e0edee0726dfd2ca9ae923477492009ffb6e6738497a20a105R86-R87) ```diff - flags: - - O_RDWR - - O_CREAT - - O_EXCL + - O_RDONLY - O_CLOEXEC path: /root/.kube/cache/http/.diskv-temp/1551865526 ```
Maintainability	Avoid modifying global variables based on command-line arguments to prevent unexpected behavior. ___ The global variables `alert_manager_url` and `prometheus_url` are modified based on command-line arguments, which can lead to unexpected behavior when running tests multiple times in the same session. Consider passing these URLs as parameters to functions that need them or encapsulating them within a class to avoid potential issues with global state. [tests/component-tests/run.py [85-90]](https://github.com/kubescape/node-agent/pull/245/files#diff-20b60ceaa69a2e0ff7634345b4c64f62017b6d45c4cb1737c7bda2ae34a2fe8bR85-R90) ```diff -global alert_manager_url -global prometheus_url -if len(sys.argv) > 1: - alert_manager_url = sys.argv[1] -if len(sys.argv) > 2: - prometheus_url = sys.argv[2] +def main(alert_manager_url="http://localhost:9093/", prometheus_url="http://localhost:9090/"): + if len(sys.argv) > 1: + alert_manager_url = sys.argv[1] + if len(sys.argv) > 2: + prometheus_url = sys.argv[2] + # Pass these URLs as parameters to functions that need them ```
	Refactor repeated subprocess calls into a helper function for better readability. ___ To improve the readability and maintainability of the test cases, consider refactoring the repeated subprocess calls into a helper function. This function could take the file path as an argument and return the `subprocess.run` result. [tests/component-tests/rule_binding_apply_test.py [7-8]](https://github.com/kubescape/node-agent/pull/245/files#diff-42c5cb7eabc8e70d993b289c0e75ca19a1f611717bbda837c826d0bdc1c738fbR7-R8) ```diff -proc_stat = subprocess.run(["kubectl", "apply", "-f", "tests/component-tests/rule_binding_crds_files/all-valid.yaml"], check=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE) +def run_kubectl_apply(file_path): + return subprocess.run(["kubectl", "apply", "-f", file_path], check=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE) +proc_stat = run_kubectl_apply("tests/component-tests/rule_binding_crds_files/all-valid.yaml") + ```
	Simplify DNS management by grouping related DNS names. ___ For better maintainability and readability, consider grouping related DNS names under a single domain to minimize the number of entries and simplify DNS management. [tests/component-tests/resources/malicious-job-app-profile-namespaced.yaml [12-16]](https://github.com/kubescape/node-agent/pull/245/files#diff-6e13638f8573ff8ee5b6c254910c1a2a20ef8d53c006ccbdddc64d8067ba5213R12-R16) ```diff -- dnsName: dl.k8s.io.default.svc.cluster.local. -- dnsName: dl.k8s.io.svc.cluster.local. -- dnsName: dl.k8s.io.cluster.local. -- dnsName: dl.k8s.io.wework.com. -- dnsName: dl.k8s.io. +- dnsName: "*.dl.k8s.io." ```
Reliability	Add a retry mechanism with exponential backoff for checking expected alerts. ___ For better test reliability, consider adding a retry mechanism with exponential backoff when checking for expected alerts. This can help account for delays in alert generation and propagation. [tests/component-tests/all_alerts_from_malicious_app.py [54-57]](https://github.com/kubescape/node-agent/pull/245/files#diff-21b7d56a0387f8825d5bbd2da9cce51e248e4c4f2bd143c75918d7c2b16d0cc6R54-R57) ```diff -for alert in alerts: - rule_name = alert['labels']['rule_name'] - if rule_name in expected_alerts: - expected_alerts.remove(rule_name) +import time +retries = 3 +delay = 2 # seconds +while retries > 0 and expected_alerts: + for alert in alerts: + rule_name = alert['labels']['rule_name'] + if rule_name in expected_alerts: + expected_alerts.remove(rule_name) + if expected_alerts: + time.sleep(delay) + delay *= 2 + retries -= 1 + alerts = test_framework.get_alerts(namespace=ns) # Refresh alerts ```
Reliability	Add error handling for subprocess and requests calls in `pprof_recorder`. ___ To ensure that the `pprof_recorder` class is more robust, consider adding error handling for the subprocess and requests calls. This could include catching exceptions and providing meaningful error messages or fallbacks. [tests/component-tests/pprof.py [13-29]](https://github.com/kubescape/node-agent/pull/245/files#diff-5587672b163c77249bf01665875e2f64dfb5fbd810002742f61cc268e219601cR13-R29) ```diff -self.proc = subprocess.Popen(port_forward_command, shell=True) -response = requests.get(url) +try: + self.proc = subprocess.Popen(port_forward_command, shell=True) +except Exception as e: + print(f"Failed to start port-forwarding: {e}") + # Handle error or cleanup +try: + response = requests.get(url) + response.raise_for_status() +except requests.exceptions.RequestException as e: + print(f"Failed to record pprof data: {e}") + # Handle error or cleanup + ```
Security	Review and restrict file operation permissions to enhance security. ___ Ensure that the use of broad file permissions, especially with `O_RDWR` and `O_CREAT`, is intentional and secure. Overly permissive file operations can lead to security vulnerabilities, such as unauthorized file access or modification. [tests/component-tests/resources/malicious-job-app-profile-namespaced.yaml [47-55]](https://github.com/kubescape/node-agent/pull/245/files#diff-6e13638f8573ff8ee5b6c254910c1a2a20ef8d53c006ccbdddc64d8067ba5213R47-R55) ```diff - flags: - - O_RDWR - - O_CREAT - - O_TRUNC + - O_RDONLY - O_CLOEXEC path: /kubectl ```
	Review and minimize the set of allowed syscalls for security. ___ To enhance the security posture, consider reviewing and minimizing the set of allowed syscalls. Limiting syscalls can help in reducing the attack surface of the container and protect against certain types of exploits. [tests/component-tests/resources/malicious-job-app-profile-namespaced.yaml [377-379]](https://github.com/kubescape/node-agent/pull/245/files#diff-6e13638f8573ff8ee5b6c254910c1a2a20ef8d53c006ccbdddc64d8067ba5213R377-R379) ```diff - access -- arch_prctl -- bind +# Review and potentially remove unnecessary syscalls ```
	Review and secure `execs` operations. ___ Ensure that the use of `execs` with potentially sensitive operations, such as accessing secrets or executing commands that can alter the cluster state, is intentional and follows the principle of least privilege. Review and apply appropriate Role-Based Access Control (RBAC) policies to limit the permissions of the containers. [tests/component-tests/resources/malicious-job-app-profile.yaml [24-28]](https://github.com/kubescape/node-agent/pull/245/files#diff-61b7b40772aa82e0edee0726dfd2ca9ae923477492009ffb6e6738497a20a105R24-R28) ```diff - args: - get - secrets envs: [] path: ./kubectl +# Ensure appropriate RBAC policies are applied ```
	Ensure restrictive file permissions for sensitive operations. ___ For file operations, consider specifying more restrictive permissions when creating files, especially for sensitive locations such as `/root/.kube/cache`. Using `O_CREAT` with `O_EXCL` is good, but ensure the permissions are set to avoid unauthorized access. [tests/component-tests/resources/malicious-job-app-profile.yaml [86-87]](https://github.com/kubescape/node-agent/pull/245/files#diff-61b7b40772aa82e0edee0726dfd2ca9ae923477492009ffb6e6738497a20a105R86-R87) ```diff - flags: - O_RDWR - O_CREAT - O_EXCL - O_CLOEXEC path: /root/.kube/cache/http/.diskv-temp/1551865526 +# Consider setting more restrictive permissions ```
	Review and secure network activity configurations. ___ Ensure that network activities, especially those involving potentially sensitive endpoints or non-standard ports, are necessary and secure. Review the network activity configuration to confirm that it aligns with the intended behavior and security policies of the application. [tests/component-tests/resources/malicious-job-app-profile.yaml [30-43]](https://github.com/kubescape/node-agent/pull/245/files#diff-61b7b40772aa82e0edee0726dfd2ca9ae923477492009ffb6e6738497a20a105R30-R43) ```diff networkActivity: outgoing: - - dstEndpoint: 10.96.0.10 + - dstEndpoint: ..svc.cluster.local port: 53 protocol: UDP - - dstEndpoint: 34.107.204.206 + - dstEndpoint: port: 443 protocol: TCP +# Review and confirm network activity configuration ```

✨ Improve tool usage guide:

**Overview:** The `improve` tool scans the PR code changes, and automatically generates suggestions for improving the PR code. The tool can be triggered [automatically](https://pr-agent-docs.codium.ai/usage-guide/automations_and_usage/#github-app-automatic-tools-when-a-new-pr-is-opened) every time a new PR is opened, or can be invoked manually by commenting on a PR. - When commenting, to edit [configurations](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/configuration.toml#L78) related to the improve tool (`pr_code_suggestions` section), use the following template: ``` /improve --pr_code_suggestions.some_config1=... --pr_code_suggestions.some_config2=... ``` - With a [configuration file](https://pr-agent-docs.codium.ai/usage-guide/configuration_options/), use the following template: ``` [pr_code_suggestions] some_config1=... some_config2=... ``` See the improve [usage page](https://pr-agent-docs.codium.ai/tools/improve/) for a comprehensive guide on using this tool.

amirmalka commented 7 months ago

@dwertent The PR is not ready for review...