Closed eric-forte-elastic closed 8 months ago
Upon further experimentation, we discovered that simply multi-threading the loading of rule files and/or the init of the RuleLoader can have unintended consequences. While unit test speed may increase depending on configuration (see PR for more details), a basic instantiation of the RuleLoader takes longer to load with multi-threading than without it. Given this, it is expected that much of the execution time for loading the rules is I/O bound. As such, I would recommend closing this issue and deferring specific optimizations until we make broader updates/refactoring to the RuleLoader class.
```python
import time

from detection_rules.rule_loader import RuleCollection

# Time a basic load of the default rule collection.
start_time = time.time()
rules = RuleCollection.default()
end_time = time.time()

execution_time = end_time - start_time
print(f"Execution time: {execution_time} seconds")
```
Base execution time, no multi-threading.

```shell
detection-rules on multi_thread_rule_loader [?] is v0.1.0 via v3.8.18 (venv) on eric.forte took 56s
❯ python test_rule_loader.py
Execution time: 76.79581332206726 seconds
```

Multi-threading just the file loading, which leads to errors with loading.

```shell
detection-rules on multi_thread_rule_loader [!?] is v0.1.0 via v3.8.18 (venv) on eric.forte
❯ python test_rule_loader.py
Error loading rule in /home/forteea1/Code/clean_mains/detection-rules/rules/integrations/azure/defense_evasion_azure_service_principal_addition.toml
Error loading rule in /home/forteea1/Code/clean_mains/detection-rules/rules/integrations/google_workspace/collection_google_drive_ownership_transferred_via_google_workspace.toml
Error loading rule in /home/forteea1/Code/clean_mains/detection-rules/rules/integrations/google_workspace/initial_access_external_user_added_to_google_workspace_group.toml
Execution time: 236.1852207183838 seconds
```

Multi-threading just the init.

```shell
detection-rules on multi_thread_rule_loader [!?] is v0.1.0 via v3.8.18 (venv) on eric.forte
❯ python test_rule_loader.py
Execution time: 133.23614525794983 seconds
```
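For comparing variants like the three runs above, a small harness that repeats each measurement can reduce noise from one-off effects such as a cold disk cache. The sketch below is only an illustration: `time_call`, `cpu_bound_item`, `load_serial`, and `load_threaded` are hypothetical names, and the workload is a stand-in for per-rule parsing and validation, not the actual detection-rules code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_call(func, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of func()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def cpu_bound_item(n=20_000):
    # Hypothetical stand-in for parsing/validating one rule (pure-Python CPU work).
    return sum(i * i for i in range(n))


def load_serial(count=200):
    # Process every item in the calling thread.
    return [cpu_bound_item() for _ in range(count)]


def load_threaded(count=200, workers=8):
    # Process the same items through a thread pool; results keep input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda _: cpu_bound_item(), range(count)))


if __name__ == "__main__":
    print(f"serial:   {time_call(load_serial):.3f} s")
    print(f"threaded: {time_call(load_threaded):.3f} s")
```

In CPython, threads generally do not speed up pure-Python CPU-bound work because of the global interpreter lock, so a comparison like this can help indicate whether a given loading step behaves more like I/O-bound or CPU-bound work.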
This has been moved to the Foundational Prep Meta and put back on deck.
In effect, may be a duplicate of: https://github.com/elastic/detection-rules/issues/2609
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This has been closed due to inactivity. If you feel this is an error, please re-open and include a justifying comment.
Summary
One of the largest contributors to the time it takes to run unit tests is the rule loader. A significant part of that time is spent adding and validating rules in the RuleCollection class's initialization function.
This issue proposes that, prior to any potential refactor of the rule loader, we make a minor update to the RuleCollection class to multi-thread the adding of rules in the init. While this is a minor change, it should provide noticeably faster load times, and thus faster unit tests.
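A minimal sketch of what such a change could look like, using only the standard library. The names `load_rule` and `load_rules_threaded` are hypothetical and do not reflect the actual RuleCollection API; `load_rule` simply reads a file as a placeholder for parsing and validating a rule. Worker threads return their results, and the calling thread is the only one that appends to the collection, so no shared state is mutated concurrently.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_rule(path):
    # Hypothetical stand-in for reading, parsing, and validating one rule file.
    text = Path(path).read_text(encoding="utf-8")
    return {"path": str(path), "size": len(text)}


def load_rules_threaded(paths, workers=4):
    """Load rules in worker threads; collect results and errors in the caller."""
    rules, errors = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [(path, pool.submit(load_rule, path)) for path in paths]
        for path, future in futures:
            try:
                rules.append(future.result())
            except Exception as exc:  # record the failure and keep loading the rest
                errors.append((str(path), exc))
    return rules, errors


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create a few placeholder rule files for the demonstration.
        paths = []
        for i in range(5):
            rule_path = Path(tmp) / f"rule_{i}.toml"
            rule_path.write_text(f'name = "rule {i}"\n', encoding="utf-8")
            paths.append(rule_path)
        rules, errors = load_rules_threaded(paths)
        print(f"loaded {len(rules)} rules, {len(errors)} errors")
```

Whether this helps depends on where the time goes: threads can overlap file reads, but in CPython they are unlikely to speed up parsing and validation done in pure Python.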