wazuh / wazuh-agent

Wazuh agent, the Wazuh agent for endpoints.
GNU Affero General Public License v3.0

Migrate Syscollector module to new agent #17

Open cborla opened 1 week ago

cborla commented 1 week ago

Description

Migrate Syscollector and DBsync code from the wazuh/wazuh repository to the wazuh/wazuh-agent repository.

Tasks

  1. Identify the code in the wazuh/wazuh repository.
  2. Migrate the code to the wazuh/wazuh-agent repository.
  3. Refactor the code as necessary to fit the new repository structure.
  4. Test the migrated code to ensure it works correctly in the new repository.

Implementation Constraints

  1. Code Migration: The code for Syscollector and DBsync must be migrated to the new repository. This task depends on the migration spike: https://github.com/wazuh/wazuh/issues/24037.
  2. Modular Implementation: The new Inventory will be implemented as a module of the agent, following the predefined scaffolding: #1.
  3. Queue Integration: Messages generated by Syscollector must be inserted into the new Queue component. #16
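
The queue integration constraint can be pictured with a minimal sketch, assuming a stand-in queue. `MessageQueue`, `Inventory`, `reportPackage` and the message layout are hypothetical names chosen for illustration; the real Queue component from #16 may expose a different API.

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for the agent's Queue component (#16).
class MessageQueue {
public:
    void push(std::string message) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_messages.push(std::move(message));
    }

    // Moves the oldest message into `out`; returns false if the queue is empty.
    bool tryPop(std::string& out) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_messages.empty()) {
            return false;
        }
        out = std::move(m_messages.front());
        m_messages.pop();
        return true;
    }

private:
    std::mutex m_mutex;
    std::queue<std::string> m_messages;
};

// Hypothetical inventory component: scan results are handed to the queue
// instead of being sent directly.
class Inventory {
public:
    explicit Inventory(MessageQueue& queue) : m_queue(queue) {}

    // Builds an illustrative JSON-like message (no escaping is performed).
    void reportPackage(const std::string& name, const std::string& version) {
        m_queue.push(R"({"type":"package","name":")" + name +
                     R"(","version":")" + version + R"("})");
    }

private:
    MessageQueue& m_queue;
};
```

The module only depends on the queue's `push` operation in this sketch, so whichever component consumes the messages stays decoupled from the inventory scan.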

Dependencies

Migration spike: https://github.com/wazuh/wazuh/issues/24037

nbertoldo commented 1 week ago

Code Migration

The code to migrate has been identified in the wazuh/wazuh repository.

| Folder | File(s) | Keep | Notes |
|---|---|---|---|
| architecture | syscollector | | |
| src/config | wmodules-syscollector.c | | Config will be migrated to TOML |
| src/headers & src/shared | agent_messages_adapter.h/agent_messages_adapter.c | | |
| | logging_helper.h/logging_helper.c | | |
| | sym_load.h/sym_load.c | 🟡 | Remove unused functions |
| src/shared_modules | utils/hashHelper.h | | |
| | utils/abstractLocking.hpp | | dbsync |
| | utils/builder.hpp | | dbsync |
| | utils/cjsonSmartDeleter.hpp | | dbsync |
| | utils/customDeleter.hpp | | dbsync |
| | utils/mapWrapperSafe.h | | dbsync |
| | utils/pipelineNodesImp.h | | dbsync |
| | utils/pipelinePattern.h | | dbsync |
| | dbsync | | |
| src/wazuh_modules | syscollector | | The full directory is kept |
| | wm_syscollector.h/wm_syscollector.c | | |
| tests/integration | test_syscollector | | |

Once this code has been migrated to the wazuh/wazuh-agent repository, it must be refactored to fit the new repository structure, as follows:

wazuh-agent/
├── src/
│   ├── CMakeLists.txt
│   ├── vcpkg.json
│   ├── modules/
│   │   ├── inventory/
│   │   │   ├── CMakeLists.txt
│   │   │   ├── include/
│   │   │   │   ├── inventory.hpp
│   │   │   │   └── ...
│   │   │   ├── src/
│   │   │   │   ├── inventory.cpp
│   │   │   │   └── ...
│   │   │   └── tests/
│   │   │       ├── CMakeLists.txt
│   │   │       └── test_inventory.cpp
│   │   └── [additional modules...]
│   ├── common/
│   │   ├── dbsync/
│   │   └── [additional modules...]
│   └── build/
│       └── [build output...]
├── etc/
│   ├── config/
│   ├── selinux/
│   └── ruleset/
│       ├── sca/
│       └── rootcheck/
├── packages/
│   └── installers/
│       ├── unix/ (former init folder, including upgrade.sh and install.sh)
│       └── win32/
└── bump-version.sh

ncvicchi commented 1 week ago

As a prerequisite for the Inventory or any other module implementation, the wmodule class needs to be implemented.

Analysis of different approaches is being performed.

1st approach - Simple interface

A simple interface such as:

interface:

```cpp
#include <string>

class Configuration; // Agent configuration type, defined elsewhere

struct IModule {
    virtual ~IModule() = default;
    virtual void run() = 0;
    virtual void stop() = 0;
    virtual int setup(const Configuration& config) = 0;
    virtual std::string command(const std::string& query) = 0;
    virtual std::string name() const = 0;
};
```

could be enough, leaving the implementation of every method specifically to each module.

Although this approach is flexible, each module then has to implement and maintain all of this logic on its own.
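
To make the trade-off concrete, here is a minimal sketch of a module written against that interface. `InventoryModule`, the `interval` field of the placeholder `Configuration` type and the `"status"` query are assumptions made for illustration, not part of the proposal.

```cpp
#include <atomic>
#include <string>

// Placeholder for the agent configuration type (assumed, not from the issue).
struct Configuration {
    int interval = 60; // Hypothetical scan interval, in seconds
};

struct IModule {
    virtual ~IModule() = default;
    virtual void run() = 0;
    virtual void stop() = 0;
    virtual int setup(const Configuration& config) = 0;
    virtual std::string command(const std::string& query) = 0;
    virtual std::string name() const = 0;
};

// Hypothetical module: every behaviour is written by the module itself.
class InventoryModule : public IModule {
public:
    void run() override { m_running = true; }
    void stop() override { m_running = false; }

    int setup(const Configuration& config) override {
        if (config.interval <= 0) {
            return -1; // Reject an invalid configuration
        }
        m_interval = config.interval;
        return 0;
    }

    std::string command(const std::string& query) override {
        // Only a single illustrative query is understood here.
        if (query == "status") {
            return m_running ? "running" : "stopped";
        }
        return "unknown query: " + query;
    }

    std::string name() const override { return "inventory"; }

private:
    std::atomic<bool> m_running{false};
    int m_interval = 0;
};
```

Every additional module would repeat this kind of start, stop and query plumbing, which is where the maintenance cost of this approach comes from.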

2nd approach - A rigid module class

Another approach is to implement all of the modules' behaviour in the parent class:

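wmodule.hpp:

```cpp
#include <iostream>
#include <string>
#include <thread>
#include <nlohmann/json.hpp>

struct wm_context; // Module context, defined elsewhere

template <typename ConfigType>
class Wmodule {
public:
    ConfigType* data; // Data (module-dependent structure)

    // Constructor
    Wmodule(std::string name, const wm_context& ctx, ConfigType* d) : data(d), name(name) {}

    // Destructor
    virtual ~Wmodule() { destroy(data); }

    void start();                              // Main function
    nlohmann::json dump(const ConfigType*);    // Dump current configuration
    int sync(const std::string&);              // Sync (probable return value can or must be replaced by expectations?)
    void stop(ConfigType*);                    // Module stop
    void query(ConfigType*, nlohmann::json&);  // Run a query

private:
    std::string name;    // Name for module
    std::thread thread;  // Thread
    void destroy(ConfigType* data); // Is it necessary?

protected:
    const std::string& getName() const;
};
```

wmodule.cpp:

```cpp
#include "wmodule.hpp"

// Wmodule is a template, so these definitions must be visible wherever it is
// instantiated (for example, by moving them into wmodule.hpp).

template <typename ConfigType>
void Wmodule<ConfigType>::start() {
    std::cout << "Start: " + name + " -> Called" << std::endl;
}

template <typename ConfigType>
int Wmodule<ConfigType>::sync(const std::string& s) {
    std::cout << "Synch: " + name + " -> Called" << std::endl;
    return 0;
}

template <typename ConfigType>
void Wmodule<ConfigType>::stop(ConfigType* c) {
    std::cout << "Stop: " + name + " -> Called" << std::endl;
}

template <typename ConfigType>
void Wmodule<ConfigType>::query(ConfigType* c, nlohmann::json& j) {
    std::cout << "Query: " + name + " -> Called" << std::endl;
}

template <typename ConfigType>
void Wmodule<ConfigType>::destroy(ConfigType* data) {
    std::cout << "Destroy: " + name + " -> Called" << std::endl;
}

template <typename ConfigType>
const std::string& Wmodule<ConfigType>::getName() const {
    return name;
}
```

main.cpp:

```cpp
#include <iostream>
#include "wmodule.hpp"

struct wm_context {};   // Placeholder context for the example
struct DummyConfig {};  // Placeholder configuration for the example

// Derived class
class DummyModule : public Wmodule<DummyConfig> {
public:
    using Wmodule<DummyConfig>::Wmodule; // Inherit the base constructor
};

int main() {
    wm_context ctx;
    DummyConfig config;
    DummyModule dummy("dummy", ctx, &config);

    dummy.start();
    dummy.stop(&config);

    return 0;
}
```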

3rd approach - A flexible and extendable module class

A mixed solution is also possible, where the parent class implements the default behaviour but provides the means to extend each method in case a module needs it:

wmodule.hpp:

```cpp
#include <iostream>
#include <string>
#include <thread>
#include <nlohmann/json.hpp>

struct wm_context; // Module context, defined elsewhere

template <typename ConfigType>
class Wmodule {
public:
    ConfigType* data; // Data (module-dependent structure)

    // Constructor
    Wmodule(std::string name, const wm_context& ctx, ConfigType* d) : data(d), name(name) {}

    // Destructor
    virtual ~Wmodule() { destroy(data); }

    void start();                              // Main function
    nlohmann::json dump(const ConfigType*);    // Dump current configuration
    int sync(const std::string&);              // Sync (probable return value can or must be replaced by expectations?)
    void stop(ConfigType*);                    // Module stop
    void query(ConfigType*, nlohmann::json&);  // Run a query

private:
    std::string name;    // Name for module
    std::thread thread;  // Thread
    void destroy(ConfigType* data); // Is it necessary?

    void p_start();                              // Generic module start-up
    int p_synch(const std::string&);             // Generic module sync
    void p_stop(ConfigType*);                    // Generic module stop
    void p_query(ConfigType*, nlohmann::json&);  // Generic query

    // Custom specific extensions
    virtual int v_synch(const std::string&);
    virtual void v_start();
    virtual void v_stop(ConfigType*);
    virtual void v_query(ConfigType*, nlohmann::json&);

protected:
    const std::string& getName() const;
};
```

wmodule.cpp:

```cpp
#include "wmodule.hpp"

// Wmodule is a template, so these definitions must be visible wherever it is
// instantiated (for example, by moving them into wmodule.hpp).

template <typename ConfigType>
void Wmodule<ConfigType>::start() {
    p_start();
    v_start();
}

template <typename ConfigType>
int Wmodule<ConfigType>::sync(const std::string& s) {
    const int generic = p_synch(s);
    const int extension = v_synch(s);
    return generic != 0 ? generic : extension;
}

template <typename ConfigType>
void Wmodule<ConfigType>::stop(ConfigType* c) {
    p_stop(c);
    v_stop(c);
}

template <typename ConfigType>
void Wmodule<ConfigType>::query(ConfigType* c, nlohmann::json& j) {
    p_query(c, j);
    v_query(c, j);
}

template <typename ConfigType>
void Wmodule<ConfigType>::p_start() {
    std::cout << "Start: " + name + " -> Called" << std::endl;
}

template <typename ConfigType>
void Wmodule<ConfigType>::v_start() {
    std::cout << "Start: " + name + " -> no extension was used." << std::endl;
}

template <typename ConfigType>
int Wmodule<ConfigType>::p_synch(const std::string& s) {
    std::cout << "Synch: " + name + " -> Called" << std::endl;
    return 0;
}

template <typename ConfigType>
int Wmodule<ConfigType>::v_synch(const std::string& s) {
    std::cout << "Synch: " + name + " -> no extension was used." << std::endl;
    return 0;
}

template <typename ConfigType>
void Wmodule<ConfigType>::p_stop(ConfigType* c) {
    std::cout << "Stop: " + name + " -> Called" << std::endl;
}

template <typename ConfigType>
void Wmodule<ConfigType>::v_stop(ConfigType* c) {
    std::cout << "Stop: " + name + " -> no extension was used." << std::endl;
}

template <typename ConfigType>
void Wmodule<ConfigType>::p_query(ConfigType* c, nlohmann::json& j) {
    std::cout << "Query: " + name + " -> Called" << std::endl;
}

template <typename ConfigType>
void Wmodule<ConfigType>::v_query(ConfigType* c, nlohmann::json& j) {
    std::cout << "Query: " + name + " -> no extension was used." << std::endl;
}

template <typename ConfigType>
void Wmodule<ConfigType>::destroy(ConfigType* data) {
}

template <typename ConfigType>
const std::string& Wmodule<ConfigType>::getName() const {
    return name;
}
```

main.cpp:

```cpp
#include <iostream>
#include "wmodule.hpp"

struct wm_context {};   // Placeholder context for the example
struct DummyConfig {};  // Placeholder configuration for the example

// DummyModule1 will not use any specific implementation
class DummyModule1 : public Wmodule<DummyConfig> {
public:
    using Wmodule<DummyConfig>::Wmodule; // Inherit the base constructor
};

// Derived class DummyModule2
class DummyModule2 : public Wmodule<DummyConfig> {
public:
    using Wmodule<DummyConfig>::Wmodule; // Inherit the base constructor

protected:
    // Extension of the generic stop behaviour
    void v_stop(DummyConfig* c) override {
        std::cout << "Stop: " + getName() + " -> extension used!" << std::endl;
    }
};

int main() {
    wm_context ctx;
    DummyConfig config;
    DummyModule1 dummy1("dummy1", ctx, &config);
    DummyModule2 dummy2("dummy2", ctx, &config);

    dummy1.start();
    dummy2.start();

    dummy1.stop(&config);
    dummy2.stop(&config);

    return 0;
}
```

Although the extension is applied after the generic behaviour in this example, both pre & post extensions could be implemented, if necessary.
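
A pre- and post-extension variant could look roughly like the sketch below. It is simplified on purpose (no template parameter, only `stop`), and `Module`, `FlushingModule`, `preStop`, `postStop` and the in-memory log are hypothetical names used to make the call order observable.

```cpp
#include <string>
#include <utility>
#include <vector>

// Simplified base class: the public method fixes the order
// pre-extension -> generic behaviour -> post-extension.
class Module {
public:
    explicit Module(std::string name) : m_name(std::move(name)) {}
    virtual ~Module() = default;

    void stop() {
        preStop();
        record("generic stop");
        postStop();
    }

    const std::vector<std::string>& log() const { return m_log; }

protected:
    // Helper for extensions: appends an entry prefixed with the module name.
    void record(const std::string& entry) { m_log.push_back(m_name + ": " + entry); }

private:
    // Default extensions do nothing; modules override only what they need.
    virtual void preStop() {}
    virtual void postStop() {}

    std::string m_name;
    std::vector<std::string> m_log;
};

// Hypothetical module that only needs a pre-extension.
class FlushingModule : public Module {
public:
    using Module::Module; // Inherit the base constructor

private:
    void preStop() override { record("flush pending data"); }
};
```

Calling `stop()` on a `FlushingModule` runs the flush step first and the generic step second, while a plain `Module` runs only the generic step.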

Comparison

Although the 1st approach is much simpler at first, it delegates its implementation (and therefore its complexity) to every module that will be implemented. This gives a great deal of flexibility to a module implementation but effectively increases maintenance efforts.

The second approach improves maintenance effort requirements but at the expense of dramatically reducing flexibility. It removes all implementations from modules, not allowing the implementation of specific behaviours.

The third approach tries to address the limitations of both the 1st and 2nd approaches by allowing generic behaviours to be extended with module-specific ones:

| Feature | Approach 1 | Approach 2 | Approach 3 |
|---|---|---|---|
| Simplicity | High | Medium | Medium |
| Flexibility | High | Low | High |
| Maintenance | Harder | Medium | Lower |
| Manageable | Limited | Simple | Simple |

nbertoldo commented 3 days ago

Based on this, I have created the base class Modules and a first iteration of the Inventory module, adapting the example to the structure of the new repository and using a wrapper and concepts.