dnsmichi commented 5 years ago

Story

TL;DR - performance analysis and debugging on the throughput of checks. Scroll down for problems, technical analysis and solutions.

Issues

ref/NC/628417 ref/NC/625049

Our CIB's current_concurrent_checks are wrong #7416
File descriptors are leaked to child processes which makes SELinux unhappy #2847
Improve process execution helper handling #7332
Consider migrating to Boost fibers #7228
Resetting SIGPIPE for checkcommand child processes #6912

Problems

Performance

The multi-plexed unix socket pair with pipe streams and event FD polling introduces several locks with concurrent check runs waiting for each other. Also the throughput with the thread pool is blocked with this.

7421 exposes this with current_pending_callbacks as metric.

7416 discusses a problem where the actual running checks are different to what's actually executed as processes. Whenever the thread pool is locked up, no more additional checks can be executed. This is different to the actual running processes though.

In theory, the throughput of executed processes must be faster than 1000 per second for instance.

Too many open files

Whenever MaxConcurrentChecks is raised to a higher limit than the system's open file descriptors, the process pipes may exceed this limit.

The following fork() calls then resulted in errors with too many open files preventing the fork thus returning into a critical check result.

This can be solved with raising the number of open files within Systemd and Icinga 2.

Late Check Results

The initial check did not happen when scheduled, the thread pool blocked or no ressources where available. This is seen inside the check latency.
The check result from the finished handler is not processed inside the thread pool. This hinders the call to ProcessCheckResult() which also increases the latency with SetScheduleEnd() not being set then.

Technical Details

Check Execution Flow

Check Scheduler

The checker component works as check scheduler, each checkable object is stored with its next execution time.

Once a check should be executed, there's additional guards in place to ensure that not all 10000 checks run in the same second for instance (MaxConcurrentChecks). At this time, the scheduler waits up until the remaining running processes finish and free up slots.

The check task is posted into the application's thread pool, and the scheduler continues with other checkables.

lib/checker/checkercomponent.cpp

Utility::QueueAsyncCallback(std::bind(&CheckerComponent::ExecuteCheckHelper, CheckerComponent::Ptr(this), checkable));

Global Task Thread Pool

Once the thread pool works on the task, the registered callback function CheckerComponent::ExecuteCheckHelper is executed.

lib/base/threadpool.hpp

                        boost::asio::post(*m_Pool, [this, callback]() {
...
                                try {
                                        callback();
                                } catch (const std::exception& ex) {
...
                                }
                        });

The callback function actually runs the check and instruments the scheduler upon finishing later.

lib/checker/checkercomponent.cpp

                checkable->ExecuteCheck();

Asynchronous Check Execution

ExecuteCheck prepares the check result and also decides whether to run the check locally, or via command endpoint as cluster message. The latter involves resolving the macros with a simulated check run first. This is not scope of this workflow though.

A local check is fired based on the check_command attribute and its registered execution method.

lib/icinga/checkable-check.cpp

                GetCheckCommand()->Execute(this, cr, nullptr, false);

Generic Task Execution Functions

The following requires knowledge about the default templates and also execute methods for commands. Older versions of Icinga required to import the plugin-check-command template which has hidden the following from the user:

execute = PluginCheck

Newer versions of Icinga have that as default and for example a CheckCommand already knows that it will execute a PluginCheck task.

There are a few other tasks registered, e.g. for built-in memory checks like icinga, cluster-zone, etc. These built-in check command register different tasks than PluginCheck. This can be seen in lib/methods/methods-itl.conf which is compiled into the binary.

The CheckCommand Execute() method gets the execute attribute and calls the registered Invoke() interface. This allows to have multiple execution types with a single implementation (OOP).

lib/icinga/checkcommand.cpp

        GetExecute()->Invoke({

PluginCheck Task

The PluginCheck task has a script function registered. Each method is put into the lib/methods library which is highly extendable (for developers) with the simple interface used here.

lib//methods/pluginchecktask.cpp

REGISTER_FUNCTION_NONCONST(Internal, PluginCheck,  &PluginCheckTask::ScriptFunc, "checkable:cr:resolvedMacros:useResolvedMacros");

The macro takes care of fetching the ScriptFunction from the global context, being PluginCheck in this case. The registered callback function PluginCheckTask::ScriptFunc is then invoked on call.

lib/methods/pluginchecktask.cpp

void PluginCheckTask::ScriptFunc(const Checkable::Ptr& checkable, const CheckResult::Ptr& cr,
        const Dictionary::Ptr& resolvedMacros, bool useResolvedMacros)

A few preparations are taking place before check execution:

Create a macro resolver list for this specific context. If it is a host check, only add the current host object. For service checks, host and service objects are added. Plus, one may have defined default values inside the CheckCommand too.
Evaluate which check timeout value to use: Either from the CheckCommand.timeout value, or overridden from the host/service.check_timeout attribute

Then again, it fires a generic implementation for executing a command.

lib/methods/pluginchecktask.cpp

        PluginUtility::ExecuteCommand(commandObj, checkable, checkable->GetLastCheckResult(),
                resolvers, resolvedMacros, useResolvedMacros, timeout,
                std::bind(&PluginCheckTask::ProcessFinishedHandler, checkable, cr, _1, _2));

Note the registered callback function ProcessFinishedHandler which later takes care on creating the final check result with output, status from exit code, etc.

Prepare and Execute the Command

The function interface has quite a few things to pass which are needed for check execution.

lib/icinga/pluginutility.cpp

void PluginUtility::ExecuteCommand(const Command::Ptr& commandObj, const Checkable::Ptr& checkable,
        const CheckResult::Ptr& cr, const MacroProcessor::ResolverList& macroResolvers,
        const Dictionary::Ptr& resolvedMacros, bool useResolvedMacros, int timeout,
        const std::function<void(const Value& commandLine, const ProcessResult&)>& callback)

Now the following happens:

Fetch the command array
Fetch the command arguments, if any
Resolve all macros and create the full command line as array of elements.

                command = MacroProcessor::ResolveArguments(raw_command, raw_arguments,
                        macroResolvers, cr, resolvedMacros, useResolvedMacros);

If that fails, the error is captured as output and returned as UNKNOWN check result to the caller.

Resolve the environment variables specified in env.
Prepare the command line and adjust process priority (nice(5)) and timeout.

        Process::Ptr process = new Process(Process::PrepareCommand(command), envMacros);

        process->SetTimeout(timeout);
        process->SetAdjustPriority(true);

Then the final process is being run. This re-binds the callback function to parse the returned output and exit state later on.

        process->Run(std::bind(callback, command, _1));

Enter the Process rabbit hole

The previous steps are all required to support passing check command parameters via custom variables from the DSL. The macro resolver also takes care about running the lambda functions at runtime for instance.

Now that the command line is in full flux, we continue. The following parts reflect the implementation in Icinga 2.10.

lib/base/process.cpp

void Process::Run(const std::function<void(const ProcessResult&)>& callback)

At this stage, Linux and Windows differ a bit. The following description focuses on Linux/Unix where we support running the check scheduler in high performance environments, Windows are just for low energy agent checks.

Threads are initialized once, if they haven't been before
The process result is populated with the execution start metric

lib/base/process.cpp

        m_Result.ExecutionStart = Utility::GetTime();

Create a pipe for mapping the file descriptors (stdin, stdout, stderr). This ensures that the main process can read the plugin output immediately.
Spawn a new process with the helper

lib/base/process.cpp

        m_Process = ProcessSpawn(m_Arguments, m_ExtraEnvironment, m_AdjustPriority, fds);

The returned process is stored for later asynchronous event processing with reading the output from the plugin process. This is based on a limited number of IO threads, 8 in 2.10.

        int tid = GetTID();

        {
                boost::mutex::scoped_lock lock(l_ProcessMutex[tid]);
                l_Processes[tid][m_Process] = this;
#ifndef _WIN32
                l_FDs[tid][m_FD] = m_Process;
#endif /* _WIN32 */
        }

Process Spawn Helper

This is the secondary process visible with pidof. It listens on a Unix socket pair for event messages from the main process.

It gets created via StartSpawnProcessHelper as a forked process, and starts listening in the ProcessHelper() function as endless loop.

Decodes the received JSON message and fires specific processes:

                if (command == "spawn")
                        response = ProcessSpawnImpl(&msg, request);
                else if (command == "waitpid")
                        response = ProcessWaitPIDImpl(&msg, request);
                else if (command == "kill")
                        response = ProcessKillImpl(&msg, request);
                else
                        response = Empty;

spawn executes a command, waitpid waits until the command process is finished, and kill ensures that plugin timeouts are correctly dealt with.

The response is JSON-encoded and sent back via unix socket pair.

                String jresponse = JsonEncode(response);

                if (send(l_ProcessControlFD, jresponse.CStr(), jresponse.GetLength(), 0) < 0) {

Process Spawn

This creates a spawn command which instruments the process spawn helper to create a new process for this plugin execution.

        do {
                while (sendmsg(l_ProcessControlFD, &msg, 0) < 0) {
                        StartSpawnProcessHelper();
                }
        } while (send(l_ProcessControlFD, jrequest.CStr(), jrequest.GetLength(), 0) < 0);

l_ProcessControlFD links to the Unix socket pair descriptor which is used for communicating with the process spawn helper. This sequence also holds the pipe for stdout/stderr for the plugin's process.

Actually spawn a process and execute something

This is done by the helper process in ProcessSpawnImpl().

The environment is populated from passed `env´
argv and argc are compiled from the command array.
LC_NUMERIC=C is set to ensure that commas are real commas

Then the child process is forked.

        pid_t pid = fork();

        int errorCode = 0;

        if (pid < 0)
                errorCode = errno;

        if (pid == 0) {
                // child process

Its priority is niced, if existing.

The actual execution happens in the built-in icinga2_execvpe which uses an optimized vfork() implementation trimmed for best performance.

                if (icinga2_execvpe(argv[0], argv, envp) < 0) {
                        char errmsg[512];
                        strcpy(errmsg, "execvpe(");
                        strncat(errmsg, argv[0], sizeof(errmsg) - strlen(errmsg) - 1);
                        strncat(errmsg, ") failed", sizeof(errmsg) - strlen(errmsg) - 1);
                        errmsg[sizeof(errmsg) - 1] = '\0';
                        perror(errmsg);
                        _exit(128);
                }

The output is written directly on the FD which is mapped as pipe to the main process. You can see that with lsof -p <mainprocessPID>.

The errno and pid is returned.

The Async Process Event Loop

Again in the main process in void Process::IOThreadProc(int tid) - this gets spawned during the startup of Run(), just once.

Note

Now it is getting wild.

There are several registered IO threads, namely 4. All of them listen on the registered poll FDs from the Unix socket pair.

                        pfds[0].fd = l_EventFDs[tid][0];
                        pfds[0].events = POLLIN;
                        pfds[0].revents = 0;

Whenever poll signals a new event ...

                int rc = poll(pfds, count, timeout);

... the current IO thread reads.

                        if (pfds[0].revents & (POLLIN | POLLHUP | POLLERR)) {
                                char buffer[512];
                                if (read(l_EventFDs[tid][0], buffer, sizeof(buffer)) < 0)
                                        Log(LogCritical, "base", "Read from event FD failed.");
                        }

And once it has read enough, it calls DoEvents.

                                if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR) || is_timeout) {

                                        if (!it->second->DoEvents()) {

DoEvents aka do something

Check whether the process runs for too long - terminate and kill it based on the timeout value
Read data from the piped FD as m_OutputStream

                        int rc = read(m_FD, buffer, sizeof(buffer));

Note: This is the pipe created in the beginning and wrapped into the process spawn helper.

Calculate the exit code
Create the result
Run the callback which is the ProcessFinishedHandler - again, asynchronously via thread pool

        m_Result.PID = m_PID;
        m_Result.ExecutionEnd = Utility::GetTime();
        m_Result.ExitStatus = exitcode;
        m_Result.Output = output;

        if (m_Callback)
                Utility::QueueAsyncCallback(std::bind(m_Callback, m_Result));

Process the check result

Back to lib/methods/pluginchecktask.cpp -

void PluginCheckTask::ProcessFinishedHandler(const Checkable::Ptr& checkable, const CheckResult::Ptr& cr, const Value& commandLine, const ProcessResult& pr)

This reads everything from the process result and creates a new check result being processed. Since this ran in the thread pool asynchronously, it may occur that it took quite a while on heavily stressed systems.

        String output = pr.Output.Trim();

        std::pair<String, String> co = PluginUtility::ParseCheckOutput(output);
        cr->SetCommand(commandLine);
        cr->SetOutput(co.first);
        cr->SetPerformanceData(PluginUtility::SplitPerfdata(co.second));
        cr->SetState(PluginUtility::ExitStatusToState(pr.ExitStatus));
        cr->SetExitStatus(pr.ExitStatus);
        cr->SetExecutionStart(pr.ExecutionStart);
        cr->SetExecutionEnd(pr.ExecutionEnd);

        checkable->ProcessCheckResult(cr);

At this point the whole check execution thingy is done. Magic, isn't it?

Changes

MaxConcurrentChecks

Limits the currently running processes, defaults to 512.

Sandboxed processes

2.6 introduced a process spawn helper which executes the checks in an isolated environment. This was to ensure that no file descriptors from the main process leak into the plugin's process.

This can be

IDO connection sockets
Log file handles
Replay log file handles
Runtime config file handles (in the event of creating a new object)

This was implemented for solving #2847.

Solutions

More IO threads?
More than one Process spawn helper?
Reduce the locks
Get rid of the spawn helper and close leaking FDs on our own
Re-implement the process execution with Boost.Process/ASIO or something else
Evaluate embedded perl (not exactly as part of this issue, just a reminder for a technical sprint)

dnsmichi commented 5 years ago

Calculate average service latency

var l = 0; var c = 0; for (s in get_objects(Service)) { if (s.last_check_result) { var ex_delta = s.last_check_result.execution_end - s.last_check_result.execution_start; var s_delta = s.last_check_result.schedule_end - s.last_check_result.schedule_start; var latency = s_delta - ex_delta; l += latency; c += 1 } }; var avg_latency = l / c; avg_latency

Example for 1000 services, check_interval = 10s calling a perl plugin which sleeps for 10s.

<112> => var res = {}; var l = 0; var c = 0; for (s in get_objects(Service)) { if (s.last_check_result) { var ex_delta = s.last_check_result.execution_end - s.last_check_result.execution_start; var s_delta = s.last_check_result.schedule_end - s.last_check_result.schedule_start; var latency = s_delta - ex_delta; l += latency; c += 1 } }; var avg_latency = l / c; avg_latency
0.012186
<113> => var res = {}; var l = 0; var c = 0; for (s in get_objects(Service)) { if (s.last_check_result) { var ex_delta = s.last_check_result.execution_end - s.last_check_result.execution_start; var s_delta = s.last_check_result.schedule_end - s.last_check_result.schedule_start; var latency = s_delta - ex_delta; l += latency; c += 1 } }; var avg_latency = l / c; avg_latency
0.012526
<114> => var res = {}; var l = 0; var c = 0; for (s in get_objects(Service)) { if (s.last_check_result) { var ex_delta = s.last_check_result.execution_end - s.last_check_result.execution_start; var s_delta = s.last_check_result.schedule_end - s.last_check_result.schedule_start; var latency = s_delta - ex_delta; l += latency; c += 1 } }; var avg_latency = l / c; avg_latency
0.013776
<115> => var res = {}; var l = 0; var c = 0; for (s in get_objects(Service)) { if (s.last_check_result) { var ex_delta = s.last_check_result.execution_end - s.last_check_result.execution_start; var s_delta = s.last_check_result.schedule_end - s.last_check_result.schedule_start; var latency = s_delta - ex_delta; l += latency; c += 1 } }; var avg_latency = l / c; avg_latency
0.013695
<116> => var res = {}; var l = 0; var c = 0; for (s in get_objects(Service)) { if (s.last_check_result) { var ex_delta = s.last_check_result.execution_end - s.last_check_result.execution_start; var s_delta = s.last_check_result.schedule_end - s.last_check_result.schedule_start; var latency = s_delta - ex_delta; l += latency; c += 1 } }; var avg_latency = l / c; avg_latency
0.013738

Get late service check results

get_objects(Service).filter(s => s.last_check < DateTime().value - 2 * s.check_interval).map(s => s.__name).len()

dnsmichi commented 5 years ago

7428 adds a CheckCommand object for the ITL for better performance analysis.

Things tried thus far

Multiple process & IO threads - no real benefit
Locking inside ProcessSpawnHelper - no real change
Moving the process fork into the main process again - real decrease in performance, Thread-Pool tasks wait 300s or more in tests.

Open discussion

Boost Fibers - affects the entire core
Independent checker schedulers & processes

dnsmichi commented 5 years ago

@Al2Klimov Can you please add the findings from your Boost fiber analysis?

Al2Klimov commented 5 years ago

dnsmichi commented 5 years ago

Summary CW 35

From a technical view point, the structure how checks are executed and running is just fine. With the analysis above, specific things have become more clear:

The open pipe handles are needed to immediately stream the plugin's output from the child execution process to the parent process waiting for the final execution. Otherwise this would be sent over the unix socket pair becoming even more a bottleneck. So this technical design is the best you can get in this situation.

Latency

Whenever a check is executed, a task is inserted into the global tread pool. If the pool hits the maximum number of threads (by concurrency = cpu core count), the remaining tasks are in a waiting list. 2.11 adds the pending callbacks as additional metric to inspect whether the TP is overloaded.
Whenever a check has finished, the result is put as asynchronous task into the global thread pool. If the TP is overloaded, the CR task has to wait.

If one or two of the above are true, check latency will increase. schedule_start is set when the check scheduler effectively inserts the task into the TP, schedule_end is set upon processing the check result.

Possible root causes

Plugin execution time.

If you run 1000 check_nwc_health in parallel, with a memory footprint of 3.3 mb for each Perl plugin, this takes a while to start, compile, execute and finish. Since this also includes SNMP tree iterations, such checks may be slower than the average ping.

If ping checks always run into a timeout, the default of 10 or 30 second applies. During this time the process is "stalled" in idle wait for ping to return. If you have 1000 tasks in the TP, and 1000 are pings which run into a timeout, you have the same problem as with a HTTP API or JSON-RPC server - it just won't accept new connections, or, in this case, it won't execute any more tasks.

Given that you may execute 200k services in a 5 to 10 minute interval, that's 20 to 40k per minute worst case. Let's stick with an average of 30k per minute, which renders this to

30000 / 60 = 500 per second

Granted that the max threads in the pool follow the CPU cores 2, resulting in e.g. 96 2 possible threads working on tasks, the general throughput depends on the check execution tasks to be really fast (< 1s). With the aforementioned timeouts, this is impossible to reach.

Threads vs CPU power

If you have many CPU cores, say 32, this allows for more threads in the thread pool. If those CPU cores are low end with say 800mhz, you won't win price compared to 4 cores with 3ghz. That gets even worse if context switches between threads become slower.

Check rate

If you would run 30k service checks per minute, this creates at least one cluster message for each check result being returned from the satellite to the master (and also the secondary satellite). The memory footprint of the CR mainly depends on the plugin output and performance data, and may range from a few bytes to many KBs too.

If you're not on a low profile bandwidth, this shouldn't harm the network layers to just send over the TCP packages resulting in JSON-RPC messages. Still, the messages are wrapped into JSON which needs to be encoded/decoded on each end. Also, when receiving such a message, it needs to be processed in order.

That being said, next to the checker's thread pool being overloaded, the cluster may be overloaded as well. You can measure this by extracting the metric from /v1/status/ApiListener inside the json_rpc struct. This also is returned with the icinga built-in check. If that's on a steep curve and doesn't lower itself over time, you really have a performance issue.

You can mitigate this even further with analysing the system's performance on context switching, IO and thread consuming for the process - e.g. with the added sar_perf CheckCommand in #7428.

Check Scheduler

Actually, no. Stop blaming the scheduler for everything. It isn't an AI nor does it provide machine learning capabilities. Though it has some intelligent scheduling inside to prevent problematic situations like:

A (re)start/reload causing high load because all checks would run in the same interval and second. There's a safety mechanism in place which spreads the check execution over the first minutes.

Users tend to claim that check results are old after firing a reload, but this is just a spread check execution and won't be visible again after running for some time.

Spreading the check execution, or being influenced by latency from schedule_end/start is controlled by the scheduler in trying to re-establish the expected check interval and to always produce expectable check execution times. In some cases, the included offsets may not be enough for long running plugins (see the timeout and running time remarks above). Then it looks like as if there would be 10k checks running in the same second.

With solving the plugin execution time and making things fast and efficient, the core's scheduler will perform as intended. This also is documented, and if you have a better implementation, feel free to propose this in a detailed concept.

https://icinga.com/docs/icinga2/snapshot/doc/19-technical-concepts/#check-scheduler

Ideas like learning from check execution time and putting them into a low latency group or defined check group are just fine, but need to prove themselves not to cause any other harm.

Also, these ideas are not for a standalone core as we had with 1.x but are related to HA enabled zones and in addition, multiple levels of a distributed environment.

Max Concurrent Checks

The actually running checks differ from what is really possible. The MaxConcurrentChecks setting also takes the workload of the thread pool into account.

It doesn't make sense to put something into the full TP, rather checks are delayed. This also increases the check latency. Therefore this setting may not be reached at all if the underlaying performance suffers.

Conclusion

Users should do the following:

Analyse the environment for possible performance bottlenecks
- https://icinga.com/docs/icinga2/snapshot/doc/15-troubleshooting/#analyse-your-linuxunix-environment
- https://icinga.com/docs/icinga2/snapshot/doc/08-advanced-topics/#monitoring-icinga-2
Identify why checks would always run into timeouts. Find a better implementation for this kind, either by reducing the timeout, or creating checks which do not block the process for quite a long time.
Scale horizontally. Don't try to put 200k services onto a satellite zone with 2 satellites. Create multiple satellite zones sharing the overall load.

Developers will do the following for future releases after 2.11:

Evaluate how plugin execution can be improved
- A standalone embedded Perl PoC exists, yet no benchmarks nor integration into the core. This is too risky for 2.11 anyways.
- Embedded Python was in an experimental state before 2.0.0 was released, this knowledge should be used in this region as well.
Evaluate whether we have the chance to use something different than a global thread pool for executing check tasks. Boost Fibers turned out not possible in a short PoC, this needs more time and resources without any release pressure.
Evaluate standalone check processes

We've agreed on removing this task from 2.11, since the performance compared to older versions does not suffer. One performance problem in an aforementioned environment was caused with the downtime loop which we have mitigated and fixed already.

Therefore I am putting this into 2.12, as ongoing task.

Note

There is absolutely no guarantee that this specific issue gets resolved in this version. We will "fork" specific work tasks from this tracking issue, and carefully decide which problems to tackle and which feature to implement. The issue's state might not always be current too.

dnsmichi commented 5 years ago

Number of Open Files for Pipes

Set the Systemd limits

mkdir /etc/systemd/system/icinga2.service.d
cat >/etc/systemd/system/icinga2.service.d/limits.conf <<EOF
[Service]
LimitNOFILE=50000
LimitNPROC=50000
TasksMax=infinity
EOF

Modify Icinga's RLIMIT

Debian:

vim /etc/default/icinga2

RHEL

vim /etc/sysconfig/icinga2

Content:

ICINGA2_RLIMIT_FILES=50000

Reload Systemd and Icinga

systemctl daemon-reload
systemctl restart icinga2

Verify

for p in $(pidof icinga2); do echo -e "$p\n" && ps -ef | grep $p && echo && cat /proc/$p/limits | grep 'open files' && echo; done

for p in $(pidof icinga2); do echo -e "$p\n" && ps -ef | grep $p && echo && lsof -p $p && echo; done

Icinga / icinga2

Analyse check execution performance #7425

Story

Issues

Problems

Performance

7421 exposes this with current_pending_callbacks as metric.

7416 discusses a problem where the actual running checks are different to what's actually executed as processes. Whenever the thread pool is locked up, no more additional checks can be executed. This is different to the actual running processes though.

Too many open files

Late Check Results

Technical Details

Check Execution Flow

Check Scheduler

Global Task Thread Pool

Asynchronous Check Execution

Generic Task Execution Functions

PluginCheck Task

Prepare and Execute the Command

Enter the Process rabbit hole

Process Spawn Helper

Process Spawn

Actually spawn a process and execute something

The Async Process Event Loop

DoEvents aka do something

Process the check result

Changes

MaxConcurrentChecks

Sandboxed processes

Solutions

Calculate average service latency

Get late service check results

7428 adds a CheckCommand object for the ITL for better performance analysis.

Things tried thus far

Open discussion

Summary CW 35

Latency

Possible root causes

Plugin execution time.

Threads vs CPU power

Check rate

Check Scheduler

Max Concurrent Checks

Conclusion

Number of Open Files for Pipes

Set the Systemd limits

Modify Icinga's RLIMIT

Reload Systemd and Icinga

Verify