Cacti / cacti

Cacti ™
http://www.cacti.net
GNU General Public License v2.0

Increasing the number of processes to a critical level in version 1.2.27 #5748

Closed Ponomarenko50 closed 2 weeks ago

Ponomarenko50 commented 1 month ago

Describe the bug

After upgrading from version 1.2.25 to version 1.2.27, the number of running processes on the system began to increase gradually (see the screenshots). The same problem was also observed in version 1.2.26.


Screenshots

CactiProcesses-v.1.2.27.png: https://github.com/Cacti/cacti/assets/169735569/77bb8c2d-b86b-4f38-aa0b-aee99005cb42

CactiMemoryUsage-v.1.2.27.png: https://github.com/Cacti/cacti/assets/169735569/32a1323a-8749-4853-824e-b34012a63da5

About the screenshots: Cacti version 1.2.25 was running until 20:30. At 20:30, version 1.2.27 was installed.

Desktop (please complete the following information):

OS: Ubuntu Server 18.04.4 LTS

Browser: Mozilla Firefox

Cacti version 1.2.27, spine version 1.2.27, MySQL version 5.7.42-0ubuntu0.18.04.1

bmfmancini commented 1 month ago

Do you have a list of which processes are sticking around?

For example, you can run:

ps -ef | grep php
ps -ef | grep spine

Are there any errors in the Cacti log?

We will need this information to point to a cause.
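If the raw `ps` listing is too long to read, a per-command tally makes the growing process easier to spot. This is only a rough sketch: the helper name `count_by_command` is made up for illustration and is not part of Cacti or Spine.

```shell
# Tally command names read from stdin (one per line), most frequent first.
count_by_command() {
    sort | uniq -c | sort -rn
}

# Show the ten most numerous commands on this host, if ps is available.
if command -v ps >/dev/null 2>&1; then
    ps -eo comm= | count_by_command | head -n 10
fi
```

Running it a few minutes apart and comparing the counts for php and spine should show which one keeps accumulating.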

Ponomarenko50 commented 1 month ago

Hello.


Ponomarenko50 commented 1 month ago

The server crashed, the database was destroyed, and there is no way to log in. The server has been restored to version 1.2.25.

Ponomarenko50 commented 1 month ago

ps_php.log ps_spine.log

Ponomarenko50 commented 1 month ago

The Cacti log contains these messages:

....
14/May/2024 16:24:02 - SYSTEM MAINT STATS: Time:0.01
14/May/2024 16:24:02 - POLLER: Poller[1] PID[23679] WARNING: There are 1 processes detected as overrunning a polling cycle, please investigate
14/May/2024 16:24:02 - SYSTEM WARNING: Primary Admin account notifications disabled! Unable to send administrative Email.
14/May/2024 16:25:01 - POLLER: Poller[1] PID[23679] Maximum runtime of 58 seconds exceeded. Exiting.
14/May/2024 16:25:01 - SYSTEM WARNING: Primary Admin account notifications disabled! Unable to send administrative Email.
14/May/2024 16:25:01 - SYSTEM STATS: Time:59.1008 Method:spine Processes:25 Threads:20 Hosts:282 HostsPerProcess:12 DataSources:6270 RRDsProcessed:0
14/May/2024 16:25:02 - POLLER: Poller[1] PID[26200] WARNING: There are 1 processes detected as overrunning a polling cycle, please investigate
...

Increasing the number of processes/threads (25/25) made no difference. With version 1.2.25 installed, there were no such messages.
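To see how often the poller overruns, the SYSTEM STATS lines can be filtered by their Time value. A minimal sketch, assuming the log format shown above; the function name `slow_polls` and the log path are assumptions for illustration, so adjust the path to wherever your Cacti log lives.

```shell
# Print every "SYSTEM STATS" line (read from stdin) whose Time: value
# exceeds the given limit in seconds. "SYSTEM MAINT STATS" lines are ignored.
slow_polls() {
    awk -v limit="$1" '
        / SYSTEM STATS: / {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^Time:/) {
                    split($i, kv, ":")
                    if (kv[2] + 0 > limit + 0) print
                    next
                }
            }
        }'
}

# Hypothetical log location; 58 is the maximum runtime reported in the log.
CACTI_LOG=/var/www/html/cacti/log/cacti.log
if [ -r "$CACTI_LOG" ]; then
    slow_polls 58 < "$CACTI_LOG"
fi
```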

Ponomarenko50 commented 1 month ago

cacti_stderr.log

TheWitness commented 1 month ago

That's a lot of processes for the database to handle. My system has significantly more than 10k hosts and I run with only 5 processes and 30 threads. My database server has 112 threads to keep pace. Your settings are overloading the system.

How many database server cores do you have? How many web server cores?

With the poller shut down, run spine in debug mode as follows and post the results:

./spine -V 3 -S -R

Ponomarenko50 commented 1 month ago

Hello. Cacti itself suggested increasing the number of threads (see the log above). I increased it, but nothing changed. How many threads per host, and how many processes/threads in total (for Spine), do you recommend setting?

My server is a regular computer with a 4-core processor and 8 GB of RAM.

I replaced Spine 1.2.27 with Spine 1.2.25 (Cacti itself is still at 1.2.27). The problem of the growing number of processes has gone away (see the screenshots), and there are no errors in the Cacti logs or in cacti_stderr.log.

cacti_stderr.log does contain these messages:

...
Replacing action 3
Replacing action 4
Replacing action 6
Replacing action 7
Replacing action 2
...

Is this fine?

CactiProcesses-v.1.2.27-2 CactiMemoryUsage-v.1.2.27-2

Ponomarenko50 commented 1 month ago

Good afternoon. I can't provide the full spine log because it contains a lot of confidential information. There are no error messages in the log file; at the final stage, these messages are displayed:

Total[24.4489] Device[606] HT[2] DS[11922] TT[998.40] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.1, value: 783007776
Total[24.4489] Device[606] HT[2] DS[11922] TT[998.41] SNMP: v2: 192.168.98.94, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.1, value: 44954243
Total[24.4489] Device[606] HT[2] DS[11923] TT[998.42] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.2, value: 53318538
Total[24.4489] Device[606] HT[2] DS[11923] TT[998.43] SNMP: v2: 192.168.98.94, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.2, value: 800089969
Total[24.4489] Device[606] HT[2] DS[11924] TT[998.44] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.7, value: 85059471
Total[24.4490] Device[606] HT[2] DS[11924] TT[998.46] SNMP: v2: 192.168.98.94, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.7, value: 1579091582
Total[24.4490] Device[606] HT[2] DS[11925] TT[998.47] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.8, value: 36938751
Total[24.4490] Device[606] HT[2] Total Time: 4.3 Seconds
corrupted size vs. prev_size
2024/05/17 15:38:57 - FATAL: Spine Interrupted by Abort Signal
Aborted (core dumped)
xxxxxx@cacti_server:/usr/local/spine/bin# Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe^C

These messages are duplicated in the cacti_stderr.log log file.

TheWitness commented 1 month ago

Can you provide the entire output of the spine run? What are your MariaDB/MySQL values for max_connections and Max_used_connections?

SHOW GLOBAL VARIABLES LIKE 'max_connections';
SHOW GLOBAL STATUS LIKE 'max_used_connections';

Those "Replacing action X" messages do not appear to be in mainline Cacti. Can you look into that?

cd /var/www/html/cacti
grep -r "Replacing"

Ponomarenko50 commented 1 month ago

Hello.

mysql> SHOW GLOBAL VARIABLES LIKE 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 850   |
+-----------------+-------+
1 row in set (0,00 sec)

mysql> SHOW GLOBAL STATUS LIKE 'max_used_connections';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 550   |
+----------------------+-------+
1 row in set (0,00 sec)

The "Replacing action X" message is not in Cacti; it is in Spine (only in version 1.2.25), in the error.c file. The problem of the growing number of connections is not in Cacti but in Spine: under certain conditions, it does not close its connections.
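For a rough sense of scale, the poller settings can be compared against the database limits quoted above. This sketch assumes, as a simplification not confirmed anywhere in this thread, that every Spine thread in every process may hold its own database connection; the function names are made up for illustration.

```shell
# Upper-bound estimate of simultaneous Spine database connections,
# ASSUMING one connection per thread per process (a simplification).
spine_conn_estimate() {
    echo $(( $1 * $2 ))
}

# Compare the estimate for <processes> <threads> against a connection limit.
check_headroom() {
    estimate=$(spine_conn_estimate "$1" "$2")
    if [ "$estimate" -ge "$3" ]; then
        echo "over: estimated $estimate connections, limit $3"
    else
        echo "ok: estimated $estimate connections, limit $3"
    fi
}

check_headroom 25 20 850   # settings from the SYSTEM STATS line
check_headroom 25 25 850   # settings after the increase
```

Under that assumption, 25 processes with 20 threads gives 500 and 25 with 25 gives 625, the same order of magnitude as the Max_used_connections value of 550. Both stay below max_connections, which fits the observation that connections pile up over time instead of hitting the limit at once.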

TheWitness commented 1 month ago

You should sanitize it and then send a log to TheWitness at Cacti dot net.
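One possible way to sanitize the output is to mask the device addresses before sending it. This is only a sketch: the helper name `mask_ipv4` and the output file name are hypothetical, it covers IPv4 addresses only, and anything else you consider confidential (hostnames, community strings) would still need separate handling.

```shell
# Mask IPv4 addresses in text read from stdin.
# The pattern requires a non-digit, non-dot character (or line start/end) on
# both sides, so dotted SNMP OIDs such as .1.3.6.1.2.1.2.2.1.10.1 are left alone.
mask_ipv4() {
    pattern='(^|[^0-9.])([0-9]{1,3}\.){3}[0-9]{1,3}([^0-9.]|$)'
    # Two passes: a match consumes its trailing separator, so a second address
    # right after a single separator is only caught on the next pass.
    sed -E -e "s/$pattern/\1x.x.x.x\3/g" -e "s/$pattern/\1x.x.x.x\3/g"
}

# Example (not run here): capture the debug run and mask it in one step.
#   ./spine -V 3 -S -R 2>&1 | mask_ipv4 > spine_sanitized.log
```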

TheWitness commented 2 weeks ago

No feedback