graphite-project / carbon

Carbon is one of the components of Graphite, and is responsible for receiving metrics over the network and writing them down to disk using a storage backend.
http://graphite.readthedocs.org/
Apache License 2.0

[BUG] /var/log/carbon/console.log file is flooded with aggregation metric messages #932

Closed Zorrom closed 2 years ago

Zorrom commented 2 years ago

Please report bugs only for the carbon component here; if you want to do so for Graphite, please use the graphite-web repo.

Describe the bug

The /var/log/carbon/console.log file is flooded with the errors below:


21/04/2022 12:25:21 :: Couldn't match metric project.quantum.product.vm02.memory.memory-free with any aggregation rule. Passing on un-aggregated.
21/04/2022 12:25:21 :: Couldn't match metric project.quantum.product.vm02.memory.memory-slab_unrecl with any aggregation rule. Passing on un-aggregated.
21/04/2022 12:25:21 :: Couldn't match metric project.quantum.product.vm02.memory.memory-slab_recl with any aggregation rule. Passing on un-aggregated.
21/04/2022 12:25:21 :: Couldn't match metric project.quantum.product.vm02.memory.memory-used with any aggregation rule. Passing on un-aggregated.
21/04/2022 12:25:21 :: Couldn't match metric project.quantum.product.vm02.memory.memory-buffered with any aggregation rule. Passing on un-aggregated.
21/04/2022 12:25:21 :: Couldn't match metric project.quantum.product.vm02.tcpconns-all.tcp_connections-SYN_SENT with any aggregation rule. Passing on un-aggregated.

Messages like this are flooding our console log at a very high rate, and we would like to know the cause of these errors. The flooding starts as soon as we start our carbon-aggregator service.

To Reproduce

  1. Install carbon, graphite-web, and Python 3.9.
  2. Start the carbon-aggregator service and watch the messages above flood the console.log file.

Expected behavior

  1. /var/log/carbon/console.log should not be flooded with messages caused by aggregation rule misses

Environment (please complete the following information):

  1. carbon - Version: 1.1.8
  2. graphite-web - Version: 1.1.8

Additional context

Below are my storage-* conf files. Please note that these configs are working fine on our CentOS systems running Python 2.

/etc/carbon/storage-aggregation.conf

[root@vm02 carbon]# cat storage-aggregation.conf
[min]
pattern = \.lower$
xFilesFactor = 0.1
aggregationMethod = min

[max]
pattern = \.upper$
xFilesFactor = 0.1
aggregationMethod = max

[sum]
pattern = \.sum$
xFilesFactor = 0
aggregationMethod = sum

[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum

[count_legacy]
pattern = ^stats_counts.*
xFilesFactor = 0
aggregationMethod = sum

[default_average]
pattern = .*
xFilesFactor = 0.3
aggregationMethod = average

/etc/carbon/storage-schemas.conf

[root@pcrfclient01 carbon]# cat storage-schemas.conf 
# Schema definitions for Whisper files. Entries are scanned in order,
# and first match wins. This file is scanned for changes every 60 seconds.
#
#  [name]
#  pattern = regex
#  retentions = timePerPoint:timeToStore, timePerPoint:timeToStore, ...

# Carbon's internal metrics. This entry should match what is specified in
# CARBON_METRIC_PREFIX and CARBON_METRIC_INTERVAL settings
[carbon]
pattern = ^carbon\.
retentions = 60s:90d

[andsf-locations]
# catch all ANDSF Location stats
pattern = Andsf_Request_Loc(IN|OUT) 
retentions = 10s:1d,60s:60d

[default]
pattern = .*
retentions = 10s:1d,60s:90d

deniszh commented 2 years ago

This behaviour is controlled by the LOG_AGGREGATOR_MISSES parameter in carbon.conf. By default it's True; uncomment it, set it to False, and restart carbon.
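
A minimal sketch of that change, assuming the parameter is placed under the aggregator's section of carbon.conf (the exact placement in your file may differ):

[aggregator]
# Suppress the "Couldn't match metric ... with any aggregation rule" log lines
LOG_AGGREGATOR_MISSES = False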

Zorrom commented 2 years ago

@deniszh Thanks for your help. I added LOG_AGGREGATOR_MISSES = False in the following section of carbon.conf and the flooding stopped:

[aggregator]
USER = carbon
LINE_RECEIVER_INTERFACE = 127.0.0.1
LINE_RECEIVER_PORT = 2023

PICKLE_RECEIVER_INTERFACE = 127.0.0.1
PICKLE_RECEIVER_PORT = 2024

# Set to false to disable logging of successful connections
LOG_LISTENER_CONNECTIONS = True
LOG_AGGREGATOR_MISSES = False

But please confirm whether adding this under [aggregator] is a proper fix, and also let me know whether setting LOG_AGGREGATOR_MISSES = False is safe for our environment.

Once again thank you very much for your help :)

deniszh commented 2 years ago

LOG_AGGREGATOR_MISSES has global scope, so it's fine to keep it as is.
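
Note also that the storage-aggregation.conf you posted controls how whisper rolls up datapoints inside the database files; carbon-aggregator matches incoming metric names against a separate file, aggregation-rules.conf. If you ever want those metrics aggregated rather than just passed through, you would add a matching rule there instead of silencing the log. A hypothetical entry (the output name and the 60-second frequency are illustrative, not taken from your setup) could look like:

# Aggregate per-VM memory metrics into a product-wide average every 60 seconds
project.quantum.product.all.memory.<metric> (60) = avg project.quantum.product.*.memory.<metric>

With a rule like that in place, those metric names would match and the miss message would no longer be logged for them.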

Zorrom commented 2 years ago

@deniszh Thanks for the clarification; our system looks fine. I'll close this issue.