Open satwikkansal opened 1 week ago
logfire.instrument_system_metrics must only be called once. It sets up a loop in a background thread which exports metrics every 60 seconds, and once at the end of the process. The only reason to add your own loop is to keep the process alive if it's doing nothing else, e.g.:
import time
import logfire

logfire.instrument_system_metrics()
while True:
    time.sleep(60)
I want to have a separate process altogether to monitor system-wide metrics
I don't know if you really need this as opposed to just also exporting system-wide metrics from your main application processes. But if you do, then the two calls to logfire.instrument_system_metrics will be in separate processes, so there won't be a problem. If you have a process whose only job is to report system-wide metrics, then it's not really useful to measure its own process metrics.
If you want to instrument both process and system metrics within a single process, then call instrument_system_metrics once with a single dict combining both.
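A minimal sketch of the "single combined dict" approach, using the metric names from the code posted in this thread. The helper names combine_metric_configs and start_metrics are made up for illustration and are not part of logfire; the only logfire calls used are logfire.configure() and logfire.instrument_system_metrics(config, base=None), as they appear in the thread.

```python
def combine_metric_configs(*configs):
    """Merge metric config dicts, refusing keys that would be silently overwritten."""
    combined = {}
    for config in configs:
        for name, fields in config.items():
            if name in combined and combined[name] != fields:
                raise ValueError(f"conflicting config for metric {name!r}")
            combined[name] = fields
    return combined


# Metric names and fields copied from the code posted in this thread.
system_metrics = {
    'system.cpu.simple_utilization': None,
    'system.memory.utilization': ['available', 'used'],
    'system.disk.io': ['read', 'write'],
    'system.network.io': ['transmit', 'receive'],
    'system.swap.utilization': ['used'],
}
process_metrics = {
    'process.runtime.cpu.utilization': None,
    'process.runtime.memory': ['rss', 'vms'],
    'process.runtime.thread_count': None,
    'process.open_file_descriptor.count': None,
}

all_metrics = combine_metric_configs(system_metrics, process_metrics)


def start_metrics(config):
    """Hypothetical wrapper: configure logfire and instrument exactly once."""
    import logfire  # imported here so the merge helper works without logfire installed

    logfire.configure()
    logfire.instrument_system_metrics(config, base=None)
```

Calling start_metrics(all_metrics) once at startup replaces the two separate instrument_system_metrics calls.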
Thanks!
Any ideas about the errors below?
Traceback (most recent call last):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 136, in callback
for api_measurement in callback(callback_options):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/instrumentation/system_metrics/__init__.py", line 629, in _get_system_network_io
for metric in self._config["system.network.dropped.packets"]:
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'system.network.dropped.packets'
Callback failed for instrument system.swap.utilization.
Traceback (most recent call last):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 136, in callback
for api_measurement in callback(callback_options):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/instrumentation/system_metrics/__init__.py", line 500, in _get_system_swap_utilization
for metric in self._config["system.swap.utilization"]:
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'system.swap.utilization'
Callback failed for instrument system.disk.io.
Traceback (most recent call last):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 136, in callback
for api_measurement in callback(callback_options):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/instrumentation/system_metrics/__init__.py", line 517, in _get_system_disk_io
for metric in self._config["system.disk.io"]:
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
KeyError: 'system.disk.io'
Callback failed for instrument system.network.io.
I still get them
Thanks!
Any ideas about the errors below
Traceback (most recent call last):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 136, in callback
for api_measurement in callback(callback_options):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/instrumentation/system_metrics/__init__.py", line 629, in _get_system_network_io
for metric in self._config["system.network.dropped.packets"]:
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'system.network.dropped.packets'
Reported https://github.com/open-telemetry/opentelemetry-python-contrib/issues/3005
Callback failed for instrument system.swap.utilization.
Traceback (most recent call last):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 136, in callback
for api_measurement in callback(callback_options):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/instrumentation/system_metrics/__init__.py", line 500, in _get_system_swap_utilization
for metric in self._config["system.swap.utilization"]:
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'system.swap.utilization'
Callback failed for instrument system.disk.io.
Traceback (most recent call last):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 136, in callback
for api_measurement in callback(callback_options):
File "/Users/satwik/code/freelance/ongoing/cq/loee/venv/lib/python3.11/site-packages/opentelemetry/instrumentation/system_metrics/__init__.py", line 517, in _get_system_disk_io
for metric in self._config["system.disk.io"]:
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
KeyError: 'system.disk.io'
Callback failed for instrument system.network.io.
This is not the same kind of mismatch. I can't reproduce these errors if I only call instrument_system_metrics once. What code did you run?
Added a docs label for us to make it clearer that instrument_system_metrics should only be called once.
import logfire
from dotenv import load_dotenv
import time

load_dotenv()

# System-wide metrics (monitors entire system)
system_metrics = {
    # CPU metrics for whole system
    'system.cpu.simple_utilization': None,
    # System memory usage
    'system.memory.utilization': ['available', 'used'],
    # Disk I/O for all processes
    'system.disk.io': ['read', 'write'],
    # Network I/O for all processes
    'system.network.io': ['transmit', 'receive'],
    # System swap usage
    'system.swap.utilization': ['used']
}

# Process-specific metrics (only for your Python application)
process_metrics = {
    # CPU usage of this Python process
    'process.runtime.cpu.utilization': None,
    # Memory usage of this Python process
    'process.runtime.memory': ['rss', 'vms'],
    # Thread count of this Python process
    'process.runtime.thread_count': None,
    # File descriptors opened by this process
    'process.open_file_descriptor.count': None
}

logfire.configure()
logfire.instrument_system_metrics(system_metrics, base=None)
# logfire.instrument_system_metrics(process_metrics, base=None)

while True:
    # needed to keep the process alive
    time.sleep(60)
This is my code. You probably have to wait a couple of minutes for the errors to start showing up.
Operating system: I'm using macOS 14.1.1, M1 chipset
Logfire version: Tried on both 1.01 and 2.3.0
That only gives me KeyError: 'system.network.dropped.packets'
Yes, you're correct. I might have been instrumenting both system_metrics and process_metrics, thinking they're mutually exclusive. If I call instrument_system_metrics just once, the only error is KeyError: 'system.network.dropped.packets'.
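The remaining traceback shows a function named _get_system_network_io indexing the config with the key 'system.network.dropped.packets'. The sketch below mimics only that lookup pattern to show why a config containing just 'system.network.io' would fail this way. NetworkIoCallbackSketch is a stand-in written for illustration, not the library's actual code.

```python
class NetworkIoCallbackSketch:
    """Stand-in for a metrics callback; NOT the real opentelemetry class."""

    def __init__(self, config):
        self._config = config

    def observe_network_io(self):
        # Mirrors the lookup shown in the traceback: the network I/O callback
        # reads the "dropped packets" key instead of "system.network.io".
        return list(self._config["system.network.dropped.packets"])


# Same network entry as in the config posted above.
config = {"system.network.io": ["transmit", "receive"]}

missing_key = None
try:
    NetworkIoCallbackSketch(config).observe_network_io()
except KeyError as exc:
    missing_key = exc.args[0]

print(f"KeyError: {missing_key!r}")  # prints: KeyError: 'system.network.dropped.packets'
```

Under this reading the error does not depend on how many times instrument_system_metrics is called, which matches it still appearing with a single call.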
Question
I want to use logfire to push some system metrics as well as process metrics. However, it feels like the documentation could be more complete.
Looking at the documentation, I came up with this code
My goal was
base=None argument. Is the above the right way to do so?
While running this code, I get a couple of issues
I believe this is occurring because of the while loop, but then again, if I don't have the while loop, the script just starts and exits, and all I see on my dashboard is a single data point.
Am I missing any step or doing something incorrectly?
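On the while loop: since metrics are exported from a background thread, the main thread only needs to stay alive. One possible variation on the bare time.sleep loop is to wait on an event that a signal handler sets, so the script can return normally on Ctrl+C or SIGTERM. This is a sketch; wait_until_stopped and install_stop_handlers are hypothetical helper names, and the idea that a normal return triggers the final end-of-process export is an assumption, not something confirmed in this thread.

```python
import signal
import threading


def wait_until_stopped(stop_event: threading.Event, poll_seconds: float = 60.0) -> int:
    """Block until stop_event is set; return how many idle polls elapsed."""
    polls = 0
    # Event.wait returns False on timeout and True once the event is set.
    while not stop_event.wait(timeout=poll_seconds):
        polls += 1
    return polls


def install_stop_handlers(stop_event: threading.Event) -> None:
    """Set stop_event on Ctrl+C or SIGTERM (call this from the main thread)."""

    def _handle(signum, frame):
        stop_event.set()

    signal.signal(signal.SIGINT, _handle)
    signal.signal(signal.SIGTERM, _handle)
```

In the monitoring script this would replace the while True loop: create a threading.Event, pass it to install_stop_handlers, call instrument_system_metrics once, then call wait_until_stopped with the same event.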