InvalidArgumentException in monitor-aurora-with-grafana Lambda

syndic8-joe commented 10 months ago

I implemented the monitor-aurora-with-grafana script as described without issue (all 4 steps in the CloudFormation stack succeeded); however, each Lambda invocation logs this error:

[ERROR] InvalidArgumentException: An error occurred (InvalidArgumentException) when calling the GetResourceMetrics operation: This group is not a known group: db.application is not valid for current resource
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 33, in lambda_handler
    pi_response = get_db_resource_metrics(instance)
  File "/var/task/lambda_function.py", line 78, in get_db_resource_metrics
    response = pi_client.get_resource_metrics(
  File "/var/runtime/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/runtime/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)

and no metrics are published to CloudWatch.

I have confirmed the region is correct, and we have multiple databases in this region with Performance Insights enabled.

kshammai commented 10 months ago

I managed to resolve it by removing the invalid groups from dbSliceGroup (line 25 of sandbox/monitor-aurora-with-grafana/function/lambda_function.py).

In my scenario, I made the following change:

Originally:

dbSliceGroup = { "db.sql_tokenized", "db.application", "db.wait_event", "db.user", "db.session_type", "db.host", "db", "db.application" }

Changed to:

dbSliceGroup = { "db.wait_event", "db.user", "db.host", "db" }
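
A more defensive variant of this workaround, as a minimal sketch: rather than hard-coding which groups to drop, try each group and skip the ones the API rejects. The helper name collect_group_metrics and the fetch callable are hypothetical and not part of the repository's lambda_function.py; fetch stands in for whatever wraps pi_client.get_resource_metrics for a single group. The sketch assumes the raised exception carries a botocore-style response["Error"]["Code"].

```python
def collect_group_metrics(groups, fetch):
    """Call fetch(group) for each group, skipping groups the API rejects.

    Hypothetical helper: `fetch` is assumed to wrap
    pi_client.get_resource_metrics for one dimension group and to raise a
    botocore-style error whose .response["Error"]["Code"] names the failure.
    """
    results, skipped = {}, []
    for group in sorted(groups):  # sorted only to make the order deterministic
        try:
            results[group] = fetch(group)
        except Exception as exc:
            error = getattr(exc, "response", {}).get("Error", {})
            if error.get("Code") != "InvalidArgumentException":
                raise  # unrelated failures should still surface
            skipped.append(group)  # group not valid for this engine
    return results, skipped
```

Logging the skipped list on each invocation would show which groups the current engine does not accept, without stopping metrics for the remaining groups.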

LorenzoRogai commented 6 months ago

This happens because the AWS guide is written for Aurora PostgreSQL. We have a MySQL cluster and hit the same error; the workaround above works fine. You can, however, add the "db.sql_tokenized" group back. It is recognized correctly.
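
Putting the two comments together, a minimal sketch of choosing the groups per engine. The function name slice_groups_for and both constants are hypothetical, and the group lists come only from this thread, not from AWS documentation: "db.application" is the group named in the error, while "db.session_type" is assumed unsupported on MySQL only because the workaround above removed it. The engine strings are assumed to look like the RDS Engine field ("aurora-mysql", "aurora-postgresql").

```python
# Groups from the original dbSliceGroup in lambda_function.py (per this thread).
POSTGRESQL_GROUPS = {
    "db.sql_tokenized", "db.application", "db.wait_event",
    "db.user", "db.session_type", "db.host", "db",
}

# Assumption drawn from this thread: groups to drop on a MySQL cluster.
MYSQL_UNSUPPORTED = {"db.application", "db.session_type"}


def slice_groups_for(engine):
    """Return the dimension groups to query for the given engine string."""
    if "mysql" in engine.lower():
        return POSTGRESQL_GROUPS - MYSQL_UNSUPPORTED
    return set(POSTGRESQL_GROUPS)  # copy so callers cannot mutate the constant
```

For example, slice_groups_for("aurora-mysql") keeps "db.sql_tokenized" but drops "db.application" and "db.session_type".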

aws-observability / observability-best-practices

InvalidArgumentException in monitor-aurora-with-grafana Lambda #108