grafana / grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
https://grafana.com
GNU Affero General Public License v3.0

Sigv4 auth in Grafana Tempo `metricsGenerator` config doesn't work with `role_arn` #94734

Open ABHINAV-SUREKA opened 5 days ago

ABHINAV-SUREKA commented 5 days ago

What happened?

In the Grafana Tempo Helm chart, under `metricsGenerator.config.storage.remote_write`, if I provide the following:

      storage:
        ...
        remote_write:
        - ...
          sigv4:
            region: eu-west-1
            role_arn: arn:aws:iam::<account_id>:role/<some_role_to_be_assumed>
          url: https://aps-workspaces.eu-west-1.amazonaws.com/workspaces/<amp_workspace_id>/api/v1/remote_write
          ...

I get the error log:

caller=dedupe.go:112 component=remote count=900 err="server returned HTTP status 403 Forbidden: {\"message\":\"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.\\n\\nThe Canonical String for this request should have been\\n'POST\\n/workspaces/<amp_workspace_id>/api/v1/remote_write\\n\\ncontent-encoding:snappy\\ncontent-type:application/x-protobuf\\nhost:aps-workspaces.eu-west-1.amazonaws.com\\nx-amz-date:20241015T143721Z\\nx-amz-security-token:IQoJb3JX2VjEJf//////////wEastMI++sRzRymjZ/IRxIcAmTU2y+RpFmUOvrIMfYa3WiiTSqpAgjw//////////8BEAEaDDYyNTNj3VpbbLv1n+T/OdnUuShWu4C6rojiGmh/lfvgvzhMcOt1a7VTqoW3BmXG1+uA6brESIPJxk+9qoVi9lIxhM6/xkeFVhuhj6Dx14XzGLWDklTs8dw4ptGQs4JOsteb5V9jEO/xtea4wsCAZcIdb6ZhEM3T+WGQC752tybyJZoDV" exemplarCount=0 level=error msg="non-recoverable error" pod=tempo-metrics-generator-xyz-abc tenant=default url=https://aps-workspaces.eu-west-1.amazonaws.com/workspaces/<amp_workspace_id>/api/v1/remote_write

It seems that the grafana/tempo metrics generator calculates the signature from the access key and related credentials, but signing fails when only `role_arn` is provided.

Note: This doesn't seem to be an access issue; otherwise the error would have been something like "the role doesn't have permission to assume the provided `role_arn`".
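For anyone comparing the "Canonical String" in the 403 response against what the client signed, here is a minimal Go sketch of how a SigV4 canonical request is laid out. It is a simplified illustration of the publicly documented format, not the code Tempo or Prometheus uses. The function name `canonicalRequest`, the workspace ID, and the token value are placeholders, and the sketch assumes the path and query are already encoded and each header has a single value.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// canonicalRequest builds a simplified SigV4 canonical request.
// Assumptions of this sketch: path and query are already canonical,
// and every header carries exactly one value.
func canonicalRequest(method, path, query string, headers map[string]string, payload []byte) string {
	// Header names are lowercased, values trimmed, and names sorted.
	names := make([]string, 0, len(headers))
	lower := make(map[string]string, len(headers))
	for k, v := range headers {
		n := strings.ToLower(k)
		names = append(names, n)
		lower[n] = strings.TrimSpace(v)
	}
	sort.Strings(names)

	// Each canonical header line ends with a newline.
	var canonHeaders strings.Builder
	for _, n := range names {
		canonHeaders.WriteString(n + ":" + lower[n] + "\n")
	}

	// The last element is the hex-encoded SHA-256 of the request body.
	sum := sha256.Sum256(payload)
	return strings.Join([]string{
		method,
		path,
		query,
		canonHeaders.String(),
		strings.Join(names, ";"),
		hex.EncodeToString(sum[:]),
	}, "\n")
}

func main() {
	// Placeholder values; the session token is present only when
	// temporary credentials (for example from an assumed role) are used.
	headers := map[string]string{
		"Host":                 "aps-workspaces.eu-west-1.amazonaws.com",
		"Content-Encoding":     "snappy",
		"Content-Type":         "application/x-protobuf",
		"X-Amz-Date":           "20241015T143721Z",
		"X-Amz-Security-Token": "EXAMPLE-SESSION-TOKEN",
	}
	fmt.Println(canonicalRequest("POST",
		"/workspaces/ws-example/api/v1/remote_write", "", headers, []byte("example body")))
}
```

If the header list or payload hash the server reports differs from what the client computed, the signature will not match even when the credentials themselves are valid.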

What did you expect to happen?

The metrics generator should have written metrics to the provided AMP backend endpoint without any errors.

Did this work before?

No. Testing this out for the first time.

How do we reproduce it?

  1. Update `metricsGenerator.config.storage.remote_write` with the sigv4 configuration shown above.

Is the bug inside a dashboard panel?

No response

Environment (with versions)?

Grafana Tempo: IMAGE_NAME = grafana/tempo IMAGE_VERSION = 2.3.1

Grafana platform?

Kubernetes

Datasource(s)?

No response

tolzhabayev commented 3 days ago

This issue would probably be better placed at https://github.com/grafana/tempo/issues. FYI @grafana/tempo

joe-elliott commented 2 days ago

That config is a Prometheus remote write config:

https://github.com/grafana/tempo/blob/67be243b999adbd572fd85e29eb0ab39a0377f95/modules/generator/storage/config.go#L26

We do some manipulation of it here, but I don't think anything that would impact your auth config:

https://github.com/grafana/tempo/blob/67be243b999adbd572fd85e29eb0ab39a0377f95/modules/generator/storage/instance.go#L93C23-L93C55

The function just adds headers and makes other adjustments based on per tenant configurations:

https://github.com/grafana/tempo/blob/67be243b999adbd572fd85e29eb0ab39a0377f95/modules/generator/storage/config_util.go#L16
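To illustrate why a per-tenant header adjustment like the one described above would not be expected to touch auth settings, here is a rough Go sketch. It is not Tempo's code: `RemoteWriteConfig`, `SigV4Settings`, `withTenantHeaders`, and the header name are all hypothetical stand-ins chosen for this example.

```go
package main

import "fmt"

// SigV4Settings and RemoteWriteConfig are hypothetical stand-ins for the
// real Prometheus remote write config types; only the shape matters here.
type SigV4Settings struct {
	Region  string
	RoleARN string
}

type RemoteWriteConfig struct {
	URL     string
	Headers map[string]string
	SigV4   *SigV4Settings
}

// withTenantHeaders returns a copy of base with extra headers merged in.
// The URL and the SigV4 pointer are carried over unchanged, and the
// original Headers map is not mutated.
func withTenantHeaders(base RemoteWriteConfig, extra map[string]string) RemoteWriteConfig {
	out := base // shallow copy keeps URL and SigV4 as they were
	out.Headers = make(map[string]string, len(base.Headers)+len(extra))
	for k, v := range base.Headers {
		out.Headers[k] = v
	}
	for k, v := range extra {
		out.Headers[k] = v
	}
	return out
}

func main() {
	base := RemoteWriteConfig{
		URL:   "https://example.invalid/api/v1/remote_write",
		SigV4: &SigV4Settings{Region: "eu-west-1", RoleARN: "arn:aws:iam::000000000000:role/example"},
	}
	// "X-Example-Tenant" is an illustrative header name, not a real one.
	perTenant := withTenantHeaders(base, map[string]string{"X-Example-Tenant": "tenant-a"})
	fmt.Println(perTenant.Headers, perTenant.SigV4.RoleARN)
}
```

In a flow shaped like this, the `role_arn` setting reaches the remote write client exactly as configured, so any signing problem would have to originate further down the stack.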

Then we feed it into prom remote storage here:

https://github.com/grafana/tempo/blob/67be243b999adbd572fd85e29eb0ab39a0377f95/modules/generator/storage/instance.go#L96

Do we know if Prometheus supports this? It's definitely possible we are doing something that breaks this type of auth, but it seems like we're using the Prometheus remote write code in a fairly straightforward way.