elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

Action monitor/main is unauthorized for API key #580

Closed: bartbroere closed this 6 months ago

bartbroere commented 1 year ago

When using eland with an Elasticsearch instance that I authenticate to with an API key, I often get this error:

elasticsearch.AuthorizationException: AuthorizationException(403, "security_exception", "action [cluster:monitor/main] is unauthorized for API key id [...] of user [...], this action is granted by the cluster privileges [monitor,manage,all]",)

The error is triggered because eland switches some functionality on or off depending on the version of the Elasticsearch server you are connecting to, so it first has to ask the server for its version.

I can work around it in my code by first creating the Elasticsearch client and then setting its _eland_es_version attribute to the version number I'm using.

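import elasticsearch
from eland import DataFrame

# Connect with an API key that lacks the cluster:monitor/main action.
elastic_client = elasticsearch.Elasticsearch(
    "https://omitted.us-central1.gcp.cloud.es.io:443",
    api_key="...",
)

# Set the cached version up front so eland does not have to ask the server.
elastic_client._eland_es_version = (8, 9, 1)

df = DataFrame(elastic_client, "search-*")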

Is there a different way to check the version of the Elasticsearch host, perhaps one that requires fewer permissions?

Here's the code that determines the version number currently: https://github.com/elastic/eland/blob/f14bbaf4b0ed072ce2c74cfb2511c25c3c547cd6/eland/common.py#L318C1-L341
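
For readers who do not want to follow the link: the workaround above relies on the version lookup being cached on the client object. The following is a simplified sketch of that caching pattern, not eland's actual code. `FakeClient`, `cached_es_version`, and the version parsing are stand-ins invented for illustration; only the `_eland_es_version` attribute name comes from this thread.

```python
from typing import Tuple


class FakeClient:
    """Hypothetical stand-in for an Elasticsearch client (not the real class)."""

    def __init__(self) -> None:
        self.info_calls = 0

    def info(self) -> dict:
        # The real client would call the root endpoint "/" here, which needs
        # the cluster:monitor/main action.
        self.info_calls += 1
        return {"version": {"number": "8.9.1"}}


def cached_es_version(client) -> Tuple[int, int, int]:
    """Return the server version, asking the server only if nothing is cached."""
    if not hasattr(client, "_eland_es_version"):
        number = client.info()["version"]["number"]
        major, minor, patch = (int(part) for part in number.split(".")[:3])
        client._eland_es_version = (major, minor, patch)
    return client._eland_es_version


# Without the workaround, the first lookup asks the server; the second is cached.
plain = FakeClient()
first = cached_es_version(plain)
second = cached_es_version(plain)

# With the workaround, the attribute is already set, so info() is never called.
patched = FakeClient()
patched._eland_es_version = (8, 9, 1)
third = cached_es_version(patched)
```

Under this assumed pattern, pre-setting the attribute skips the only request that needs the extra privilege, which would explain why the workaround avoids the 403.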

droberts195 commented 1 year ago

Action cluster:monitor/main corresponds to the root REST endpoint of Elasticsearch, i.e. /.

The clients use this to determine whether they're talking to Elasticsearch.

I think the best solution is to give your API keys permission to call this endpoint. Instead of granting a high-level cluster privilege that would give wide access, you should be able to just add cluster:monitor/main to the cluster privileges you're currently granting. (It's possible to use individual action names in addition to high-level privilege names when creating role definitions.)
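
As a rough illustration of that suggestion, the sketch below builds a role descriptor that lists the individual action name among the cluster privileges. The role name `eland-reader`, the key name, the index pattern, and the index privileges are assumptions chosen for the example, not values taken from this thread.

```python
import json


def build_api_key_request(name: str, index_pattern: str) -> dict:
    """Build keyword arguments for an API key that may also call the root endpoint."""
    return {
        "name": name,
        "role_descriptors": {
            # "eland-reader" is an arbitrary, hypothetical role name.
            "eland-reader": {
                # Individual action name instead of monitor/manage/all.
                "cluster": ["cluster:monitor/main"],
                "indices": [
                    {
                        "names": [index_pattern],
                        # Assumed index privileges; adjust to your needs.
                        "privileges": ["read", "view_index_metadata"],
                    }
                ],
            }
        },
    }


request = build_api_key_request("eland-key", "search-*")
print(json.dumps(request, indent=2))

# With an already authenticated elasticsearch-py 8.x client that is itself
# allowed to manage API keys, the request could be sent like this (untested):
# client.security.create_api_key(**request)
```

Keeping the cluster list to the single action keeps the key's scope narrow, which is the point of using an action name rather than `monitor`.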

bartbroere commented 1 year ago

@droberts195 Thanks for your reply. For an API key, setting a wider scope might be a solution. The first time I encountered this issue, however, was in a corporate environment with an Elasticsearch deployment linked to LDAP. In that case, the user might not be able to grant additional permissions.

In #581 I propose a small change that falls back to a default code path if the server's version cannot be determined, instead of raising the error. Maybe this can be a nice solution.
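
To make the idea concrete, here is a minimal sketch of such a fallback. It is not the code from the pull request: `Forbidden`, `RestrictedClient`, `es_version_or_none`, `choose_code_path`, and the `(8, 0, 0)` threshold are all hypothetical names and values used only to show the shape of the approach.

```python
from typing import Optional, Tuple


class Forbidden(Exception):
    """Hypothetical stand-in for elasticsearch.AuthorizationException."""


class RestrictedClient:
    """Hypothetical client whose credentials may not call the root endpoint."""

    def info(self) -> dict:
        raise Forbidden("action [cluster:monitor/main] is unauthorized")


def es_version_or_none(client) -> Optional[Tuple[int, int, int]]:
    """Return the server version, or None if it cannot be determined."""
    try:
        number = client.info()["version"]["number"]
    except Forbidden:
        return None
    major, minor, patch = (int(part) for part in number.split(".")[:3])
    return (major, minor, patch)


def choose_code_path(client) -> str:
    """Pick a version-specific path, defaulting when the version is unknown."""
    version = es_version_or_none(client)
    if version is None:
        return "default"
    # The threshold below is an invented example of a version switch.
    return "legacy" if version < (8, 0, 0) else "current"


path = choose_code_path(RestrictedClient())
```

The trade-off of this design is that a user with restricted credentials gets the default behaviour silently, so logging a warning in the `except` branch may be worth considering.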