ruflin / Elastica

Elastica is a PHP client for elasticsearch
http://elastica.io/
MIT License

"no permissions for []" for queries against aliases API #2063

Open ThibautSF opened 2 years ago

ThibautSF commented 2 years ago

Hi,

I have an error that I haven't managed to understand (it might not be related to Elastica at all, or maybe I have an issue with a config somewhere...).

Intro

I have the following ES environment:

For PHP:

The issue

Every other query (index creation, indexing, search, etc.) works, but when I try to remove & create aliases I get a 500:

PHP Fatal error: Uncaught Elastica\Exception\ResponseException: no permissions for [] and User [name=elsadmin, backend_roles=[], requestedTenant=null] in vendor\ruflin\elastica\src\Transport\Http.php:178

Note: the elsadmin user shown is an admin with full access to everything.

The client is initialized with this config:

        [
            // Server
            'servers' => [
                [
                    'host' => '<host>',
                    'port' => 443,
                    'transport' => 'Https',
                    'username' => 'elsadmin',
                    'password' => '<password>',
                    'connectTimeout' => 10,
                ],
            ],
            // + some other options related to elastically
        ]

Alias query call

// Remove the alias from whatever index currently holds it,
// then point it at the freshly built index.
$data = ['actions' => []];

$data['actions'][] = ['remove' => ['index' => '*', 'alias' => $indexAlias]];
$data['actions'][] = ['add' => ['index' => $index->getName(), 'alias' => $indexAlias]];

$elasticaClient->request('_aliases', Request::POST, $data);
ruflin commented 2 years ago

I remember AWS OpenSearch had some different auth mechanisms, but since normal operations work, I'm not sure this is related. It seems you don't get much information in the error itself; I would look at the logs of Elasticsearch / OpenSearch to see if you find something there.

Side note: Elastica does not support and is not tested against OpenSearch, but at 7.10 I would expect the APIs to still be mostly aligned.

ThibautSF commented 2 years ago

> I remember AWS OpenSearch had some different auth mechanisms, but since normal operations work, I'm not sure this is related. It seems you don't get much information in the error itself; I would look at the logs of Elasticsearch / OpenSearch to see if you find something there.

I will look into how to activate logs then, because it looks like they aren't on by default on AWS. But here are the strange things I found:

> Side note: Elastica does not support and is not tested against OpenSearch, but at 7.10 I would expect the APIs to still be mostly aligned.

This is exactly why I set up an Elasticsearch 7.10 and not OpenSearch 1.2: I wanted to test my implementation against normal Elasticsearch first (and then, in a second step, try an OpenSearch instance).

ruflin commented 2 years ago

The bulk request issue is odd. I would have expected a different error if the bulk request were too large. Do you have a lot of other traffic on this instance? As you said, the ES logs should help you in this scenario too.

You also note above that elsadmin is a super user, so I don't see how you could run into problems with aliases :-( Have you tried to just run Elasticsearch locally on your machine and run the same code to see what happens?

ThibautSF commented 2 years ago

Still trying to obtain the AWS logs, but that hasn't given anything yet.

> You also note above that elsadmin is a super user, so I don't see how you could run into problems with aliases :-( Have you tried to just run Elasticsearch locally on your machine and run the same code to see what happens?

A local run works fine (although locally I only have HTTP and no users...).

I tried queries with Postman against https://elsadmin:<pass>@<elasticdomain>:443/_aliases

And if I use the "*" wildcard in the remove action, it generates the "no permissions for []" error.

BUT if I use the old index name directly, it works.

So this seems to confirm that the issue is on the AWS side... I also ran tests (several months ago) with an Elastic Cloud 30-day demo instance, and almost the same code (strictly the same code for the alias query part) was working on it.
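For reference, the failing payload sent to POST /_aliases looks like this (it mirrors the PHP snippet earlier in the thread; the index and alias names here are placeholders):

```json
{
  "actions": [
    { "remove": { "index": "*", "alias": "my_alias" } },
    { "add": { "index": "my_new_index", "alias": "my_alias" } }
  ]
}
```

Replacing the "*" in the remove action with the concrete old index name made the same request succeed against the AWS instance.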

> The bulk request issue is odd. I would have expected a different error if the bulk request were too large. Do you have a lot of other traffic on this instance? As you said, the ES logs should help you in this scenario too.

I still need to dig into that part, but even though it's the smallest AWS instance available for OpenSearch (it's only for dev tests), my local setup can handle more documents (in count and byte size) at one time with less RAM and fewer nodes (1GB for 1 node / 1 shard versus 2GB per node for 3 nodes / 3 shards). (Screenshot: the AWS cluster as shown by the ElasticView Firefox addon.)

ThibautSF commented 2 years ago

OK, I managed to find a workaround. After reading more of the _aliases documentation, I suppose the wildcard * was affecting hidden/protected indices (security & co).

Since the implementation just adds the base name of the indices as the alias, I will override the class method in elastically and change the pattern from "*" to "my_index_base_name*".


And this way it works.
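In code, the scoped workaround amounts to constraining the remove pattern to the application's own index family. A minimal sketch (the names below are hypothetical; the final request call is the same as in the snippet at the top of the thread):

```php
// Hypothetical alias / index names; in the real code these come from elastically.
$indexAlias = 'my_index_base_name';
$newIndexName = 'my_index_base_name_2022-05-03'; // e.g. a dated index

$data = ['actions' => []];

// Scope the wildcard to our own index family instead of "*", so hidden /
// protected indices (security & co) are never matched by the remove action.
$data['actions'][] = ['remove' => ['index' => $indexAlias . '*', 'alias' => $indexAlias]];
$data['actions'][] = ['add' => ['index' => $newIndexName, 'alias' => $indexAlias]];

// then, as before: $elasticaClient->request('_aliases', Request::POST, $data);
```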

ruflin commented 2 years ago

Glad you found a workaround!

> I suppose the wildcard * was affecting hidden/protected indices (security & co).

I remember there was a bug in some of the 7.x releases of Elasticsearch. It would be interesting to know if it works with 7.17.

On the bulk request: even though your instances are small and above 80% memory, I would still expect you to be able to ingest more than 5 docs in a bulk. Are these especially large docs?

ThibautSF commented 2 years ago

> I remember there was a bug in some of the 7.x releases of Elasticsearch. It would be interesting to know if it works with 7.17.

Sadly, the AWS OpenSearch service is limited to Elasticsearch 7.10; after that it's OpenSearch 1.2 (or manual cluster creation). But I should be able to try older versions.

> On the bulk request: even though your instances are small and above 80% memory, I would still expect you to be able to ingest more than 5 docs in a bulk. Are these especially large docs?

Those are not large docs; they are docs containing attachment files (pdf, images, excel, etc...). Each bulk is created based on 2 metrics:

In my practical case, the 6 documents sent are below 1MB:

ruflin commented 2 years ago

> Those are not large docs; they are docs containing attachment files (pdf, images, excel, etc...).

Agreed, these are not large documents, but still different from a plain JSON payload. If you don't use the attachments, do larger bulk requests go through?

ThibautSF commented 2 years ago

Hi,

It took some time and a few fixes on my indexing script, but I ran several tests + debug sessions. On each test I reduced my upload max byte size parameter; here is the output for 20MB:

    Flush 1
    Queue size : 1/100
    Queue bytes size : 8586291/20000000
    array(4) { ["took"]=> int(513) ["ingest_took"]=> int(8662) ["errors"]=> bool(false) ["items"]=> array(1) { [0]=> array(1) { ["index"]=> array(9) { ["_index"]=> string(97) "45a88cff4bbf0973e254c6e87c0a971a76d812187f3376aa04eb9d121756b031_eln_pageattach_2022-05-03-111557" ["_type"]=> string(4) "_doc" ["_id"]=> string(64) "59f855d347347c5fc730fed4bce741e255f07ec0d4d5f0d466659c0abc9f25c3" ["_version"]=> int(1) ["result"]=> string(7) "created" ["_shards"]=> array(3) { ["total"]=> int(1) ["successful"]=> int(1) ["failed"]=> int(0) } ["_seq_no"]=> int(2) ["_primary_term"]=> int(1) ["status"]=> int(201) } } } }
    Flush 2
    Queue size : 1/100
    Queue bytes size : 15550231/20000000
    array(1) { ["message"]=> string(28) "429 Too Many Requests /_bulk" }
    Flush 3
    Queue size : 10/100
    Queue bytes size : 16879040/20000000
    array(1) { ["message"]=> string(28) "429 Too Many Requests /_bulk" }

And if I reduce it to 15MB, all requests work, because:

But I need to recheck how my requests are created, because "Flush 2", which is 15.5MB, contains a file of 8.8MB. Base64-encoded, that should take around 11.7MB (a 33% increase). So even if I pack some more data with the file (really basic data like the file name, some ids...), a 3-4MB difference looks huge, so I might be sending some unwanted data.
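The base64 expansion is easy to check: encoding n bytes produces 4 * ceil(n / 3) characters, so the ~8.8MB attachment from "Flush 2" should indeed come out around 11.7MB. A quick sketch:

```php
// base64 maps every 3 input bytes to 4 output characters (padded at the end).
function base64Length(int $rawBytes): int
{
    return 4 * (int) ceil($rawBytes / 3);
}

$fileBytes = 8800000; // the ~8.8MB attachment mentioned above
$encodedBytes = base64Length($fileBytes);

printf(
    "%.1fMB raw -> %.1fMB base64 (+%.0f%%)\n",
    $fileBytes / 1e6,
    $encodedBytes / 1e6,
    100 * ($encodedBytes / $fileBytes - 1)
);
```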

OR it's my method of calculating the bulk size that is wrong...

    getBytesSize($currentBulk->getActions()); // Note: see edit below

    // Rough recursive size estimate: sums the byte size of every leaf value.
    function getBytesSize($arr): int
    {
        $tot = 0;

        if (is_array($arr)) {
            // Recurse into nested arrays (keys are not counted).
            foreach ($arr as $a) {
                $tot += getBytesSize($a);
            }
        } elseif (is_string($arr)) {
            $tot += strlen($arr);
        } elseif (is_int($arr)) {
            $tot += PHP_INT_SIZE;
        } elseif (is_object($arr)) {
            $tot += strlen(serialize($arr));
        }

        return $tot;
    }

EDIT: OK, calling getBytesSize((string) $currentBulk); instead looks to give a much better approximation of the payload size.
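Casting the bulk to a string measures the actual serialized body, which is why it is more accurate: the recursive sum counts only leaf values, while the wire format adds keys, quotes, braces, and the NDJSON action lines (and PHP_INT_SIZE overstates small integers). A minimal illustration with a hypothetical document:

```php
// Hypothetical bulk document: small metadata plus a base64-encoded attachment.
$doc = [
    'filename' => 'report.pdf',
    'id'       => 42,
    'data'     => base64_encode(str_repeat('x', 3000)), // 4000 chars once encoded
];

// What the recursive estimator counts: leaf values only.
$estimate = strlen($doc['filename']) + PHP_INT_SIZE + strlen($doc['data']);

// What actually goes over the wire: the serialized body, including keys,
// quotes, braces, and commas (plus NDJSON action lines in a real _bulk).
$wire = strlen(json_encode($doc));

echo "leaf-value estimate: $estimate bytes, serialized: $wire bytes\n";
```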