elastic / elasticsearch-php

Official PHP client for Elasticsearch.
https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/index.html
MIT License

Strange behaviour #308

Closed manelet closed 7 years ago

manelet commented 8 years ago

I'm trying a simple get and everything works fine, but after I use Sense to run a PUT to update the document, retrieving the document via PHP gives me an empty response.

It's as if it's not synced correctly.

Am I missing something here?

Thanks!

polyfractal commented 8 years ago

Can you share some code and/or a working reproduction demo of the problem you're seeing? It's likely a mismatch between index/type name, or ID. Or the update itself is incorrect and causing an unexpected behavior.

Or if you are executing the requests quickly back-to-back (such as in an integration test) it could be a refresh timing issue.

Hard to say without seeing more, alas :)

manelet commented 8 years ago

Actually I'm just toying a bit with this delicious thing that Elasticsearch is, but what's happening to me feels weird.

Just one query, no integrations, just to play with it a bit.

Basically, this piece of code:

echo '<h1>PHP Wrapper response</h1>';
$params = [
    'index' => 'equipos',
    'type' => 'equipo',
    'id' => '1'
];
$response = $client->get($params);
print_r($response);

echo '<h1>file_get_contents() response</h1>';
print_r(file_get_contents('http://localhost:9200/equipos/equipo/1'));

Returns this:

<h1>PHP Wrapper response</h1>
<h1>file_get_contents() response</h1>{"_index":"equipos","_type":"equipo","_id":"1","_version":28,"found":true,"_source":{
    name: "lalala"
}
}

No errors, no exceptions, nothing. Note that the PHP wrapper returns nothing while the simple file_get_contents returns the document.

Also, if I curl from the command line or fetch it from Sense, the document is returned perfectly.

Sorry for my bad English, writing from Catalonia here ;)
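
The thread does not establish a cause, but one check follows directly from the output above: `file_get_contents()` returns the raw text, whereas a client has to parse it. The sketch below feeds that exact body to PHP's `json_decode()` and reports whether it parses. Treating this as the reason the wrapper prints nothing is an assumption; it only tests whether the body is strictly valid JSON.

```php
<?php
// Body copied from the file_get_contents() output shown above.
$raw = '{"_index":"equipos","_type":"equipo","_id":"1","_version":28,"found":true,"_source":{
    name: "lalala"
}
}';

// json_decode() returns null when the input is not valid JSON,
// and json_last_error() reports why.
$decoded = json_decode($raw, true);

if ($decoded === null && json_last_error() !== JSON_ERROR_NONE) {
    echo 'Decode failed: ' . json_last_error_msg() . PHP_EOL;
} else {
    print_r($decoded);
}
```

With the unquoted `name` key in `_source`, the decode takes the failure branch. `print_r(null)` prints nothing, which would look the same as the empty "PHP Wrapper response" section.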

Thanks in advance for your time!

manelet commented 8 years ago

A cool detail: if I index a document via the PHP wrapper with the same ID, then yes, I can get the newly created document back. If I then update it with a PUT in my Sense console, I can no longer retrieve it.

So weird.

polyfractal commented 8 years ago

Hmm, well that is curious indeed.

A cool detail: if I index a document via the PHP wrapper with the same ID, then yes, I can get the newly created document back. If I then update it with a PUT in my Sense console, I can no longer retrieve it.

Can you paste your Sense command here too? I don't have any theory how it is affecting everything, but maybe I'll see something :)

Sorry for my bad English, writing from Catalonia here ;)

No worries, your English is great! I actually just got home from a vacation in Barcelona, and I can assure you that your English is better than my Catalan :P

manelet commented 8 years ago

Here I go!

What version of ES-PHP are you using?

In composer I use: "elasticsearch/elasticsearch": "~2.0"

Version of PHP?

PHP 5.6.14

Can you show how you are constructing the $client object? Perhaps there is something wrong and the exception is being hidden by Apache/nginx?

Sure. I'm using Slim, a super simple PHP router, and this runs before creating the $app object:

$client = ClientBuilder::create()
    ->setHosts(array('http://localhost:9200'))
    ->build();

Can you paste your Sense command here too? I don't have any theory how it is affecting everything, but maybe I'll see something :)

Of course, I might be doing something wrong, it's just my second day playing with ES:

PUT equipos/equipo/1
{
    name: "Testtttt"
}

To get it,

GET equipos/equipo/1
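
One detail in the PUT body above is that the `name` key is not wrapped in double quotes, so the body is not strictly valid JSON. Whether that is related to the empty responses is not established in this thread. As a sketch, a strictly valid body for the same document can be generated with `json_encode()`; the `$document` array here is a hypothetical stand-in for the payload.

```php
<?php
// Hypothetical payload mirroring the Sense command above.
$document = ['name' => 'Testtttt'];

// json_encode() always emits double-quoted keys.
$body = json_encode($document);
echo $body . PHP_EOL; // {"name":"Testtttt"}

// Round trip: the encoded body decodes back to the same array.
var_dump(json_decode($body, true) === $document); // bool(true)
```

Pasting the generated string as the body of the PUT in Sense gives a request that any strict JSON parser can read back.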

Thanks again! I hope you enjoyed Barcelona, it's awesome!

manelet commented 8 years ago

So....

Now I've created 10 documents with random names via PHP, and I can get any of them. But again, as soon as a document is updated from outside PHP, I can't retrieve it anymore. Meanwhile, the other generated documents can still be retrieved as long as they haven't been updated from outside PHP.

I'll keep digging.

polyfractal commented 8 years ago

I just installed Slim and will play around to see if I can recreate your problem. You're capturing the client object in the closure, right? E.g.

$client = ClientBuilder::create()
    ->setHosts(array('http://localhost:9200'))
    ->build();

$app = new \Slim\Slim();
$app->get('/', function () use ($client) {  // capture $client
    echo "Hello";
    print_r($client->search());
});

manelet commented 8 years ago

If you execute GET equipos/_search do you see all the documents (updated or otherwise) ?

Yes, I can see all the test documents I've been indexing/updating from Sense or the command line. But if I perform a ->search() from PHP I get an empty result.

Have you changed any of the index settings, like the refresh interval?

Nope

What version of ES itself are you using?

I believe it's ## Release 2.0.2 since it's in the first line of changelog.md (sorry dunno how to get the version)

I just installed Slim and will play around to see if I can recreate your problem. You're capturing the client object in the closure, right? E.g.

You are completely right, that's how I do it.

It's weird, because if I print_r the $response variable in Client.php, line 157, in ->get() function, just before the return I get a Guzzle object with a 200 response, but without the content. I paste it here:

    [waitfn:GuzzleHttp\Ring\Future\FutureArray:private] => Array
        (
            [0] => GuzzleHttp\Ring\Future\CompletedFutureArray Object
                (
                    [result:protected] => Array
                        (
                            [transfer_stats] => Array
                                (
                                    [url] => http://localhost:9200/equipos/equipo/3
                                    [content_type] => application/json; charset=UTF-8
                                    [http_code] => 200
                                    [header_size] => 87
                                    [request_size] => 56
                                    [filetime] => -1
                                    [ssl_verify_result] => 0
                                    [redirect_count] => 0
                                    [total_time] => 0.001665
                                    [namelookup_time] => 3.4E-5
                                    [connect_time] => 0.000328
                                    [pretransfer_time] => 0.000361
                                    [size_upload] => 0
                                    [size_download] => 105
                                    [speed_download] => 63063
                                    [speed_upload] => 0
                                    [download_content_length] => 105
                                    [upload_content_length] => -1
                                    [starttransfer_time] => 0.001608
                                    [redirect_time] => 0
                                    [redirect_url] => 
                                    [primary_ip] => 127.0.0.1
                                    [certinfo] => Array
                                        (
                                        )

                                    [primary_port] => 9200
                                    [local_ip] => 127.0.0.1
                                    [local_port] => 50071
                                    [error] => 
                                    [errno] => 0
                                )

                            [curl] => Array
                                (
                                    [error] => 
                                    [errno] => 0
                                )

                            [effective_url] => http://localhost:9200/equipos/equipo/3
                            [headers] => Array
                                (
                                    [Content-Type] => Array
                                        (
                                            [0] => application/json; charset=UTF-8
                                        )

                                    [Content-Length] => Array
                                        (
                                            [0] => 105
                                        )

                                )

                            [version] => 1.1
                            [status] => 200
                            [reason] => OK
                            [body] => Resource id #180
                        )

                    [error:protected] => 
                    [cachedPromise:GuzzleHttp\Ring\Future\CompletedFutureValue:private] => React\Promise\FulfilledPromise Object
                        (
                            [value:React\Promise\FulfilledPromise:private] => Array
                                (
                                    [transfer_stats] => Array
                                        (
                                            [url] => http://localhost:9200/equipos/equipo/3
                                            [content_type] => application/json; charset=UTF-8
                                            [http_code] => 200
                                            [header_size] => 87
                                            [request_size] => 56
                                            [filetime] => -1
                                            [ssl_verify_result] => 0
                                            [redirect_count] => 0
                                            [total_time] => 0.001665
                                            [namelookup_time] => 3.4E-5
                                            [connect_time] => 0.000328
                                            [pretransfer_time] => 0.000361
                                            [size_upload] => 0
                                            [size_download] => 105
                                            [speed_download] => 63063
                                            [speed_upload] => 0
                                            [download_content_length] => 105
                                            [upload_content_length] => -1
                                            [starttransfer_time] => 0.001608
                                            [redirect_time] => 0
                                            [redirect_url] => 
                                            [primary_ip] => 127.0.0.1
                                            [certinfo] => Array
                                                (
                                                )

                                            [primary_port] => 9200
                                            [local_ip] => 127.0.0.1
                                            [local_port] => 50071
                                            [error] => 
                                            [errno] => 0
                                        )

                                    [curl] => Array
                                        (
                                            [error] => 
                                            [errno] => 0
                                        )

                                    [effective_url] => http://localhost:9200/equipos/equipo/3
                                    [headers] => Array
                                        (
                                            [Content-Type] => Array
                                                (
                                                    [0] => application/json; charset=UTF-8
                                                )

                                            [Content-Length] => Array
                                                (
                                                    [0] => 105
                                                )

                                        )

                                    [version] => 1.1
                                    [status] => 200
                                    [reason] => OK
                                    [body] => Resource id #180
                                )

                        )

                )

            [1] => wait
        )

    [cancelfn:GuzzleHttp\Ring\Future\FutureArray:private] => Array
        (
            [0] => GuzzleHttp\Ring\Future\CompletedFutureArray Object
                (
                    [result:protected] => Array
                        (
                            [transfer_stats] => Array
                                (
                                    [url] => http://localhost:9200/equipos/equipo/3
                                    [content_type] => application/json; charset=UTF-8
                                    [http_code] => 200
                                    [header_size] => 87
                                    [request_size] => 56
                                    [filetime] => -1
                                    [ssl_verify_result] => 0
                                    [redirect_count] => 0
                                    [total_time] => 0.001665
                                    [namelookup_time] => 3.4E-5
                                    [connect_time] => 0.000328
                                    [pretransfer_time] => 0.000361
                                    [size_upload] => 0
                                    [size_download] => 105
                                    [speed_download] => 63063
                                    [speed_upload] => 0
                                    [download_content_length] => 105
                                    [upload_content_length] => -1
                                    [starttransfer_time] => 0.001608
                                    [redirect_time] => 0
                                    [redirect_url] => 
                                    [primary_ip] => 127.0.0.1
                                    [certinfo] => Array
                                        (
                                        )

                                    [primary_port] => 9200
                                    [local_ip] => 127.0.0.1
                                    [local_port] => 50071
                                    [error] => 
                                    [errno] => 0
                                )

                            [curl] => Array
                                (
                                    [error] => 
                                    [errno] => 0
                                )

                            [effective_url] => http://localhost:9200/equipos/equipo/3
                            [headers] => Array
                                (
                                    [Content-Type] => Array
                                        (
                                            [0] => application/json; charset=UTF-8
                                        )

                                    [Content-Length] => Array
                                        (
                                            [0] => 105
                                        )

                                )

                            [version] => 1.1
                            [status] => 200
                            [reason] => OK
                            [body] => Resource id #180
                        )

                    [error:protected] => 
                    [cachedPromise:GuzzleHttp\Ring\Future\CompletedFutureValue:private] => React\Promise\FulfilledPromise Object
                        (
                            [value:React\Promise\FulfilledPromise:private] => Array
                                (
                                    [transfer_stats] => Array
                                        (
                                            [url] => http://localhost:9200/equipos/equipo/3
                                            [content_type] => application/json; charset=UTF-8
                                            [http_code] => 200
                                            [header_size] => 87
                                            [request_size] => 56
                                            [filetime] => -1
                                            [ssl_verify_result] => 0
                                            [redirect_count] => 0
                                            [total_time] => 0.001665
                                            [namelookup_time] => 3.4E-5
                                            [connect_time] => 0.000328
                                            [pretransfer_time] => 0.000361
                                            [size_upload] => 0
                                            [size_download] => 105
                                            [speed_download] => 63063
                                            [speed_upload] => 0
                                            [download_content_length] => 105
                                            [upload_content_length] => -1
                                            [starttransfer_time] => 0.001608
                                            [redirect_time] => 0
                                            [redirect_url] => 
                                            [primary_ip] => 127.0.0.1
                                            [certinfo] => Array
                                                (
                                                )

                                            [primary_port] => 9200
                                            [local_ip] => 127.0.0.1
                                            [local_port] => 50071
                                            [error] => 
                                            [errno] => 0
                                        )

                                    [curl] => Array
                                        (
                                            [error] => 
                                            [errno] => 0
                                        )

                                    [effective_url] => http://localhost:9200/equipos/equipo/3
                                    [headers] => Array
                                        (
                                            [Content-Type] => Array
                                                (
                                                    [0] => application/json; charset=UTF-8
                                                )

                                            [Content-Length] => Array
                                                (
                                                    [0] => 105
                                                )

                                        )

                                    [version] => 1.1
                                    [status] => 200
                                    [reason] => OK
                                    [body] => Resource id #180
                                )

                        )

                )

            [1] => cancel
        )

    [wrappedPromise:GuzzleHttp\Ring\Future\FutureArray:private] => React\Promise\FulfilledPromise Object
        (
            [value:React\Promise\FulfilledPromise:private] => 
        )

    [error:GuzzleHttp\Ring\Future\FutureArray:private] => 
    [result:GuzzleHttp\Ring\Future\FutureArray:private] => 
    [isRealized:GuzzleHttp\Ring\Future\FutureArray:private] => 
)

polyfractal commented 8 years ago

I believe it's ## Release 2.0.2 since it's in the first line of changelog.md (sorry dunno how to get the version)

Oh, sorry, I meant which version of the Elasticsearch server, not the client.

It's weird, because if I print_r the $response variable in Client.php, line 157, in ->get() function, just before the return I get a Guzzle object with a 200 response, but without the content. I paste it here

Yeah, if you look at the object, the body field is a "Resource ID":

[body] => Resource id #180

RingPHP uses PHP's memory streams to hold the body, so the body will look like that Resource ID until it is read from the stream (which is handled by the client).
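
The "Resource id" behaviour can be reproduced in plain PHP without the client. This is a minimal sketch of a memory stream, not RingPHP's actual code; the JSON string is an illustrative stand-in for a response body.

```php
<?php
// Stand-in for a response body; the JSON text is illustrative only.
$json = '{"found":true,"_source":{"name":"test"}}';

// Write the body into an in-memory stream, as an HTTP handler might.
$stream = fopen('php://memory', 'r+');
fwrite($stream, $json);

// Printing the handle shows only the resource, not its contents
// (something like "Resource id #5").
print_r($stream);
echo PHP_EOL;

// Rewind and read to get the actual bytes back.
rewind($stream);
$body = stream_get_contents($stream);
fclose($stream);

var_dump($body === $json);
```

So a `Resource id #180` in the dump does not by itself mean the body is empty. The bytes become visible only once the stream is read.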

I'm really not sure what's going on here. It appears the request is being executed successfully based on that output that you attached: the response code is correct, the response body has length, etc. I can't seem to replicate the problem, all scenarios have worked on my end.

Would it be possible to zip up your entire project so I can run it locally? If there is something sensitive in it you don't want to share (company name/details or whatever), or you don't have a place to host it, you can email it to me: zach@elastic.co

polyfractal commented 8 years ago

Also, try running in Slim's debugging mode to see if there is an exception?

$app = new \Slim\Slim(array(
    'debug' => true
));

polyfractal commented 8 years ago

Also also, try to access a property of the document directly, to make sure the Promise is resolved:

$params = [
  'index' => 'equipos',
  'type' => 'equipo',
  'id' => '1'
];
$response = $client->get($params);

// Accessing _source to resolve promise, and using var_dump as an alternative
var_dump($response['_source']);

manelet commented 8 years ago

Ok Zach! I'll try everything when I get home, I'm now at the office.

I will also try to test everything on a remote machine, just to see if I get the same behaviour.

I don't know if it could be an issue, but I'm on a MacBook Pro using PHP/Apache from MacPorts.

Thanks for your time!

polyfractal commented 8 years ago

Sounds good, thanks for helping to debug this. Since the client appears to work sometimes, I suspect it is some kind of interaction between the client and Slim, a middleware handler, etc. Just need to narrow down the possibilities now :)

manelet commented 8 years ago

Also, try running in Slim's debugging mode to see if there is an exception?

Debugging is running and there are no exceptions. I also tried executing the ES client outside of the Slim context, and I get the same result.

manelet commented 8 years ago

I'm starting to think that it might be something related to my Mac/MacPorts/PHP-Apache installation.

Now, I'll try to install it in my online server and tomorrow in my laptop at the office, just to see what happens.

I'll keep on fighting!!!

manelet commented 8 years ago

Also also, try to access a property of the document directly, to make sure the Promise is resolved:

This returns NULL.

I'll zip the files and send them to you as well.

polyfractal commented 8 years ago

Thanks for the additional details. Got your zip, will give it a test run tomorrow and see if I can replicate it.

For the record, I'm also on a mac, but not using Macports (phpbrew for PHP versions + php-fpm, homebrew for nginx). If I can't get it to replicate, I'll install Apache and see if it is perhaps something associated with that.

Do you see anything in your Apache logs, perhaps a permission problem? RingPHP uses fopen() to open temporary memory streams used for the body, perhaps that is locked down which is why the body is empty?
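
A quick way to test this hypothesis is to check whether the stream wrappers can be opened, written to, and read back in the affected environment. The sketch below uses only core PHP functions; `streamRoundTrips()` is a hypothetical helper written for this check and is not part of RingPHP or the client.

```php
<?php
/**
 * Try to write and read back a short string through a stream wrapper.
 * Hypothetical helper for this check; not part of RingPHP or the client.
 */
function streamRoundTrips($uri)
{
    $handle = @fopen($uri, 'r+');
    if ($handle === false) {
        return false; // the wrapper could not be opened
    }
    $written = fwrite($handle, 'ping');
    rewind($handle);
    $read = stream_get_contents($handle);
    fclose($handle);

    return $written === 4 && $read === 'ping';
}

$results = [];
foreach (['php://memory', 'php://temp'] as $uri) {
    $results[$uri] = streamRoundTrips($uri);
    echo $uri . ': ' . ($results[$uri] ? 'ok' : 'FAILED') . PHP_EOL;
}

// Also report the configured memory limit for reference.
echo 'memory_limit: ' . ini_get('memory_limit') . PHP_EOL;
```

For the result to be meaningful, the script should be served through the same Apache setup as the application, since the CLI may load a different php.ini.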

Are you using mod_php or php-fpm with Apache?

Can you confirm only one instance of Elasticsearch is running? If you have two non-clustered nodes on a single machine and accidentally talk to both, you could see different sets of results.

This is definitely an odd issue, since it works sometimes but not others! I would expect a configuration problem to always fail :s

polyfractal commented 8 years ago

So, I ran your code with a few minor modifications, and everything works alright. I added some code to delete the index to start with a fresh slate, and execute a refresh after indexing. I don't think these will affect your problem (e.g. refresh doesn't affect realtime GET requests), but you can give it a shot.

Step 1

So first I ran this:

<?php

require 'config.inc.php';

$app->get('/', function() use ($app, $em, $client) {

    // Delete the existing index and wait for cluster health
    // to return to yellow
    $client->indices()->delete([
        'index' => 'equipos'
    ]);

    $client->cluster()->health([
        'wait_for_status' => 'yellow'
    ]);

    // Index the documents
    for($i=1; $i<11; $i++){
        $params = [
            'index' => 'equipos',
            'type' => 'equipo',
            'id' => $i,
            'body' => ['name' => substr(md5(rand()), 0, 7)]
        ];
        $client->index($params);
    }

    // Refresh the index after indexing.  Not necessary for GETs, but
    // can't hurt
    $client->indices()->refresh();

    // Verify all the results look correct
    for($i=1; $i<11; $i++){
        echo '<b>PHP Wrapper response</b><br/>';
        $params = [
            'index' => 'equipos',
            'type' => 'equipo',
            'id' => $i
        ];
        $response = $client->get($params);
        var_dump($response['_source']);

        echo '<br/><br/>';
        echo '<b>file_get_contents() response</b><br/>';
        print_r(file_get_contents("http://localhost:9200/equipos/equipo/$i"));
        echo '<br/><hr></hr><br/>';
    }

});

$app->run();

Which outputs this:

Notice: Use of undefined constant ROUTES_PATH - assumed 'ROUTES_PATH' in /Users/tongz/Downloads/camisetasnba/config.inc.php on line 18
PHP Wrapper response
array(1) { ["name"]=> string(7) "300d065" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"1","_version":1,"found":true,"_source":{"name":"300d065"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "f8dbf18" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"2","_version":1,"found":true,"_source":{"name":"f8dbf18"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "bbc7bd7" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"3","_version":1,"found":true,"_source":{"name":"bbc7bd7"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "4712767" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"4","_version":1,"found":true,"_source":{"name":"4712767"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "10c7f00" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"5","_version":1,"found":true,"_source":{"name":"10c7f00"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "1909b60" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"6","_version":1,"found":true,"_source":{"name":"1909b60"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "c66952d" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"7","_version":1,"found":true,"_source":{"name":"c66952d"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "0079205" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"8","_version":1,"found":true,"_source":{"name":"0079205"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "7f065eb" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"9","_version":1,"found":true,"_source":{"name":"7f065eb"}}

________________________________________

PHP Wrapper response
array(1) { ["name"]=> string(7) "b8c1988" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"10","_version":1,"found":true,"_source":{"name":"b8c1988"}}

Step 2

Then I executed this in Sense:

PUT equipos/equipo/1
{
    "name": "test"
}

{
   "_index": "equipos",
   "_type": "equipo",
   "_id": "1",
   "_version": 2,
   "created": false
}

Step 3

Finally, I executed this:

<?php

require 'config.inc.php';

$app->get('/', function() use ($app, $em, $client) {

    /*
    // Delete the existing index and wait for cluster health
    // to return to yellow
    $client->indices()->delete([
        'index' => 'equipos'
    ]);

    $client->cluster()->health([
        'wait_for_status' => 'yellow'
    ]);

    // Index the documents
    for($i=1; $i<11; $i++){
        $params = [
            'index' => 'equipos',
            'type' => 'equipo',
            'id' => $i,
            'body' => ['name' => substr(md5(rand()), 0, 7)]
        ];
        $client->index($params);
    }
    */

    // Refresh the index after indexing.  Not necessary for GETs, but
    // can't hurt
    $client->indices()->refresh();

    // Verify all the results look correct
    for($i=1; $i<11; $i++){
        echo '<b>PHP Wrapper response</b><br/>';
        $params = [
            'index' => 'equipos',
            'type' => 'equipo',
            'id' => $i
        ];
        $response = $client->get($params);
        var_dump($response['_source']);

        echo '<br/><br/>';
        echo '<b>file_get_contents() response</b><br/>';
        print_r(file_get_contents("http://localhost:9200/equipos/equipo/$i"));
        echo '<br/><hr></hr><br/>';
    }

});

$app->run();

Which returns the correct output ... the name has changed to "test" while the rest of the documents remain unaffected:

Notice: Use of undefined constant ROUTES_PATH - assumed 'ROUTES_PATH' in /Users/tongz/Downloads/camisetasnba/config.inc.php on line 18
PHP Wrapper response
array(1) { ["name"]=> string(4) "test" } 

file_get_contents() response
{"_index":"equipos","_type":"equipo","_id":"1","_version":2,"found":true,"_source":{ "name": "test" } }

________________________________________

... Other documents omitted, but they were normal ...

Sooo...I think there is something wonky about your setup:

I'm leaning towards a permissions or memory problem with Apache, since it seems to be executing the GETs fine, but the body is null. So perhaps it is failing to allocate the temp memory stream due to permission or configuration problem?

manelet commented 8 years ago

I'm sure it must be one of these two:

  • Check your Apache settings to make sure there isn't something strange going on, like too little allocated memory, security changes, etc
  • Verify there are no Apache errors, permission problems
  • ps aux | grep ElasticSearch returns only one line
  • Sense is talking to localhost:9200 as PHP is.
  • Following exactly your steps, returns NULL on the updated document via Sense in the PHP Wrapper block

And by the way:

I'll spend tomorrow checking all the Apache-related stuff, it's driving me crazyyyy

Thanks a lot!

manelet commented 8 years ago

I've checked everything I know how to check and I'm out of ideas. I just posted the issue on Stack Overflow (http://stackoverflow.com/questions/33317379/strange-behaviour-of-elasticsearch-with-php-wrapper) to ask for more help. I'll keep you updated.

I'm sure it's not about the elasticsearch-php code. I just tried Elastica and it has the same behaviour, so it must be either Apache or PHP, I guess...

polyfractal commented 7 years ago

Closing due to inactivity. Hopefully you found the answer to this! :)