elastic / elasticsearch-php

Official PHP client for Elasticsearch.
https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/index.html
MIT License

Decoding problem when querying incorrectly encoded data #1171

Closed vst93 closed 2 years ago

vst93 commented 3 years ago

Summary of problem or feature request

Some of my stored data looks like ud83d\ude4f. It was truncated by mistake before being stored, so only the \ud83d part remains. PHP's json_decode() cannot parse this data, which causes search() to throw a fatal error. When that happens I can only discard all of the data from that query (a single query may return a lot of documents) or report the error. But I need to run a batch export and cannot give up the other data, so my only option was to modify the method in the official package.
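To make the failure concrete, here is a minimal sketch of how a truncated surrogate escape behaves in json_decode(), assuming PHP 7.3 or later (for JSON_THROW_ON_ERROR). The tryDecode() helper and the sample payloads are hypothetical and are not taken from the issue or from the client library.

```php
<?php
// Hypothetical helper: returns a short status string instead of throwing.
function tryDecode(string $json): string
{
    try {
        $decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
        return 'ok: ' . $decoded['msg'];
    } catch (\JsonException $e) {
        return 'error: ' . $e->getMessage();
    }
}

// Single-quoted PHP strings keep the backslashes, so these are JSON escapes.
$complete  = '{"msg":"\ud83d\ude4f"}'; // full surrogate pair
$truncated = '{"msg":"\ud83d"}';       // lone high surrogate, as after truncation

echo tryDecode($complete), PHP_EOL;  // decodes to a single emoji
echo tryDecode($truncated), PHP_EOL; // reports a JSON error
```

The complete pair decodes to one character, while the lone high surrogate makes the whole document fail to decode, which is why a single bad field takes the entire response with it.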

vendor/elasticsearch/elasticsearch/src/Elasticsearch/Serializers/SmartSerializer.php

This solves my problem for now, but I still hope the maintainers can provide an official solution or suggestion.

I know it is somewhat unreasonable to expect bad data to be parsed correctly, but with the work on multiplayer games there is not much I can do about it. 😮‍💨

The problem is similar to this one: https://stackoverflow.com/questions/32799627/elasticsearch-field-showing-as-bad-string-causes-error-when-searched

Code snippet of problem


    if (version_compare(PHP_VERSION, '7.3.0') >= 0) {
        try {
            $result = json_decode($data, true, 512, JSON_THROW_ON_ERROR);
            return $result;
        } catch (\JsonException $e) {
            // Parsing failed on a backslash: escape all backslashes and retry
            try {
                $data = str_replace("\\", '\\\\', $data);
                $result = json_decode($data, true, 512, JSON_THROW_ON_ERROR);
                return $result;
            } catch (\JsonException $e) {
                $result = $result ?? [];
                throw new JsonErrorException($e->getMessage(), $data, $result);
            }
            // $result = $result ?? [];
            // throw new JsonErrorException($e->getMessage(), $data, $result);
        }
    }
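As a rough illustration of what this retry does to the decoded values, the sketch below wraps the same idea in a standalone function. The name decodeWithBackslashFallback() and the sample payload are hypothetical, and this is not the code that was eventually merged into the client.

```php
<?php
// Hypothetical standalone version of the retry shown above (PHP 7.3+).
function decodeWithBackslashFallback(string $data): array
{
    try {
        return json_decode($data, true, 512, JSON_THROW_ON_ERROR);
    } catch (\JsonException $e) {
        // Double every backslash so escape sequences are read as literal text.
        $escaped = str_replace('\\', '\\\\', $data);
        return json_decode($escaped, true, 512, JSON_THROW_ON_ERROR);
    }
}

// Sample payload: a lone high surrogate plus an ordinary \n escape.
$truncated = '{"msg":"hi\ud83d","note":"a\nb"}';
$result = decodeWithBackslashFallback($truncated);

var_dump($result['msg'] === 'hi\ud83d'); // the broken escape survives as literal text
var_dump($result['note'] === 'a\nb');    // the valid \n escape is now literal too
```

The trade-off is that the retry affects every escape in the payload, not just the broken one: valid sequences such as \n come back as literal backslash text, and a payload containing an escaped quote (\") would most likely still fail on the second attempt.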

System details

ezimuel commented 2 years ago

@vst93 thanks for reporting the issue and proposing a fix. If I understood correctly, you have UTF-16 data returned by Elasticsearch, something like ud83d\ude4f, and PHP's json_decode() fails because the data contains an escape character that should not be interpreted. I think this is related to https://bugs.php.net/bug.php?id=62010. The solution you proposed sounds good. I would like to run some more tests, and then I'll send a PR with the fix.

ezimuel commented 2 years ago

@vst93 I just sent the PR https://github.com/elastic/elasticsearch-php/pull/1179 for fixing this issue. Can you check and let me know if this works for you? Thanks!

ezimuel commented 2 years ago

Merged with #1179

vst93 commented 2 years ago

> @vst93 I just sent the PR #1179 for fixing this issue. Can you check and let me know if this works for you? Thanks!

Thank you very much. I tested it and it works.