ruflin / Elastica

Elastica is a PHP client for elasticsearch
http://elastica.io/
MIT License

MoreLikeThis->setLike() should support list of documents #1636

Open pySilver opened 5 years ago

pySilver commented 5 years ago

Hi,

First of all, the documented signature of this function is wrong:

https://github.com/ruflin/Elastica/blob/master/lib/Elastica/Query/MoreLikeThis.php#L33

    /**
     * Set the "like" value.
     *
     * @param string|self $like
     *
     * @return $this
     */
    public function setLike($like): self
    {
        return $this->setParam('like', $like);
    }

$like obviously cannot be an instance of MoreLikeThis; it should be string|Document|Document[].

Second, according to https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html a list of documents inside like is fully supported. Unfortunately, toArray() renders a slightly wrong structure when an array of documents is passed, so the query fails because of the _source field in the request.

Sample failing request:

use Elastica\Document;
use Elastica\Query\MoreLikeThis;

$mltQuery = new MoreLikeThis();
$mltQuery->setFields(array_values($fields));

// Reference already-indexed documents by ID only (empty data).
$documents = [];
foreach ($ids as $id) {
    $documents[] = new Document(
        $id,
        [],
        $index_params['type'],
        $index_params['index']
    );
}
$mltQuery->setLike($documents);
$mltQuery->setMaxQueryTerms(3);
$mltQuery->setMinDocFrequency(1);
$mltQuery->setMinTermFrequency(1);

This produces the following request body:

{
    "query": {
        "more_like_this": {
            "fields": [
                "completion_terms",
                "suggestion_terms"
            ],
            "like": [
                {
                    "_id": "1057:en",
                    "_type": "products",
                    "_index": "products",
                    "_source": []
                }
            ],
            "max_query_terms": 3,
            "min_doc_freq": 1,
            "min_term_freq": 1
        }
    }
}

Response:

{
    "error": {
        "root_cause": [
            {
                "type": "parse_exception",
                "reason": "failed to parse More Like This item. unknown field [_source]"
            }
        ],
        "type": "parse_exception",
        "reason": "failed to parse More Like This item. unknown field [_source]"
    },
    "status": 400
}

A quick workaround is to use plain arrays, like this:

$documents = [];
foreach ($ids as $id) {
    $documents[] = [
        '_id'    => $id,
        '_index' => $index_params['index'],
        '_type'  => $index_params['type'],
    ];
}
$mltQuery->setLike($documents);

So this could arguably be marked as Won't Fix, with just the method signature updated to something like string|array|Document.

ruflin commented 5 years ago

Thanks for all the details. Could you open a PR for the docs fix?

We should also fix the toArray parsing (if possible).
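
For illustration only, here is a minimal sketch of the kind of clean-up such a toArray() fix might apply to a list of like items. It works on plain arrays shaped like the failing request above and does not depend on Elastica. The function name normalizeLikeItems() and the whitelist of keys are assumptions made for this example, not part of the library's API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Elastica): keep only the keys needed to
 * reference an indexed document inside "like", which drops "_source".
 *
 * @param array<int, array<string, mixed>|string> $like
 *
 * @return array<int, array<string, mixed>|string>
 */
function normalizeLikeItems(array $like): array
{
    // Assumed whitelist, taken from the plain-array workaround in this issue.
    $allowed = ['_index' => true, '_type' => true, '_id' => true];

    return \array_map(
        static function ($item) use ($allowed) {
            // Free-text entries are passed through untouched.
            if (!\is_array($item)) {
                return $item;
            }

            // Keep only whitelisted keys; "_source" is removed here.
            return \array_intersect_key($item, $allowed);
        },
        $like
    );
}

$like = normalizeLikeItems([
    ['_id' => '1057:en', '_type' => 'products', '_index' => 'products', '_source' => []],
    'some free text',
]);

echo \json_encode($like, JSON_PRETTY_PRINT), "\n";
```

This sketch only covers items that point at existing documents by ID. Items that are meant to carry their own content would need separate handling.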