ruflin / Elastica

Elastica is a PHP client for elasticsearch
http://elastica.io/
MIT License
2.26k stars 734 forks source link

setTerms no longer normalize array parameters in 7.x #2139

Open pandafox opened 1 year ago

pandafox commented 1 year ago

in 6.x https://github.com/ruflin/Elastica/blob/6.x/lib/Elastica/Query/Terms.php#L53

    public function setTerms($key, array $terms)
    {
        $this->_key = $key;
        $this->_terms = array_values($terms);

        return $this;
    }

in 7.x https://github.com/ruflin/Elastica/blob/7.x/src/Query/Terms.php#L42

    public function setTerms(array $terms): self
    {
        return $this->setParam($this->field, $terms);
    }

    public function setParams(array $params)
    {
        $this->_params = $params;

        return $this;
    }

Which no longer normalize the array.

In normal usage of regular arrays there's no issue, but when an array elements were removed the terms generated aren't recognized properly by ES and results in failed to parse field error

ex:

$array = array(0, 1, 2, 3);php > $array = [0, 123, 456];;
php > unset($array[0]);
php > var_dump($array);
php shell code:1:
array(2) {
  [1] =>
  int(123)
  [2] =>
  int(456)
}

// query output {"1":123,"2":456}

php > var_dump(array_values($array));
php shell code:1:
array(2) {
  [0] =>
  int(123)
  [1] =>
  int(456)
}

// query output [123,456]
ruflin commented 1 year ago

It looks like this change comes from https://github.com/ruflin/Elastica/pull/1765 I think the goal was to cleanup the code but it looks like it had an unexpected side effect. @thePanz As you did the previous PR, any thoughts?

thePanz commented 1 year ago

@ruflin damn, looks like a case we did not properly check (see also: https://github.com/ruflin/Elastica/pull/1872)

Should we:

  1. just use the array_values?
  2. throw an exception? (and use the https://www.php.net/manual/en/function.array-is-list.php to make sure the list is correct?)

Any inputs?

ruflin commented 1 year ago

What speaks for 2, is that there is just one way of doing things which should simplify code maintenance. What speaks for 1 is that it removes the breaking change but it means we have to keep it like that for a while.

One option we could do, reintroduce 1 for 7.x, deprecate it and do 2 in 8.x?

@pandafox Would be great to your input on this one. Would it be tricky to change your code to adapt for 2?

thePanz commented 1 year ago

@ruflin I agree that "there is just one way of doing [this]". As the current implementation is buggy when not using a list, I would expect projects to already be using the right parameters here (maybe ->setTerms(array_values($termValues)) (this would be my guessing).

Not sure if throwing an exception would be a BC break, but I vote for this option as being the cleaner and future-proof one. WDYT?

pandafox commented 1 year ago

I think throwing exception would prob make sense for future release, the underlying incorrect usage be identified easier (and hence for them to use it correctly)

It'll be great if 7.x can be made backward compatible. On our usage, we have already patched our code to return pure array with array_values, and filed a bug report https://phabricator.wikimedia.org/T325792

Though it doesn't seem to receive much attention atm so its not yet properly patched.

ruflin commented 1 year ago

I'm ok with the proposal from @pandafox to keep it backward compatible in 7.x and make the "breaking change" in 8.x with the exception like this our 8.x code is clean. @pandafox Could you open a 2 PR's?