Closed coreation closed 10 months ago
@ruflin @thePanz since you are the 2 people I've interacted with so far, I'll just post my questions here. From my point of view, this is the only elasticsearch-php
upgrade PR. I've seen some back and forth with regards to this upgrade on the discussion pages, but perhaps it's best to bundle everything here from now.
I wanted to get started, but immediately hit a first barrier. The operations with regards to indices have been rewritten. Instead of having a specific Create()
class, everything regarding indices is now added to Indices
class, where the create is refactored into a function, which then executes the operation.
As far as I see in the Elastica codebase, the aim is/was to create a Create()
instance, which was an AbstractEndpoint class, and then calling the methods on that class to execute the creation. However, this is now all abstracted away inside the new Indices::create()
function.
This makes me think that the requestEndpoint
might need to be removed, and a whole bunch of code in Index.php
reworked to use the high level Indices::XYZ()
methods. This would require us to convert the Elastica
client configuration to a elasticsearch-php
ClientInterface object.
These kind of decisions I'd much rather talk through, then just go ahead and spent time on because I'm by far no expert in the approach this library takes, but happy to learn.
@coreation Have not forgotten you on this one. I have not found yet the time to dig into the elasticsearch-php
code. Ideally, I would like to keep the basic Elastica construct and change things under the hood. Removing requestEndpoint
would be quite fundamental. I think that is where recently the discussion started, to potentially remove dependency on elasticsearch-php
which I'm not a big fan of.
Are the changes on elasticsearch-php only for indices or all over the place? I'm asking because requestEndpoint
exists in several places so I try to get my head around how big the change would be.
@coreation What if we don't convert the Elastic client to ClientInterface? Would it mean we require additional boiler plate around the client? I know you mentioned not to start coding at the same time, I feel like having a small sample PR on what the change would mean for one class (no tests passing) could help the conversation.
@ruflin I'll see what I can do in terms of an example. Going by the unittest output, most of the test didn't make it much further than initiating the index, so it might be that fixing that part of the codebase might actually by 90% of the work, but that's hopeful thinking.
I like the idea of making an example and discuss things further based on that, I'll see what I can do this week.
@coreation @ruflin if you need any help with this - just tell.
@oleg-andreyev thanks for the heads up, I think we're going to be needing some extra hands here. @ruflin @oleg-andreyev if you look at the elastic-search 8 branch endpoints compared with elastic-search 7 branch endpoints, it shows that the Endpoints have undergone a serious transformation. There could be other parts of the library that have been addressed the same way, I haven't looked into that just yet.
The transformation goes from being Endpoints being a namespace with different classes representing different operations in the 7.x branch to Endpoints becoming dedicated classes with the ElasticSearch operations embedded in those classes as functions.
Example: \ElasticSearch\Endpoints\Indices\Create
is now \ElasticSearch\Endpoints\Indices::create
With that, they also bring an ElasticSearch
class to the table, which is an HTTP response wrapper around ElasticSearch Psr7 HTTP responses.
What you'll see in my very short refactor example in the index.php file, is that we'll likely have to refactor these kind of classes (i.e. Index) on 2 parts: business logic within the methods and transforming the ElasticSearch response into an Elastica Response. I think the latter makes sense in order to make migrating to the new Elastica version doable.
@ruflin is that enough of an example? Note that I'm not looking at good code practises, just a rough example of how things need to change.
@coreation what I see is that all endpoints are autogenerated by some script that I cannot find in the repository (probably private), depending on how we wanna manage dependency with https://github.com/elastic/elasticsearch-php, relaxed or strict, if strict that means we follow their semantic (and php version) and we don't need to think about any workaround/wrappers and etc. if we want to allow to relax versions (supporting both 7.x and 8.x), then we need to create some wrapper for each endpoint and based on Client::VERSION constant decide what to do.
@ruflin what's your thoughts on this?
To keep it simple, I'd follow elasticsearch-php as is, but maybe I'm missing something.
@coreation I like the direction you are taking here. It means in many cases, users can upgrade to Elastica 8.0 without having to worry about the underlying changes (in most cases).
On the support of 7 and 8: I would keep it simple for now and only support 8. If there are many requests (and contributions) we can always introduce this later. +1 on following elasticsearch-php.
Sorry for the late reply again on my end.
Hey @ruflin , just to summarize to make sure I understand you well:
elasticsearch-php
library changeSo moving this forward, does that mean the approach based on the example I gave would be approved? If so, then we can started and hopefully get some people involved to code that up. \cc @oleg-andreyev ;) If someone is working on it, given the big amount of work ahead of us, it would be good to give each other a heads-up, so that we don't have people working on the same thing. @ruflin since you build most of this library, any recommendations for next steps / organising this?
Glad you asked back @coreation, we need to dive a bit deeper here. Lets first get PHP / Elasticsearch version out of the way. I was referring to the following:
if we want to allow to relax versions (supporting both 7.x and 8.x), then we need to create some wrapper for each endpoint and based on Client::VERSION constant decide what to do.
My assumption is @oleg-andreyev talks here about the Elasticsearch version, not PHP? At least on my end, I would focus on supporting Elasticsearch 8.x which I think is also what elasticsearch-php does. With it, we indirectly likely align also on the PHP version.
Elastica library receive the same type of response,
The same type of response we have today, not the same type of response that the elasticsearch-php library has I assume? This is indicated by line https://github.com/ruflin/Elastica/pull/2168/files#diff-858b65fbfb449bff73db68734d544fb5a4849e4bc8b986fad5896c202813641bR517 Do you forsee we can use some generic method for this or you expect it a lot of manual work?
The reason I like the above approach because one of the powers of Elastica is that you can chain calls together to build queries etc. If we move over to the elasticsearch-php response format, I think we partially loose this capability. At the same time I'm concerned it might create lots of additional code paths. It brings back the question: Would it be easier to just rip out elasticsearch-php again ...
@coreation For the split of the work, I hope we can find ways to do it in many small chunks. Have lots of small PR's that can go in in parallel. As not too many people work on the project, I'm not too worried about multiple people working at the same. What is not clear to me is if we can easily split up the work in many small PR's as the dependency update of the lib is global :-(
Hey @ruflin
Glad you asked back @coreation, we need to dive a bit deeper here. Lets first get PHP / Elasticsearch version out of the way. I was referring to the following:
if we want to allow to relax versions (supporting both 7.x and 8.x), then we need to create some wrapper for each endpoint and based on Client::VERSION constant decide what to do.
My assumption is @oleg-andreyev talks here about the Elasticsearch version, not PHP? At least on my end, I would focus on supporting Elasticsearch 8.x which I think is also what elasticsearch-php does. With it, we indirectly likely align also on the PHP version.
Clear!
Elastica library receive the same type of response, The same type of response we have today, not the same type of response that the elasticsearch-php library has I assume? This is indicated by line https://github.com/ruflin/Elastica/pull/2168/files#diff-858b65fbfb449bff73db68734d544fb5a4849e4bc8b986fad5896c202813641bR517 Do you forsee we can use some generic method for this or you expect it a lot of manual work?
Indeed, so that users of the library don't need an upgrade path to adhere to some new type of Response object. I assume the transformation function is going to be a very generic one.
The reason I like the above approach because one of the powers of Elastica is that you can chain calls together to build queries etc. If we move over to the elasticsearch-php response format, I think we partially loose this capability. At the same time I'm concerned it might create lots of additional code paths. It brings back the question: Would it be easier to just rip out elasticsearch-php again ...
This is very fundamental and I don't have that much insight into the earlier work that was done. If there's no massive amount of work needed for every elasticsearch-php
update, then I would keep it. I think this is an odd one out, since they reworked basic functionality from a more "class per ElasticSearch operation" to "class per bundle of ElasticSearch operations". Which is now affecting the Elastica library.
@coreation For the split of the work, I hope we can find ways to do it in many small chunks. Have lots of small PR's that can go in in parallel. As not too many people work on the project, I'm not too worried about multiple people working at the same. What is not clear to me is if we can easily split up the work in many small PR's as the dependency update of the lib is global :-(
I think splitting it up would require first some "basic" things to be built in place. One of them is that we make sure return types remain the same. The example I gave with the transformation of response objects is - I hope - one of the few we'll need.
Once we got that in place, it's just working in the new way of elasticsearch-php
's classes like Index::create()
instead of new Create()
.
To split things up, I think someone has to go through some classes that are affected, i.e. Index, and just make a kind of billboard ticket listing all the affected classes? That way people can contribute in small chunks in an organised manner?
I went a bit more through the code base and I think ripping out elasticsearch-php would actually be more work then doing the transition. My hope is that most of the transition will look very similar. Also I still think Elastica benefits to rely on elasticsearch-php to get all the underlying improvements that are made to it.
To split things up, I think someone has to go through some classes that are affected, i.e. Index, and just make a kind of billboard ticket listing all the affected classes? That way people can contribute in small chunks in an organised manner?
++. We can use the new tasks features in Github issues for this. The part I'm trying to get my head around: How can we do it in iterations? As soon as we upgrade the dependency everything breaks and all has to be done at once? Can we have somehow the two dependencies coexist for a while? Ideas on how we should deal with this?
@ruflin I agree, I would for sure keep the elasticsearch-php as a base and just do what this library is good at, providing a neat handy wrapper around it.
As far as tasks go, I think the iterations will likely be aspect based. Meaning, getting the Index class "green", meaning, all errors coming from not being able to create an endpoint (cfr. unittests) must no longer appear and are fixed because of the refactor work done using the new elasticsearch-php library for example.
In this case, I don't think it's doable of keeping the two dependencies in there, I think we're better off just with a clean refactor, putting a new major release version on it. I haven't done any analysis, but there's some gut feeling telling me that the majority of the work will be refactoring the deprecated classes + adding a response transformation function will be 80% of the work. But that's a very uneducated guess, more of a 🤞
@coreation One suggestion to move this forward. Lets start having a Tasklist in the description of this PR and start using this PR for all the work (kind of a feature branch). All the contributions would go with a PR against this branch. As soon as we get to a green state, we can get it into 8.x and start more iterative from there.
@ruflin good idea, I've updated my PR comment, feel free to update where applicable. Might be a good time to get @oleg-andreyev back into the conversation, time to get our hands dirty ;)
@ruflin I've updated the original post and I think apart from the approach, I think we can have a first effective discussion to move the actual code forward. My suggestion in the RFC section is that we either remove the src/Client.php or make it compatible with elasticsearch-php
since the new Endpoint classes like "Indices" need an elasticsearch-php
Client object.
My aim for my this commit is to simply have the "testCreateNewIndex" test working. That involves:
What I've found is that:
ClientConfiguration
elasticsearch-php
ClientBuilder supports a plethora of options, a lot of which I can't really backtrack to what is available in the ClientConfiguration class. Could you help out on that one @ruflin ?elasticsearch-php
Client object from the src/Client.php class. That does the trick, in other words, a basic connection can be established and the Index::create() creates the new index! Our Hello World! has seen first light.I think our first focus should be on deciding what to do with the src/Client and how we can get that to work with the Endpoint classes of the updated elasticsearch-php
classes. Once we have that, we can build up from there!
Please excuse the messy try-out code, this is really to just get something going as I discover the hoops to jump through :D.
Replace the src/Client.php usage with the elasticsearch-php Client class. RFC: @ruflin - seems like the elasticsearch-php Client pretty much does everything the src/Client.php does. I'm not sure though how vital this was for other users to be using this class naked sort of speak. If the answer is: a lot. Then I suggest we rework the src/Client so that it uses the elasticsearch-php Client under the hood. Makes for an easier phasing out later on.
I don't have data on this but I would assume, the Client object is used heavily. Every time Elastica did not support a specific call, one of the recommendations is to use the raw request method: https://github.com/ruflin/Elastica/blob/8.x/src/Client.php#L538 If possible, I would keep the Client.
@coreation I'll be out of most of July but please don't hold back on progressing with this one. @deguif @thePanz @franmomu Would be great if you could get involved in this too.
I don't have data on this but I would assume, the Client object is used heavily. Every time Elastica did not support a specific call, one of the recommendations is to use the raw request method: https://github.com/ruflin/Elastica/blob/8.x/src/Client.php#L538 If possible, I would keep the Client.
I absolutely agree, I thought this was the way to go as well, but I wanted to avoid injecting bias into the question. So the approach of getting an Elasticsearch Client based on the Elastica Client is an ok approach then I believe.
@coreation I'll be out of most of July but please don't hold back on progressing with this one. @deguif @thePanz @franmomu Would be great if you could get involved in this too.
No problem, enjoy your time OoO!
Some time ago I tried to start this and ended up having a ElasticSearch\Client
client inside of Elastica\Client
and transforming the response, I wanted to start making this work and then see how this could be improved, but something else came up and I didn't continue.
Just in case is of any help (some of the code is not tested/finished):
https://github.com/ruflin/Elastica/compare/8.x...franmomu:Elastica:elasticsearch_8?expand=1
Thanks for trying to move on this migration @coreation. How far are you in the first analysis task ?
RFC: Replace the src/Client.php usage
With Elastic 7 being EOL in few days elastic.co/support/eol, shouldn't new RFC be delayed in order to release the v8 compatibility as soon as possible ?
hey @VincentLanglet I haven't had the time to start yet, apart from what I've already listed in the todo task list. I'm currently on vacation for another 2 weeks. After that I'm ready to pick up where I left off, but work is very demanding at this time unfortunately.
@coreation @ruflin
Here is my example for delete
method:
/**
* Deletes the index.
*
* @throws ClientException
* @throws ConnectionException
* @throws ResponseException
*/
public function delete(): Response
{
$response = (new Indices($this->getClient()->getElasticsearchClient()))->delete(['index' => $this->getName()]);
return $this->covertToElasticaResponse($response);
}
what I don't like is copying: $this->getClient()->getElasticsearchClient()
also had to implement helper method:
private function covertToElasticaResponse(ElasticsearchResponse $response)
{
return new \Elastica\Response(
(string) $response->getBody(),
$response->getStatusCode(),
);
}
speaking about response, probably \Elastica\Response
wrap \Elastic\Elasticsearch\Response\Elasticsearch
and provide all B/C layers within in response.
With Elastic 7 being EOL in few days
@VincentLanglet I don't think this is the case as 9.0 is not out yet: https://www.elastic.co/support/eol
what I don't like is copying: $this->getClient()->getElasticsearchClient()
@oleg-andreyev What is your preference here? Should we fetch the client only once and then store it locally. For example having $this->getElasticsearchClient()
method that wraps the above? What is the main concern?
covertToElasticaResponse
If you compare the response object before and after, how close are we to backward compatibility? Is all the info we need in the ElasticsearchResponse
response?
With Elastic 7 being EOL in few days
@VincentLanglet I don't think this is the case as 9.0 is not out yet: elastic.co/support/eol
My bad, it's indeed the 1rst august OR when 9.0.0 is released.
With Elastic 7 being EOL in few days
@VincentLanglet I don't think this is the case as 9.0 is not out yet: elastic.co/support/eol
what I don't like is copying: $this->getClient()->getElasticsearchClient()
@oleg-andreyev What is your preference here? Should we fetch the client only once and then store it locally. For example having
$this->getElasticsearchClient()
method that wraps the above? What is the main concern?covertToElasticaResponse
If you compare the response object before and after, how close are we to backward compatibility? Is all the info we need in the
ElasticsearchResponse
response?
Speaking about client, it would be great to have single client that is using HttpInterface (so it could be Symfony Client, Guzzle and etc) and yes, it would be better just to call $this->getClient()
Speaking about response, need to compare but they are not 1:1 - 100%
Hi everyone, as I look towards EOY, I'm much too occupied with my own company in order to work on this in a decent way. There's so much that needs to be done, and I don't see any way I can make time available to work on this. I'm very sorry and hopefully someone finds the preparation done so far useful enough to continue.
@coreation Appreciate the heads up. It seems we all are a bit in the same boat. Lets see who can pick it up and drive to completion.
With https://github.com/ruflin/Elastica/pull/2181 merged, should we close this PR and continue the work on top of 8.x.
Definitely!
Tasklist
Analysis tasks
elasticsearch-php
breaking changes caused by the major upgrade 7.x to 8.x. Add the affected files to the Refactor tasks listRefactor tasks
Docker & Tests
changeme
Generics
elasticsearch-php
response to an Elastica Response object. This allows for a smooth upgrade of Elastica 7.x to 8.x for clients using the Elastica library, and not forcing them to use a new type of Response object.Specifics
elasticsearch-php - Index::create()
using the outcome of Generics - Task 1.requestEndPoint
function in src/Index.php with the newelasticsearch-php
Endpoint classes. This will require the introduction of at least a function that returns anelasticsearch-php
Client, usingClientBuilder::create()
RFC
elasticsearch-php
Client class. RFC: @ruflin - seems like theelasticsearch-php
Client pretty much does everything the src/Client.php does. I'm not sure though how vital this was for other users to be using this class naked sort of speak. If the answer is: a lot. Then I suggest we rework the src/Client so that it uses theelasticsearch-php
Client under the hood. Makes for an easier phasing out later on.