opensearch-project / opensearch-spark

Spark Accelerator framework ; It enables secondary indices to remote data stores.
Apache License 2.0
12 stars 18 forks source link

Refactor static method for OpenSearch client utils #377

Closed seankao-az closed 2 weeks ago

seankao-az commented 2 weeks ago

Description

  1. Refactor OpenSearch IRestHighLevelClient creation with static factory method.
  2. Remove FlintOpenSearchMetadataLog dependency on FlintClient using the static factory.

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

noCharger commented 2 weeks ago

In parent issue https://github.com/opensearch-project/opensearch-spark/issues/371 it mentioned

Flint core assumes storage for FlintMetadataLog to be OpenSearch. This meta issue aims to refactor code components and data model that are coupled with OpenSearch, and allow plugging in custom implementations using different storage.

My 2 cents:

  1. what problem is this PR attempting to address and how does it connect to the parent issue?
  2. any test to cover the new factory class?
  3. The OpenSearchClientFactory demonstrates elements of the Factory Pattern in terms of returning an interface and abstracting the creation process. However, the primary focus on configuring RestClientBuilder rather than directly creating IRestHighLevelClient objects makes it a less clear example of the Factory Pattern. For a clearer implementation of the Factory Pattern, you might consider restructuring the code to directly create instances of IRestHighLevelClient without focusing so heavily on the builder configuration. For example, moving some of the configuration logic into separate helper methods or classes, making the factory method more focused on the final product rather than its components.
seankao-az commented 2 weeks ago

what problem is this PR attempting to address and how does it connect to the parent issue?

FlintOpenSearchMetadataLog creation will be moved to FlintOpenSearchMetadataLogService. Using the static method prevents circular dependency. Otherwise FlintOpenSearchMetadataLogService needs FlintClient to createClient, and FlintClient needs FlintOpenSearchMetadataLogService to startTransaction and getMetadataLog

seankao-az commented 2 weeks ago

consider restructuring the code to directly create instances of IRestHighLevelClient without focusing so heavily on the builder configuration

I think I shouldn't have named it Factory, instead it can be something like OpenSearchClientUtils. We don't need the flexibility provided by the Factory Pattern. I don't see an issue why having the createClient method (for the most part) configure the RestClientBuilder is a bad thing? Can you elaborate on this