Closed swoehrl-mw closed 6 months ago
Thanks @swoehrl-mw the PR LGTM.
One qq, since this PR removes the limitation of only being able to configure snapshot repositories after cluster provisioning, if a user has to configure an s3 snapshot repo, how does the operator handle this as the plugin has to be installed 1st and then create the s3 snapshot repo.
Thanks.
Adding @bbarani @salyh @jochenkressin @pchmielnik
@prudhvigodithi
One qq, since this PR removes the limitation of only being able to configure snapshot repositories after cluster provisioning, if a user has to configure an s3 snapshot repo, how does the operator handle this as the plugin has to be installed 1st and then create the s3 snapshot repo.
The operator simply retries creating the snapshot repo until it succeeds. While the cluste is still coming up and installing plugins the create request will fail and the reconciler requeues. Once everything is up the request and with it the reconciler will succeed.
Thanks @swoehrl-mw, I see there is some change in opensearch-gateway, I havent contributed to this gateway folder, may I please know what is this used for (the requests and responses)?
I see there is some change in opensearch-gateway, I havent contributed to this gateway folder, may I please know what is this used for (the requests and responses)?
The gateway is basically the client library for interacting with the opensearch api (gateway is probably not a good name, but there are bigger fish to fry). The requests and responses represent the different api requests and responses from opensearch.
Thanks @swoehrl-mw how often these responses and requests are queried, only upon the request by the snapshot reconciler (during any create/update/delete operation) ? or continuously, the reason I ask is i'm trying to reduce the operations that can cause the memory leak addressed here https://github.com/opensearch-project/opensearch-k8s-operator/issues/700. Thanks
@prudhvigodithi The API calls are done whenever the reconciler runs, so if there is a change or after (I think) 30 seconds. As the request and response objects are not stored the GC should be able to clean them up.
Description
This PR reworks how the operator provisions snapshot repositories. The beta feature used a kubernetes job with curl. With this rework the operator does the needed api calls itself using an http client. I've also extracted the logic into its own reconciler for better separation and testing. This reimplementation also removes the limitation of only being able to configure snapshot repositories after cluster provisioning which made it unusable for any kind of GitOps approach.
Issues Resolved
Partially #621
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.