canonical / mysql-router-k8s-operator

MySQL Router operator charm for Kubernetes
https://charmhub.io/mysql-router-k8s
Apache License 2.0

[DPE-5637, DPE-5276] Implement expose-external config option with values false (clusterip), nodeport and loadbalancer #328

Closed shayancanonical closed 5 days ago

shayancanonical commented 1 month ago

Prerequisite

Need to merge https://github.com/canonical/data-platform-workflows/pull/244 for MetalLB support in CI

Issue

We have outlined the approach for exposing our K8s charms externally in DA122

Summary:

Solution

Implement the contents of the spec
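
For illustration only, here is a rough sketch of what the expose-external values could map to on the K8s side using lightkube; the service name, selector labels, and ports below are placeholders and are not taken from the spec or the charm code:

```python
# Illustrative only: map the expose-external config value to a K8s Service
# type and apply it with lightkube. Names, labels, and ports are assumptions.
from lightkube import Client
from lightkube.models.core_v1 import ServicePort, ServiceSpec
from lightkube.models.meta_v1 import ObjectMeta
from lightkube.resources.core_v1 import Service

# Mapping implied by the PR title: false -> ClusterIP, nodeport -> NodePort,
# loadbalancer -> LoadBalancer.
SERVICE_TYPES = {"false": "ClusterIP", "nodeport": "NodePort", "loadbalancer": "LoadBalancer"}


def apply_external_service(app_name: str, namespace: str, expose_external: str) -> None:
    """Create or update the externally exposed service for MySQL Router."""
    service = Service(
        metadata=ObjectMeta(
            name=f"{app_name}-external",  # hypothetical naming convention
            namespace=namespace,
            labels={"app.kubernetes.io/name": app_name},
        ),
        spec=ServiceSpec(
            type=SERVICE_TYPES[expose_external],
            selector={"app.kubernetes.io/name": app_name},  # assumed pod labels
            ports=[
                # 6446/6447 are MySQL Router's default read-write/read-only
                # ports, shown here purely for illustration.
                ServicePort(name="mysql-rw", port=6446, targetPort=6446),
                ServicePort(name="mysql-ro", port=6447, targetPort=6447),
            ],
        ),
    )
    # Server-side apply is idempotent, so re-running it on config-changed is safe.
    Client().apply(service, field_manager=app_name)
```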

Testing

theoctober19th commented 2 weeks ago

Great work @shayancanonical!

I'm about to implement a similar feature in kyuubi-k8s, and while going through this PR I had a few queries. Looking at the spec, my understanding is that whenever the service needs to be created or deleted ("reconciled", in other words), the config-changed hook simply returns after kubectl.apply and does not wait for the service to be fully created.

When do we check whether the "reconciliation" was successful? In the PR I saw this being checked in the config-changed hook itself, but does that mean the endpoints would not be updated until the config changes again at some point in the future? Shouldn't we check whether reconciliation was successful more frequently than that (possibly in other hooks, as well as the update-status hook)?

shayancanonical commented 2 weeks ago

@theoctober19th Usually kubectl.apply (or lightkube.apply) can take a long time (normally about 5-10 minutes) to create a service of type LoadBalancer on various cloud providers, as the cloud provider needs to provision certain resources. As a result, the following is the intended behavior for this PR (the code as it currently exists will likely need to be modified to accomplish this):

  1. Upon config-changed, we will make a request to K8s to create the service of the desired type, if necessary. We will likely set the charm to MaintenanceStatus and return from the hook so that other hooks can run.
  2. Upon a future hook, the reconcile approach of this charm allows us to check whether the K8s service created in (1) is reachable. If it is reachable, we will update the endpoints in the relation databag.
  3. While the K8s service is being created, we will avoid touching the databag at all. If the K8s service is being created for the first time, the endpoints in the databag will be empty; if not, the endpoints will remain stale until the new K8s service is reachable. (We can optimize here by using Pebble notices for more frequent checks, or wait until the next hook, at most update-status-hook-interval away.)
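
Roughly, that flow could look like the sketch below. The class, the helper names (_apply_service, _update_relation_endpoints), and the service name are placeholders rather than the code in this PR, and the reachability check shown is only one possible implementation:

```python
# Hypothetical sketch of the intended reconcile flow, not the PR's actual code.
import lightkube
from lightkube.resources.core_v1 import Service
from ops.charm import CharmBase
from ops.model import ActiveStatus, MaintenanceStatus


class MySQLRouterK8sCharm(CharmBase):  # simplified stand-in for the real charm class
    SERVICE_NAME = "mysql-router-k8s-external"  # hypothetical service name

    def _on_config_changed(self, event) -> None:
        # (1) Request a Service of the desired type and return immediately so
        # other hooks can run; LoadBalancer provisioning may take 5-10 minutes.
        self._apply_service(self.config["expose-external"])  # hypothetical lightkube helper
        self.unit.status = MaintenanceStatus("waiting for K8s service")

    def _reconcile_endpoints(self, event) -> None:
        # (2) Called from any subsequent hook (at most update-status-hook-interval
        # apart): publish endpoints only once the Service is actually reachable.
        service = lightkube.Client().get(
            Service, name=self.SERVICE_NAME, namespace=self.model.name
        )
        if not self._service_ready(service):
            # (3) Do not touch the relation databag while provisioning: endpoints
            # stay empty (first creation) or stale until the new service is ready.
            return
        self._update_relation_endpoints(service)  # hypothetical databag update
        self.unit.status = ActiveStatus()

    @staticmethod
    def _service_ready(service: Service) -> bool:
        if service.spec.type != "LoadBalancer":
            return True  # ClusterIP / NodePort are usable as soon as they exist
        ingress = service.status.loadBalancer.ingress if service.status else None
        return bool(ingress)  # the cloud provider has assigned an IP/hostname
```

Pebble notices or update-status would both simply end up calling _reconcile_endpoints here; the only difference is how soon after the cloud provider finishes provisioning the check runs.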
theoctober19th commented 2 weeks ago

Thanks for the explanation, @shayancanonical.

We discussed this in our team, and thanks to @welpaolo we considered an additional possibility: setting a flag to false in the peer relation databag whenever a service is being created or deleted, which would then trigger a peer-relation-changed event. In that event we check the availability of the service and either a) reset the flag and update the endpoints, or b) defer the peer-relation-changed event and repeat this process when the deferred hook fires later.

This can also be combined with checking the status of the service in other event hooks (including the update-status hook). This effectively checks service availability during either peer-relation-changed or the other event hooks, whichever fires earlier.
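
For illustration, a minimal sketch of that idea, assuming a peer relation named "peers" and placeholder helpers (_apply_service, _service_ready, _update_relation_endpoints); this is not actual kyuubi-k8s or mysql-router-k8s code:

```python
# A minimal sketch of the flag-and-defer idea, assuming a peer relation named
# "peers"; _apply_service, _service_ready and _update_relation_endpoints are
# placeholders, not real charm code.
from ops.charm import CharmBase
from ops.model import ActiveStatus, MaintenanceStatus


class ExposedK8sCharm(CharmBase):  # hypothetical example charm
    PEER_RELATION = "peers"  # assumed peer relation name

    def _on_config_changed(self, event) -> None:
        self._apply_service(self.config["expose-external"])  # hypothetical lightkube helper
        peers = self.model.get_relation(self.PEER_RELATION)
        if peers and self.unit.is_leader():
            # Flip the flag in the peer app databag (leader-only write); per the
            # idea above, this triggers peer-relation-changed where we reconcile.
            peers.data[self.app]["service-ready"] = "false"
        self.unit.status = MaintenanceStatus("waiting for external service")

    def _on_peer_relation_changed(self, event) -> None:
        if not self.unit.is_leader():
            return  # only the leader owns the flag and the published endpoints
        peers = self.model.get_relation(self.PEER_RELATION)
        if not peers or peers.data[self.app].get("service-ready") == "true":
            return  # nothing pending
        if not self._service_ready():  # hypothetical reachability check
            # (b) Not reachable yet: defer and repeat when the deferred event
            # is re-emitted (or when another hook runs the same check).
            event.defer()
            return
        # (a) Reachable: reset the flag and publish the endpoints.
        peers.data[self.app]["service-ready"] = "true"
        self._update_relation_endpoints()  # hypothetical databag update
        self.unit.status = ActiveStatus()
```

Combining this with the same check in other hooks would just mean those hooks share the reachability/update logic with _on_peer_relation_changed, so whichever fires first publishes the endpoints.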