canonical / kserve-operators

Charmed KServe
kserve-controller fails to be removed #131

Open DnPlas opened 1 year ago

DnPlas commented 1 year ago

Removing kserve-controller raises an error because the remove handler expects the istio-pilot:gateway-info relation to be present. To delete its Kubernetes resources, the charm first renders their manifests, and the context for rendering those resources is tightly coupled to the presence of certain relations. Once the relation is gone, rendering fails:

Traceback (most recent call last):
  File "./src/charm.py", line 467, in <module>
    main(KServeControllerCharm)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 354, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 830, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 919, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 284, in _on_remove
    cm_resources_manifests = self.cm_resource_handler.render_manifests()
  File "./src/charm.py", line 159, in cm_resource_handler
    context=self._inference_service_context,
  File "./src/charm.py", line 136, in _inference_service_context
    gateways_context = self._generate_gateways_context()
  File "./src/charm.py", line 348, in _generate_gateways_context
    raise ErrorWithStatus("Please relate to istio-pilot:gateway-info", BlockedStatus)
charmed_kubeflow_chisme.exceptions._with_status.ErrorWithStatus: Please relate to istio-pilot:gateway-info
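
The traceback shows the chain: the remove handler renders manifests, rendering needs the gateway context, and building that context raises once the relation is gone. Below is a minimal sketch of one way to decouple the two, using only the standard library. The names `MissingRelationError`, `gateway_context`, and `removal_context`, as well as the context and relation-data keys, are hypothetical stand-ins and not the charm's actual code. It also assumes that the gateway values only affect the content of the rendered resources, not the names and namespaces needed to delete them.

```python
from typing import Optional


class MissingRelationError(Exception):
    """Hypothetical stand-in for the ErrorWithStatus raised when a relation is absent."""


def gateway_context(relation_data: Optional[dict]) -> dict:
    """Build the gateway part of the render context; fails without relation data."""
    if not relation_data:
        raise MissingRelationError("Please relate to istio-pilot:gateway-info")
    return {
        "ingress_gateway_name": relation_data["gateway_name"],
        "ingress_gateway_namespace": relation_data["gateway_namespace"],
    }


def removal_context(relation_data: Optional[dict]) -> dict:
    """Build a context for the remove path that tolerates a missing relation.

    Assumption: deleting a resource only needs its kind, name, and namespace,
    so placeholder gateway values are acceptable when the relation is gone.
    """
    try:
        return gateway_context(relation_data)
    except MissingRelationError:
        return {"ingress_gateway_name": "", "ingress_gateway_namespace": ""}
```

With this split, the regular reconcile path could keep calling `gateway_context` and go to a blocked status on failure, while the remove path would call `removal_context` and proceed with cleanup.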

Steps to reproduce

  1. Deploy kserve-controller: juju deploy kserve-controller --channel 0.11/stable --trust or juju deploy kserve-controller --channel latest/edge --trust
  2. Deploy Istio: juju deploy istio-pilot --channel 1.17/stable --trust and juju deploy istio-gateway istio-ingressgateway --channel 1.17/stable --trust --config kind=ingress
  3. Deploy Knative: juju deploy knative-serving --channel 1.10/stable --trust and juju deploy knative-operator --channel 1.10/stable --trust
  4. Configure knative-serving: juju config knative-serving namespace="knative-serving" istio.gateway.namespace=kubeflow istio.gateway.name=istio-gateway
  5. Relate the deployed charms as needed
  6. Once everything has settled, remove the application: juju remove-application kserve-controller

Environment

microk8s 1.29-strict/stable
microk8s addons: dns hostpath-storage metallb:10.64.140.43-10.64.140.49
juju 3.4/stable (3.4.4)

DnPlas commented 5 days ago

Even after the refactoring introduced in #246 and #197, this issue is still present:

unit-kserve-controller-0: 20:00:30 INFO unit.kserve-controller/0.juju-log ingress-gateway:32: Reconcile completed successfully
unit-kserve-controller-0: 20:00:30 ERROR unit.kserve-controller/0.juju-log ingress-gateway:32: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 638, in <module>
    main(KServeControllerCharm)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 342, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 839, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 928, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 432, in _on_event
    self.cm_resource_handler.apply()
  File "./src/charm.py", line 212, in cm_resource_handler
    context={**self._inference_service_context, **self.images_context},
  File "./src/charm.py", line 189, in _inference_service_context
    gateways_context = self._generate_gateways_context()
  File "./src/charm.py", line 517, in _generate_gateways_context
    ingress_gateway_info = self._ingress_gateway_info
  File "./src/charm.py", line 256, in _ingress_gateway_info
    return self._ingress_gateway_requirer.get_relation_data()
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/lib/charms/istio_pilot/v0/istio_gateway_info.py", line 206, in get_relation_data
    self._relation_preflight_checks(relation=relation)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/lib/charms/istio_pilot/v0/istio_gateway_info.py", line 183, in _relation_preflight_checks
    relation_data = relation.data[remote_app]
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/model.py", line 1480, in __getitem__
    raise KeyError(
KeyError: 'Cannot index relation data with "None". Are you trying to access remote app data during a relation-broken event? This is not allowed.'
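
The KeyError in this second traceback comes from indexing relation data with a remote app of `None`, which the error message ties to a relation-broken event. Below is a minimal sketch of a guard around that lookup. It uses a stand-in class rather than the real `ops` objects, and `FakeRelation`, `read_gateway_info`, and the "return None when unreadable" convention are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeRelation:
    """Stand-in for a relation object: `app` is None while the relation is breaking."""

    app: Optional[str]
    data: dict = field(default_factory=dict)


def read_gateway_info(relation: Optional[FakeRelation]) -> Optional[dict]:
    """Return a copy of the remote app's data, or None when it cannot be read safely."""
    if relation is None or relation.app is None:
        # No relation, or relation-broken in progress: skip the lookup
        # instead of indexing relation.data with None.
        return None
    return dict(relation.data.get(relation.app, {}))
```

A caller on the reconcile path could treat a `None` result as "gateway info unavailable" and set a blocked or waiting status instead of letting the exception escape the hook.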

syncronize-issues-to-jira[bot] commented 5 days ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5962.

This message was autogenerated