When the ML extension is installed with ClusterIP set for inferenceRouterServiceType, the OnlineEndpoint seems to be building the scoring URI with the azureml-fe service's cluster IP, which is completely inaccessible outside the cluster if you use kubenet. This breaks the "Test" functionality on an Azure ML endpoint and the "az ml online-endpoint invoke" command, both of which try to use that internal ClusterIP.
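For context, the extension was installed roughly like this (cluster and resource names are placeholders, but the inferenceRouterServiceType setting is the relevant part):

```shell
az k8s-extension create \
  --name azureml \
  --extension-type Microsoft.AzureML.Kubernetes \
  --cluster-type managedClusters \
  --cluster-name <cluster-name> \
  --resource-group <cluster-rg> \
  --scope cluster \
  --config enableInference=True inferenceRouterServiceType=ClusterIP allowInsecureConnections=True
```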
$ az ml online-endpoint invoke --name <endpoint-name> \
--resource-group <workspace-rg> \
--workspace-name <workspace-name> \
--deployment-name "green" \
--request-file "sample_request.json"
cli.azure.cli.core.azclierror: (<urllib3.connection.HTTPConnection object at 0x7fc1dfebbee0>, 'Connection to 10.0.227.236 timed out. (connect timeout=300)')
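In the meantime, the only workaround I can see is to rewrite the reported scoring URI by hand to point at my ingress and call it directly. A rough sketch of what I mean (the IPs, endpoint name, and URL path here are placeholders I made up, not anything Azure ML hands out):

```shell
# Swap the internal ClusterIP in the reported scoring URI for the ingress
# address before invoking it. All values below are hypothetical.
scoring_uri="http://10.0.227.236/api/v1/endpoint/my-endpoint/score"  # what Azure ML reports
ingress_ip="20.50.60.70"                                             # my Istio ingress (assumed)
fixed_uri="${scoring_uri/10.0.227.236/$ingress_ip}"
echo "$fixed_uri"

# Then invoke the rewritten URI directly with the endpoint key, e.g.:
# key=$(az ml online-endpoint get-credentials --name my-endpoint \
#         --resource-group <workspace-rg> --workspace-name <workspace-name> \
#         --query primaryKey -o tsv)
# curl -H "Authorization: Bearer $key" -H "Content-Type: application/json" \
#      -d @sample_request.json "$fixed_uri"
```

That works, but it defeats the point of the portal "Test" tab and the invoke command.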
[Screenshot: Azure ML Endpoint Details]
Is there no way to set or override this URI? I dug through the documentation, but the only relevant text I could find regarding the ClusterIP and NodePort configurations is here, under Key considerations for AzureML extension deployment:
Type NodePort. Exposes azureml-fe on each node's IP at a static port. You'll be able to contact azureml-fe from outside the cluster by requesting <NodeIP>:<NodePort>. Using NodePort also allows you to set up your own load balancing solution and SSL termination for azureml-fe.
Type ClusterIP. Exposes azureml-fe on a cluster-internal IP, making azureml-fe reachable only from within the cluster. For azureml-fe to serve inference requests coming from outside the cluster, it requires you to set up your own load balancing solution and SSL termination for azureml-fe.
I've configured an Istio ingress controller to pass external requests along to the azureml-fe service as my own load balancing solution, and that is working fine.
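For completeness, my routing setup is essentially the following (Gateway/VirtualService names, hosts, and the port number are specific to my cluster and may differ elsewhere):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: azureml-fe-gateway
  namespace: azureml
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: azureml-fe
  namespace: azureml
spec:
  hosts:
    - "*"
  gateways:
    - azureml-fe-gateway
  http:
    - route:
        - destination:
            host: azureml-fe.azureml.svc.cluster.local
            port:
              number: 80
```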
The rest of the documentation is written specifically for the LoadBalancer type and ignores these other options. I tried setting a "scoring_uri" parameter in my endpoint.yaml, but it was ignored. I also dug through the configmaps in the azureml namespace but saw no reference to it, so I assume the URI is pulled from the azureml/azureml-fe service directly. Is there really no way to override this? I want it to use the IP address I've configured on the ingress, which is actually reachable, not the internal cluster IP. I find it hard to believe that exposing inference endpoints externally without breaking the "Test" functionality and the invoke command hasn't already been accounted for, but I can't seem to track down a solution.