sudivate closed this issue 2 years ago.
Can you give an example of how you'd expect to send an API request to your endpoint using e.g. the `requests` library? I think we can work from there and see how we can integrate that option with the KFP client.
Right now, there is quite a lot of technical debt in this area around supporting different auth methods. If you're interested, we'd also welcome discussion on how to structure this in a better way.
When I try to consume the REST API directly using a Bearer access token generated with the client-credentials grant type, it still redirects to the authorization endpoint, forcing interactive login.
```python
url = 'https://host/pipeline/apis/v1beta1/pipelines'
header = {'Authorization': 'Bearer ' + token}
response = requests.get(url, headers=header, verify=False)
```

The above code redirects to the authorization endpoint:

```
https://login.microsoftonline.com/<tenant_id>/v2.0/authorize?client_id=<client_id>
  &redirect_uri=https%3A%2F%2F<host>%2Flogin%2Foidc
  &response_type=code
  &scope=profile+email+openid
  &state=<xxxxx>
```
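For what it's worth, you can decode that redirect with the stdlib `urllib.parse` to confirm what the auth service is demanding; the `response_type=code` parameter is what marks it as the interactive authorization-code flow (placeholders left as-is):

```python
from urllib.parse import urlsplit, parse_qs

# The authorize URL from the redirect above, with its placeholders intact.
redirect = ("https://login.microsoftonline.com/<tenant_id>/v2.0/authorize"
            "?client_id=<client_id>&redirect_uri=https%3A%2F%2F<host>%2Flogin%2Foidc"
            "&response_type=code&scope=profile+email+openid&state=<xxxxx>")

params = parse_qs(urlsplit(redirect).query)
print(params["response_type"])  # ['code'] -> authorization-code flow
print(params["redirect_uri"])   # ['https://<host>/login/oidc'] (percent-decoded)
```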
@yanniszark what type of token can I use for the REST API to skip interactive login? If that works, we can have optional SSL verification on the client.
@Bobgy The OIDC auth service supports only the authorization-code flow, which is mostly used in browser-based interactive login. In order to consume the client SDK, we will have to enable the client-credentials flow on the auth service. This will allow non-interactive login and enable programmatic access to all APIs.
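For context, the client-credentials grant is a single POST to the tenant's v2.0 token endpoint, with no browser involved. A minimal sketch of the request a non-interactive client would send (the tenant, client, and scope values here are hypothetical):

```python
def build_client_credentials_request(tenant_id, client_id, client_secret, scope):
    """Build the POST target and form body for an AAD v2.0 client-credentials grant."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    data = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,  # e.g. "api://<app-id>/.default"
    }
    return url, data

url, data = build_client_credentials_request(
    "my-tenant", "my-client-id", "my-secret", "api://my-app/.default"
)
print(url)  # https://login.microsoftonline.com/my-tenant/oauth2/v2.0/token
```

The resulting access token could then be passed to the API in an `Authorization: Bearer` header, which is exactly what the auth service would need to accept for programmatic access.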
@sudivate I think you can decide how you want to configure Kubeflow endpoint auth for Azure. This isn't a decision KFP needs to make. Once you have configured auth the way you like, we welcome contributions to make the KFP SDK work with it.
@sudivate seems like you found a workaround by manually grabbing the browser session cookie and passing it into the KFP client.
If you are not looking into any other changes related to this, do you want to close this issue for now?
I am also facing a similar issue on AKS; the details follow.

Error:

```
MaxRetryError: HTTPSConnectionPool(host='x.x.x.x', port=443): Max retries exceeded with url: /apis/v1beta1/healthz (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)'),))
```
KF Details
I can run the test pipeline, but when I try to create a pipeline from the KFP client it gives this error.
@Junaid-Ahmed94 only Authorization Code Flow is supported in this scenario. https://openid.net/specs/openid-connect-basic-1_0.html#CodeFlow
@berndverst I am following the documentation, but I believe the issue is caused by the self-signed certificate. A curl command showed the error much more clearly.
```
curl -H "X-Auth-Token: <Session_Cookie>" "https://xx.xx.xx.xx/pipeline/"
```

But setting the `-k` flag returns results as desired. I will test with a proper certificate and then update here with the outcome.
I agree with the others here. This isn't an AKS or cloud issue. The Kubeflow docs instruct you to use certmanager to create a self signed certificate. But obviously browsers and curl can't verify the identity, so you just ignore/suppress that.
The issue is that the `kfp.Client` class doesn't allow you to pass `verify=False` through to the underlying `requests` library, so you can't ignore the non-verifiable certificate. Therefore you can't use `kfp.Client` on clusters that have been set up following the standard KF docs.
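Setting the kfp SDK aside, you can at least confirm that the server itself responds once verification is skipped, using only the Python stdlib. This is a hedged sketch of the equivalent of `curl -k`, insecure by design and only for probing a cluster that still uses the self-signed cert from the docs; the host and token placeholders are hypothetical:

```python
import ssl

# Equivalent of `curl -k`: an SSL context that skips certificate checks.
# check_hostname must be disabled before verify_mode is relaxed.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# The probe itself needs a reachable cluster, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "https://<host>/pipeline/apis/v1beta1/healthz",
#     headers={"Authorization": "Bearer <token>"},
# )
# urllib.request.urlopen(req, context=ctx)  # skips cert verification
```

If that probe succeeds while the normal context fails, it confirms the only problem is the untrusted certificate, not auth or networking.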
I was able to solve this by using a signed certificate, not self-signed but authority-signed (you can use Let's Encrypt).
I used the above approach and it worked for me.
Hi,
I am trying to make the following code work. My goal is to get a KFP client working with Azure. When I execute it, I get the error `urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='104.42.221.31', port=443): Max retries exceeded with url: /pipeline/apis/v1beta1/healthz (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:748)'),))`. I tried following the steps here: https://www.kubeflow.org/docs/distributions/azure/authentication-oidc/. When I execute the code again, the same error happens. When I try to log in to the Kubeflow dashboard, the error "Missing url parameter: code" appears. What am I missing?
```python
import argparse

import adal
import kfp


def get_access_token(tenant, clientId, client_secret):
    authorityHostUrl = "https://login.microsoftonline.com"
    GRAPH_RESOURCE = "00000002-0000-0000-c000-000000000000"
    authority_url = authorityHostUrl + "/" + tenant
    context = adal.AuthenticationContext(authority_url)
    token = context.acquire_token_with_client_credentials(
        GRAPH_RESOURCE, clientId, client_secret
    )
    return token["accessToken"]


def main():
    parser = argparse.ArgumentParser("run pipeline")
    parser.add_argument(
        "--kfp_host",
        type=str,
        required=True,
        help="KFP endpoint",
    )
    parser.add_argument("--tenant", type=str, required=True, help="Tenant")
    parser.add_argument(
        "--service_principal", type=str, required=True, help="Service Principal"
    )
    parser.add_argument(
        "--sp_secret", type=str, required=True, help="Service Principal Secret"
    )
    args = parser.parse_args()

    token = get_access_token(args.tenant, args.service_principal, args.sp_secret)
    client = kfp.Client(host=args.kfp_host, existing_token=token)
    pipelines = client.list_pipelines()
    print(pipelines)


if __name__ == "__main__":
    main()
```
@pablofiumara that sounds like the issue others are talking about: the Kubeflow client can't verify the self-signed certificate on the server. You need to replace these certs, as @Junaid-Ahmed94 said.
@berndverst Thanks. It seems Kubeflow 1.3 on Azure + AAD is in progress (here's a screenshot https://ibb.co/5FmP7bF). Is that correct?
@Junaid-Ahmed94 Can you give us some details about points 1, 2 and 3, please?
Regarding point 1, I am trying this: https://github.com/mspnp/letsencrypt-pip-cert-generation/blob/main/README.md
Did you do something like that? Thanks in advance
Not sure if this is still valid for the current version of the KF manifests. I tried enabling access-token-based authentication for the KFP client. Please feel free to leverage this work if it applies to your scenario.
@sudivate Thank you. I will take a look
I would like to add more information while I take a look at that. The only approach that let me use Kubeflow client with AAD was the following: https://github.com/kaizentm/kubemlops/blob/master/docs/Kubeflow-install.md#option-1-install-standalone-kubeflow-pipelines
Problem is I need a full Kubeflow installation working on Azure (not just Kubeflow pipelines)
@pablofiumara I did try Let's Encrypt, but to save time and get to production I asked our IT team to provide me a valid certificate. As I mentioned in my earlier comment, Let's Encrypt should also achieve the same. In the end, we just want some authority-signed certificate to provide TLS security for our website/URL.
I took a look at the site you mentioned and it seems to do what we need. But you can also use the Kubeflow-provided kustomize manifests for Let's Encrypt: https://github.com/kubeflow/manifests/tree/v1.2-branch/cert-manager/cert-manager/overlays/letsencrypt
You just have to make sure you update things in the proper places to actually make use of this. E.g. one such place is https://github.com/kubeflow/manifests/blob/v1.2-branch/stacks/azure/application/cert-manager/kustomization.yaml#L11; there you can see it is pointing to the self-signed certificate, and you have to update it to point to the Let's Encrypt manifests. There are a few more places, but hardly 2 or 3, where changes need to be made to use Let's Encrypt.
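As an illustration, repointing a kustomization from one overlay to the other is a one-line edit that can be scripted. The file contents and paths below are illustrative only; check your actual `kustomization.yaml` for the real resource paths:

```shell
# Sketch: repoint a kustomization resource from the self-signed overlay
# to the letsencrypt overlay (paths are illustrative, not authoritative).
cat > kustomization.yaml <<'EOF'
resources:
- ../../../../cert-manager/cert-manager/overlays/self-signed
EOF

sed -i.bak 's#overlays/self-signed#overlays/letsencrypt#' kustomization.yaml
cat kustomization.yaml
```

`sed -i.bak` keeps a backup of the original file, which is handy when you are editing several of these pointers across the manifests.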
But I suggest you move to Kubeflow 1.3 if possible; you will get 2 major benefits straight away.
@Junaid-Ahmed94 Thank you. I will try that and keep you all updated
@Junaid-Ahmed94 Your suggestion worked, thank you. A secret named letsencrypt-prod-secret and a clusterissuer named letsencrypt-prod were created. However, SSL does not work yet. I think it's because something different needs to be done compared to point 1 from here: https://www.kubeflow.org/docs/distributions/azure/authentication-oidc/#expose-kubeflow-securely-over-https
Is that correct? If so, how should I reference letsencrypt-prod-secret from the gateway?
Thanks in advance
@pablofiumara you should also change the cluster issuer name in your certificate.yaml file. Have you updated that too, to point to the new cluster issuer?
```yaml
issuerRef:
  kind: ClusterIssuer
  name: kubeflow-self-signing-issuer
```
@Junaid-Ahmed94 Thanks for your fast response. Yes, I did. Here's my certificate:
```
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: istio-ingressgateway-certs
  namespace: istio-system
spec:
  commonName: istio-ingressgateway.istio-system.svc
  ipAddresses:
  - myIpAddress
  isCA: true
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-prod
  secretName: letsencrypt-prod-secret
EOF
```
The istio-ingressgateway pod can't start. Here's the log:

```
Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
```

I googled that error but I still can't understand how to solve this. Deleting the pod didn't help.
What should I write below `tls`? I don't think those two paths at the end are correct if we use Let's Encrypt.
```yaml
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
    # Upgrade HTTP to HTTPS
    tls:
      httpsRedirect: true
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      privateKey: /etc/istio/ingressgateway-certs/tls.key
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
```
@Junaid-Ahmed94 @sudivate Suppose I have a valid SSL certificate. Then I create a Kubernetes secret for it. How can I attach that to a Kubeflow dashboard on Azure?
@pablofiumara the certificate in https://github.com/kubeflow/pipelines/issues/4569#issuecomment-850628491 looks fine. Before applying the certificate in https://github.com/kubeflow/pipelines/issues/4569#issuecomment-850633203, did you remove the already existing certificate? And are you still getting the same error as in https://github.com/kubeflow/pipelines/issues/4569#issuecomment-841340473?
@Junaid-Ahmed94 Thanks for your answer. I tried removing the already existing certificate (Kubernetes secret), but it was generated automatically again. I don't know if I am still getting the same error, because first I would like to have SSL working.
Right now I am getting "This site can't be reached" when I try to log into the Kubeflow dashboard. It seems to be a problem related to AAD. It can't get to https://oneAzureIp/login/oidc?code=oneCode&session_state=oneState
This is my gateway right now. What should I do?
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"networking.istio.io/v1alpha3","kind":"Gateway","metadata":{"annotations":{},"name":"kubeflow-gateway","namespace":"kubeflow"},"spec":{"selector":{"istio":"ingressgateway"},"servers":[{"hosts":["*"],"port":{"name":"http","number":80,"protocol":"HTTP"}}]}}
  creationTimestamp: "2021-06-01T15:59:12Z"
  generation: 15
  name: kubeflow-gateway
  namespace: kubeflow
  resourceVersion: "140031"
  selfLink: /apis/networking.istio.io/v1alpha3/namespaces/kubeflow/gateways/kubeflow-gateway
  uid: 3fead9d1-d722-4f68-b1d0-9dcbd0b670c3
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - kubeflowDashboardIp
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      credentialName: ingress-cert
      mode: SIMPLE
```
Where did you get this credentialName? I am assuming you created it yourself following something like this documentation: https://istio.io/latest/docs/tasks/traffic-management/ingress/secure-ingress/#configure-a-tls-ingress-gateway-for-multiple-hosts
Can you try the following and be sure that the steps were executed in order?
Update the gateway file; https://github.com/kubeflow/pipelines/issues/4569#issuecomment-850633203 should be fine.
Steps 3 and 4 can be skipped if Let's Encrypt is not used and you have created your certificate yourself. For such a certificate you have to create the secret yourself.
Once you have successfully executed the above, things should hopefully work fine. The error you might then face with KFP could be client/server certificate authentication, which needs the root certificate. I have already mentioned this in my previous comment: https://github.com/kubeflow/pipelines/issues/4569#issuecomment-805787819
@Junaid-Ahmed94 Thanks again.
I got credentialName from here (point 3): https://istio.io/latest/docs/tasks/traffic-management/ingress/secure-ingress/#configure-a-tls-ingress-gateway-for-a-single-host
After doing every step, this error appears after executing `kubectl logs istio-ingressgateway-XXXXXXXXX -n istio-system`:

```
2021-06-02T13:18:33.129866Z warning envoy config gRPC config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) 0.0.0.0_8443: Invalid path: /etc/istio/ingressgateway-certs/tls.crt
```
@DavidSpek Do you know about this?
@pablofiumara Just to be clear, is this about the Istio setup and how to handle certificates for Kubeflow?
@DavidSpek Thanks for your answer. Yes, that's right. The goal is to have SSL (https, using Let's Encrypt certificate) working for Kubeflow dashboard on Azure. Kubeflow dashboard Ip address ends with dns.westus.cloudapp.azure.com
For ArgoFlow we are using the Istio Operator to install Istio, and I've re-implemented authentication to improve security. Part of that is done by replacing the OIDC AuthService with OAuth2 Proxy, which is actively maintained by a large community and thus should be easier to set up with any number of providers (see here). This kind of makes Dex irrelevant (except for LDAP, which can also be done with Keycloak; I've integrated both).
Specifically for Azure we've only just started looking at it today, so I don't have the integrations for loadbalancers and Azure DNS implemented yet, but you can take a look at ArgoFlow-Azure if you're interested or want to help out.
The Istio spec we use for this auth setup is the following:
```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istio
spec:
  profile: default
  tag: 1.10.0 # istio/operator
  hub: docker.io/istio
  meshConfig:
    accessLogFile: /dev/stdout
    enablePrometheusMerge: true
    extensionProviders:
    - name: "oauth2-proxy"
      envoyExtAuthzHttp:
        service: "oauth2-proxy.auth.svc.cluster.local"
        port: "4180" # The default port used by oauth2-proxy.
        #includeHeadersInCheck: ["authorization", "cookie"] # headers sent to the oauth2-proxy in the check request.
        includeHeadersInCheck: # headers sent to the oauth2-proxy in the check request.
        # https://github.com/oauth2-proxy/oauth2-proxy/issues/350#issuecomment-576949334
        - "cookie"
        - "x-forwarded-access-token"
        - "x-forwarded-user"
        - "x-forwarded-email"
        - "authorization"
        - "x-forwarded-proto"
        - "proxy-authorization"
        - "user-agent"
        - "x-forwarded-host"
        - "from"
        - "x-forwarded-for"
        - "accept"
        headersToUpstreamOnAllow: ["authorization", "path", "x-auth-request-user", "x-auth-request-email", "x-auth-request-access-token", "x-auth-request-user-groups"] # headers sent to backend application when request is allowed.
        headersToDownstreamOnDeny: ["content-type", "set-cookie"] # headers sent back to the client when request is denied.
```
The AuthorizationPolicy that makes use of this is the following:
```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: istio-ingressgateway
  namespace: istio-system
spec:
  action: CUSTOM
  selector:
    # Same as the istio-ingressgateway Service selector
    matchLabels:
      app: istio-ingressgateway
      istio: ingressgateway
  provider:
    name: "oauth2-proxy"
  rules:
  - to:
    - operation:
        hosts:
        - <<__subdomain_dashboard__>>.<<__domain__>>
        - <<__subdomain_serving__>>.<<__domain__>>
```
The kubeflow gateway that is used is:
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: kubeflow-gateway
  namespace: kubeflow
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - <<__subdomain_dashboard__>>.<<__domain__>>
    port:
      name: http
      number: 80
      protocol: HTTP
    # Upgrade HTTP to HTTPS
    tls:
      httpsRedirect: true
  - hosts:
    - <<__subdomain_dashboard__>>.<<__domain__>>
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: kubeflow-ingressgateway-certs
```
And finally the certificate for cert-manager that is used is:
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: kubeflow-ingressgateway-certs
  namespace: istio-system
spec:
  secretName: kubeflow-ingressgateway-certs
  issuerRef:
    name: gateways-issuer
    kind: ClusterIssuer
  commonName: <<__subdomain_dashboard__>>.<<__domain__>>
  dnsNames:
  - <<__subdomain_dashboard__>>.<<__domain__>>
```
There are a few more manifests needed to get a fully working setup, but I'm not sure they are very relevant for this conversation. One important thing to note is that the setup mentioned in https://github.com/kubeflow/pipelines/issues/4569#issuecomment-850633203 should not be used, as that doesn't allow you to define a separate certificate for each Istio gateway. Another thing to be aware of is that you cannot use the same certificate secret for 2 gateways in Istio as this will result in a 404 error.
The placeholder values you see are used by the setup script in the repo, which does a find-and-replace using a setup.conf file. The idea is that you fork the repo, add your values in the setup.conf file, run the script, commit and push the changes to your fork, and then install everything with Argo CD, which points to your repo. This makes updates for all components easy (and automated with Renovate), allows version tracking for changes in the manifests, and avoids configuration drift.
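The substitution the setup script performs can be sketched in a few lines of Python; the function name and the sample snippet below are illustrative, not the actual script:

```python
def render_placeholders(text, values):
    """Replace <<__key__>> markers, as the setup script's find-and-replace does."""
    for key, value in values.items():
        text = text.replace("<<__%s__>>" % key, value)
    return text

rendered = render_placeholders(
    "- hosts:\n  - <<__subdomain_dashboard__>>.<<__domain__>>",
    {"subdomain_dashboard": "kubeflow", "domain": "example.com"},
)
print(rendered)  # - hosts:\n  - kubeflow.example.com
```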
That is probably a lot more information than you needed, but hopefully it helps. If you're still having problems or if you have other questions, you can always ping me or reach out to me on Slack. If anybody is interested in helping out with ArgoFlow-Azure that is always much appreciated. The AWS version should see a first stable release soon, so the work from there most likely just needs some porting to Azure.
Thanks
@eedorenko @sudivate @jotaylo Is this possible on Azure? If so, can you send me some documentation, please? https://github.com/kubeflow/pipelines/issues/4569#issuecomment-853309063
@pablofiumara The Kubeflow gateway and certificate setup is definitely possible on Azure; that will work anywhere. The Istio installation through the operator will also work, but it will probably need some extra configs to play nicely with the Azure load balancer. This was also the case for AWS and shouldn't be difficult to implement (a couple of annotations on the gateway and another 2 yaml files). I don't have access to Azure, but if you like I can help you debug this setup over Slack, as it is something I need to get working regardless. The ArgoFlow-Azure setup will potentially start being used by a fairly large entity that is running Kubeflow on Azure, so this would also need to be implemented for them.
@DavidSpek Thank you very much. Can we take a look together about how to set up SSL (using Let's Encrypt) on Azure for Kubeflow dashboard, please? If so, how can I contact you on Slack?
@pablofiumara Yeah for sure. If you are part of the Kubeflow slack you can find me as either DavidSpek or David van der Spek.
@DavidSpek Thanks. I have just sent you a message on Slack
@pablofiumara Have you successfully set up SSL (using Let's Encrypt) on Azure? I am facing the same problem.
@pwzhong Yes. I bought a domain on Azure and then installed this https://github.com/argoflow/argoflow-azure (cc @DavidSpek is the owner of that repository) Instructions https://github.com/kubeflow/kubeflow/issues/5976#issuecomment-861650631
Hello all, I am facing two challenges. I can't seem to change my OIDC provider from Dex to AAD; I have tried everything. I also tried to set up TLS using Let's Encrypt, but that didn't work: I couldn't fulfill the solver, even though I have a fully functional domain name from AWS Route53 and connected it to the DNS zone in Azure.
/kind bug
What steps did you take and what happened: Enabled authentication with Azure AD on AKS and installed Kubeflow with `kfctl_istio_dex.v1.1.0.yaml`, but skipping Dex from the manifest, as Azure AD is an OIDC provider. The load balancer is exposed over HTTPS with a TLS 1.3 self-signed cert. OIDC Auth Service configuration:
Issue: When using the KFP client to upload a pipeline (`client.pipeline_uploads.upload_pipeline()`) with the client config below, it throws an error.

```
client = kfp.Client(host='https://<LoadBalancer IP Address>/pipeline', existing_token=<token>)
```

Error:

```
HTTPSConnectionPool(host='', port=443): Max retries exceeded with url: /pipeline/apis/v1beta1/pipelines/upload?name=local_exp-6714175b-6d59-40d0-9019-5b4ee58dc483 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1076)')))
```
Is there a way to override cert verification?
Or: when using the KFP client to upload the pipeline (`client.pipeline_uploads.upload_pipeline()`) with the client config below, it redirects to a Google auth error.

```
client = kfp.Client(host='https://<LoadBalancer IP Address>/pipeline', client_id=<client_id>, other_client_id=<client_id>, other_client_secret=<application_secret>, namespace='kfauth')
```
Environment:
- v1.1.0
- kfctl_v1.1.0-0-g9a3621e_linux.tar.gz
- 1.0.1
- 3.6.8
- 1.0.1
- Azure Kubernetes Service
- 1.17.11
CC: @Bobgy