zilliztech / milvus-helm

Apache License 2.0

couldn't connect to milvus k8s ingress via tls #71

Open hive74 opened 6 months ago

hive74 commented 6 months ago

Hello, I'm running Milvus standalone in k8s. I have tls.crt and tls.key for my ingress DNS name and mounted them into the standalone pod via secretName: milvus-tls. The CA cert is also added to the standalone pod in /etc/ssl/certs. The certs are valid.

Versions: Python 3.10.12, protobuf 3.20.0, milvus-4.1.17 (Helm chart), grpcio-tools 1.53.0, Milvus CLI 0.4.2, pymilvus 2.3.4

Milvus TLS config:


  user.yaml: |
      serverPemPath: /tmp/tls.crt
      serverKeyPath: /tmp/tls.key
        tlsMode: 1

Ingress by default:

    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/listen-ports-ssl: '[19530]'
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  - host: k8s-milvus.example.com
      - backend:
            name: my-release-milvus
              number: 19530
        path: /
        pathType: Prefix
  - hosts:
    - k8s-milvus.example.com
    secretName: milvus-tls

I get 502 in the browser. When I try to connect from a Python script, I get Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED. The script calls connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem")

What I tried:

If I disable TLS on Milvus, drop the ingress annotation nginx.ingress.kubernetes.io/backend-protocol: GRPC, and keep TLS on the ingress, I get 404 in the browser (that's good) and CERTIFICATE_VERIFY_FAILED via the script.

If I connect via port 80 without milvus-tls, I get Handshake failed with fatal error SSL_ERROR_SSL: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER.

I tried mixing params like server_pem_path, ca_pem_path, client_pem_path, etc.

Without milvus-tls, in minikube with a port-forward to the standalone pod, it connects fine. Through the ingress it doesn't work either, even with the simple default Milvus ingress. Maybe that is the main problem.

All pods are running without errors in the logs. How can I connect to Milvus from a Python script? How do I fix the SSL error? I can't disable TLS on the ingress, but I can disable it on Milvus, if such a config (ingress TLS, Milvus no TLS) is possible.

haorenfsa commented 6 months ago

If you use tlsMode 1 for Milvus, the value of the ingress annotation nginx.ingress.kubernetes.io/backend-protocol should be GRPCS, not GRPC.

see https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#backend-protocol for more information.

haorenfsa commented 6 months ago

Or you can leave nginx.ingress.kubernetes.io/backend-protocol: GRPC and set tlsMode to 0 for Milvus. That way, the traffic from nginx to Milvus will be plaintext gRPC.
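The pairing described in the two replies above can be written down as a small lookup. This is only a sketch of the rule as stated in this thread (tlsMode 0 goes with GRPC, tlsMode 1 goes with GRPCS); the function names are hypothetical, and tlsMode 2 (mutual TLS) is deliberately left out because the thread does not cover it.

```python
# Annotation key used by ingress-nginx, as quoted in this thread.
ANNOTATION = "nginx.ingress.kubernetes.io/backend-protocol"


def backend_protocol(tls_mode: int) -> str:
    """Return the backend-protocol value that matches a Milvus tlsMode.

    Hypothetical helper: only the two modes discussed in the thread are mapped.
    """
    mapping = {
        0: "GRPC",   # Milvus serves plaintext gRPC behind nginx
        1: "GRPCS",  # Milvus serves one-way TLS, so nginx must speak TLS to it
    }
    if tls_mode not in mapping:
        raise ValueError(f"tlsMode {tls_mode} is not covered by this sketch")
    return mapping[tls_mode]


def ingress_annotation(tls_mode: int) -> dict:
    """Build the single annotation entry for the ingress metadata."""
    return {ANNOTATION: backend_protocol(tls_mode)}


if __name__ == "__main__":
    print(ingress_annotation(0))
    print(ingress_annotation(1))
```

A mismatch between the two settings (for example tlsMode 1 with GRPC) is consistent with the 502 responses reported earlier in the thread, since nginx and Milvus would then disagree about whether the backend hop is encrypted.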

hive74 commented 5 months ago

Now I'm trying the no-TLS config. values.yaml:

  user.yaml: |
     serverPemPath: /milvus/configs/cert/server.pem
     serverKeyPath: /milvus/configs/cert/server.key
       tlsMode: 0


apiVersion: networking.k8s.io/v1
kind: Ingress
  name: my-release-milvus
    helm.sh/chart: milvus-4.1.17
    app.kubernetes.io/name: milvus
    app.kubernetes.io/instance: my-release
    app.kubernetes.io/version: "2.3.8"
    app.kubernetes.io/managed-by: Helm
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
      name: my-release-milvus
        number: 19530
  - host: milvus.example.com
        - path: /
          pathType: Prefix
              name: my-release-milvus
                number: 19530

I get 404 in browser and

curl http://milvus.example.com:80
404 page not found

But when I run the script in VS Code with connections.connect("default", uri="http://milvus.example.com:80", user='root', password='Milvus'), I get:

Traceback (most recent call last):
  File "/home/volodinyk/milvus/createuser.py", line 20, in <module>
    connections.connect("default", uri="http://milvus.example.com:80", user='root', password='Milvus')
  File "/home/volodinyk/.local/lib/python3.10/site-packages/pymilvus/orm/connections.py", line 356, in connect
    connect_milvus(**kwargs, user=user, password=password, token=token, db_name=db_name)
  File "/home/volodinyk/.local/lib/python3.10/site-packages/pymilvus/orm/connections.py", line 302, in connect_milvus
  File "/home/volodinyk/.local/lib/python3.10/site-packages/pymilvus/client/grpc_handler.py", line 136, in _wait_for_channel_ready
    raise MilvusException(
pymilvus.exceptions.MilvusException: <MilvusException: (code=2, message=Fail connecting to server on milvus.example.com:80. Timeout)>

The script only works when I port-forward the service; then the connection is fine: kubectl port-forward service/my-release-milvus -n milvus 27017:19530

haorenfsa commented 5 months ago

The annotation nginx.ingress.kubernetes.io/backend-protocol: GRPC is necessary.

hive74 commented 5 months ago

> The annotation nginx.ingress.kubernetes.io/backend-protocol: GRPC is necessary.

I added it to my no-TLS config (previous message). Now I get 502 in the browser and with curl, and the script fails too:

import time
import numpy as np
from pymilvus import (
    connections, db,
    FieldSchema, CollectionSchema, DataType,
    Collection, Role
)

fmt = "\n=== {:30} ===\n"
search_latency_fmt = "search latency = {:.4f}s"
num_entities, dim = 3000, 8

print(fmt.format("start connecting to Milvus"))
connections.connect("default", uri="http://milvus.example.com:80", user='root', password='Milvus')

(code=2, message=Fail connecting to server on milvus.example.com:80. Timeout)>

haorenfsa commented 5 months ago

@hive74 It looks like frontend TLS is necessary for nginx ingress to proxy gRPC.

Update your ingress like below:

    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
  - host: k8s-milvus.example.com
      - backend:
            name: my-release-milvus
              number: 19530
        path: /
        pathType: ImplementationSpecific
  - hosts:
    - k8s-milvus.example.com
    secretName: milvus-tls

Still, keep tlsMode=0 on the backend Milvus.

Then connect with connections.connect("default", uri="https://k8s-milvus.example.com:443", user='root', password='Milvus', secure=True)

hive74 commented 5 months ago

Thanks for the reply, but I again get 502 in the browser, and via the script: Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED <MilvusException: (code=2, message=Fail connecting to server on milvus.example.com:443. Timeout)>

What I did in local minikube:

I also did it in a full k8s cluster with existing valid certs, and the result is the same (502 in the browser, cert error via the script).

Where is my mistake? Is it OK to get 502 in the browser?

haorenfsa commented 5 months ago

Hmm, I'm going to need more information to be sure what's going on. Could you run curl -v https://milvus.example.com:443 and paste the output here?

haorenfsa commented 5 months ago

Oh, I know what's going on.

You're using a private certificate that you signed yourself, so you need to pass server.pem when connecting.

Try this: connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/cert/server.pem")

hive74 commented 5 months ago

> em, I'm gonna need more information to be sure what's going on. Could you try use command curl -v https://milvus.example.com:443 and paste the output here?

Output from several attempts: https://pastebin.com/XH8ab2fX

> Oh, I know what's going on.
>
> You're using the private certificate signed by yourself. So you need to add the server.pem when connecting.
>
> Try this: connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/cert/server.pem")

connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/cert/server.pem")

Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED. 
(code=2, message=Fail connecting to server on milvus.example.com:443. Timeout)>

I want to note that the ingress TLS certs were generated by gen.sh, and I try to connect with those *.pem files. But user.yaml uses the default certs that Milvus generates before start, with tlsMode: 0. Can that affect anything? Are any other certs needed?

  user.yaml: |
     serverPemPath: /milvus/configs/cert/server.pem
     serverKeyPath: /milvus/configs/cert/server.key
       tlsMode: 0

hive74 commented 5 months ago

curl -v -k https://milvus.example.com:443 returns 502: https://pastebin.com/spaL9X1g

haorenfsa commented 5 months ago

> Want to note, that ingress tls certs are generated by gen.sh and i try connect with them *.pem. But in user.yaml using defaults certs generating by milvus before start and tlsMode: 0, can it affect? may be need any certs?

You don't need to enable TLS on Milvus. TLS terminates at the nginx ingress, and nginx communicates with the backend Milvus using plaintext gRPC. So no worries about this.

haorenfsa commented 5 months ago

Alter it a bit:

connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem", server_name="milvus.example.com")

haorenfsa commented 5 months ago

@XuanYang-cn These configurations are indeed very confusing.... For now server_pem_path is not working without server_name. This seems to be a bug.

hive74 commented 5 months ago

> alter a bit:
>
> connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem", server_name="milvus.example.com")

I ran connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem", server_name="milvus.example.com")

and got Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.

I get these certs after:

chmod +x gen.sh
/milvus/cert$ ls 
ca.key  ca.pem  client.csr  client.key  client.pem  gen.sh  openssl.cnf  server.csr  server.key  server.pem

I use server.pem and server.key as the TLS secret for the ingress: kubectl create secret tls milvus-tls -n milvus --key="/home/testuser/milvus/cert/server.key" --cert="/home/testuser/milvus/cert/server.pem"

  - hosts:
    - milvus.example.com
    secretName: milvus-tls

hive74 commented 5 months ago

I GOT IT!! On the k8s cluster (not minikube), using the CA bundle from my Ubuntu machine at /etc/ssl/certs/ca-certificates.crt:

connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/etc/ssl/certs/ca-certificates.crt", server_name="k8s-milvus.example.com")

The connection works!

The problem was the missing server_name="k8s-milvus.example.com".
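For readers landing here with the same error, the working combination can be collected into one helper. This is a minimal sketch: the helper name is hypothetical, the parameter names (host, port, secure, server_pem_path, server_name) are the ones used in this thread with pymilvus 2.3.4, and the pymilvus call itself is left as a comment so the snippet runs without a server.

```python
from typing import Any, Dict, Optional


def tls_connect_kwargs(
    host: str,
    ca_pem_path: str,
    port: int = 443,
    server_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for connections.connect() through a TLS ingress.

    server_name is always set, defaulting to the host, because in this thread
    server_pem_path was not honoured without it.
    """
    return {
        "host": host,
        "port": str(port),               # the thread passes the port as a string
        "secure": True,                  # TLS between the client and the ingress
        "server_pem_path": ca_pem_path,  # CA bundle (or cert) used to verify the ingress
        "server_name": server_name or host,
    }


if __name__ == "__main__":
    kwargs = tls_connect_kwargs(
        "k8s-milvus.example.com", "/etc/ssl/certs/ca-certificates.crt"
    )
    print(kwargs)
    # With pymilvus installed and the ingress reachable:
    # from pymilvus import connections
    # connections.connect("default", **kwargs)
```

Defaulting server_name to the host keeps the call correct on pymilvus versions affected by the bug discussed here and should be harmless on versions where it is fixed.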

I'll investigate the minikube case separately; maybe it's a problem with the CA. My minikube cert setup is described above.

Thank you so much @haorenfsa

haorenfsa commented 5 months ago

Oh, good catch! The server_name thing is indeed a bug to me. We'll fix it soon. Happy hacking with Milvus!

haorenfsa commented 5 months ago

So an extra guide for minikube should be added. Most people don't have a real k8s cluster to play with.

hive74 commented 5 months ago

Fixed in minikube: the ingress was serving the "Kubernetes Ingress Controller Fake Certificate", so I had to configure a custom certificate.
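One way to tell a certificate problem at the ingress apart from a Milvus problem is to attempt a verified TLS handshake by hand, outside pymilvus. The sketch below uses only the Python standard library; the function names are hypothetical, and the host and CA path in the usage comment are the example values from this thread.

```python
import socket
import ssl
from typing import Optional, Tuple


def make_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context that offers HTTP/2 via ALPN, as gRPC does."""
    # With ca_file=None the system trust store is used instead.
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.set_alpn_protocols(["h2"])
    return ctx


def probe(
    host: str,
    port: int = 443,
    ca_file: Optional[str] = None,
    timeout: float = 5.0,
) -> Tuple[bool, str]:
    """Try a verified handshake and report why it failed, if it did."""
    ctx = make_context(ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # server_hostname sets SNI and the name checked against the cert.
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return True, f"verified; ALPN={tls.selected_alpn_protocol()}"
    except ssl.SSLCertVerificationError as exc:
        return False, f"certificate rejected: {exc.verify_message}"
    except OSError as exc:
        return False, f"connection failed: {exc}"


# Example (needs network access to the ingress):
# print(probe("milvus.example.com", ca_file="/home/testuser/milvus/cert/ca.pem"))
```

If the handshake is rejected here as well, the ingress is presenting a certificate the given CA does not cover (for example a controller default certificate), and no pymilvus parameter will fix that.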

haorenfsa commented 5 months ago

fixed in https://github.com/milvus-io/pymilvus/issues/1962

indyvanmol commented 5 months ago

I’m getting another error when trying to connect to Milvus when using nginx ingress on Minikube to handle the TLS. I’m trying to create a (http/grpc) proxy from nginx to the Milvus service. I have a valid TLS certificate to test as a secret: cert.

helm install my-release milvus/milvus -f values.yml


  enabled: false

    mountPath: "/var/lib/milvus"
    enabled: true
      accessModes: ReadWriteOnce
      size: 10Gi
      subPath: ""

  enabled: true

  enabled: true
  name: etcd
  replicaCount: 1
    create: false

    type: ClusterIP
    port: 2379
    peerPort: 2380

  enabled: true
    # Annotation example: set nginx ingress type
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  labels: {}
    - host: "milvus.mysite.be"
      path: "/"
      pathType: "Prefix"
    - secretName: cert
        - milvus.mysite.be

  user.yaml: |+
      serverPemPath: /milvus/configs/cert/server.pem
      serverKeyPath: /milvus/configs/cert/server.key
        tlsMode: 0

    enabled: false

  enabled: true
  storageClass: standard
  accessMode: ReadWriteOnce
  size: 10Gi

  enabled: false

When I'm trying to connect via python:

from pymilvus import connections, db, utility

connections.connect("default", host="milvus.mysite.be", port="443", secure=True, server_pem_path="/path/to/root-ca.pem", server_name="milvus.mysite.be")


print(utility.list_collections(timeout=None, using='default'))

E0325 11:26:37.585249639 133238 hpack_parser.cc:993] Error parsing 'content-type' metadata: invalid value

haorenfsa commented 5 months ago

Hi @indyvanmol, please check that milvus.mysite.be correctly resolves to the IP of the nginx ingress. And try curl --http2 https://milvus.mysite.be/ to see if there are any clues in the output.

indyvanmol commented 5 months ago


curl --http2 https://milvus.mysite.be

returns: 404 page not found

haorenfsa commented 5 months ago

@indyvanmol That means the ingress was not created correctly. What's the output of kubectl describe ingress?

indyvanmol commented 5 months ago

kubectl get ingress my-release-milvus -o yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
    kubernetes.io/ingress.class: nginx
    meta.helm.sh/release-name: my-release
    meta.helm.sh/release-namespace: default
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/listen-ports-ssl: '[19530]'
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  creationTimestamp: "2024-03-19T12:02:07Z"
  generation: 1
    app.kubernetes.io/instance: my-release
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: milvus
    app.kubernetes.io/version: 2.3.9
    helm.sh/chart: milvus-4.1.18
  name: my-release-milvus
  namespace: default
  resourceVersion: "346877"
  uid: dc8a3675-62c5-4a6b-a5fe-540b8b58507c
      name: my-release-milvus
        number: 19530
  - host: milvus.mysite.be
      - backend:
            name: my-release-milvus
              number: 19530
        path: /
        pathType: Prefix
  - hosts:
    - milvus.mysite.be
    secretName: cert
  loadBalancer: {}

haorenfsa commented 5 months ago

>   loadBalancer: {}

No load balancer is attached, which means your nginx-ingress-controller is not set up correctly. Please refer to this doc for the setup procedure: https://milvus.io/docs/ingress.md

indyvanmol commented 5 months ago

The docs you’re showing are for an Azure setup. I have an on-premise setup and my Nginx ingress controller is working for other HTTP services where I’m proxying to.

What kind of protocol does Milvus use? Is it HTTP, and does an HTTP proxy to Milvus work? Are the examples shown with Nginx just TCP forwarding and not HTTP? So, with the Nginx ingress examples, is TLS encryption done at the TCP level, with the proxy using TCP and not HTTP? By proxy I mean the channel from the nginx ingress to the service.

haorenfsa commented 5 months ago

> The docs you’re showing are for an Azure setup.

@indyvanmol It was a mistake to put https://milvus.io/docs/ingress.md under the Azure section; it can be used anywhere. We'll update the doc soon.

> I have an on-premise setup and my Nginx ingress controller is working for other HTTP services where I’m proxying to.

We would need the output of kubectl describe ingress to see from the events why the load balancer IP is not assigned. The kubectl get command doesn't provide the event information we need.

> What kind of protocol is used for Milvus? Is it HTTP, and does an HTTP proxy to Milvus work?

Milvus uses gRPC, and gRPC is built on top of HTTP/2. Any HTTP proxy that supports HTTP/2 would work for Milvus.

> Are the examples showed with Nginx just TCP forwarding and not HTTP?

Yes, in https://milvus.io/docs/azure.md. There you only need to set service.type in values.yaml:

  type: LoadBalancer

> So, with the Nginx ingress examples, TLS encryption is done on at the TCP level, but the proxy is using TCP and not HTTP. By proxy i mean the channel from nginx ingress to the service.

No. If you use an ingress, TLS is terminated at the nginx proxy (the HTTP layer). The channel from the nginx ingress to the Milvus service is plaintext gRPC (i.e. HTTP/2 over raw TCP, without TLS). HTTP/2 is usually used together with TLS, but this is a special case.

If you use a service, there's no TLS (unless you add specific annotations to the service and provide TLS certificates and keys). The client communicates with Milvus in plaintext gRPC, the same way the nginx proxy communicates with the backend in the ingress case.
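From the client's point of view, the two exposure modes described above differ only in the URI scheme and port. A small sketch, assuming the defaults seen in this thread (443 at the TLS-terminating ingress, 19530 for the Milvus service); the function name is hypothetical.

```python
from typing import Optional


def client_uri(host: str, tls_at_ingress: bool, port: Optional[int] = None) -> str:
    """Return the URI a client would use for the given exposure mode.

    tls_at_ingress=True:  client -> (TLS) -> nginx ingress -> (plaintext gRPC) -> Milvus
    tls_at_ingress=False: client -> (plaintext gRPC) -> Milvus service
    """
    if tls_at_ingress:
        return f"https://{host}:{port or 443}"
    return f"http://{host}:{port or 19530}"


if __name__ == "__main__":
    # Through the ingress, as in the working setup from this thread:
    print(client_uri("k8s-milvus.example.com", tls_at_ingress=True))
    # Through a port-forward such as 27017:19530, with no TLS anywhere:
    print(client_uri("localhost", tls_at_ingress=False, port=27017))
```

In both cases Milvus itself keeps tlsMode 0; only the hop between the client and the ingress is encrypted in the first mode.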

Thank you very much for the feedback. I'm sorry for all the trouble caused by the docs; they should be better organized. I'll see that this gets done.

indyvanmol commented 5 months ago

@haorenfsa Thanks for helping me. I hope this gives you some more insight into how to make the docs better.

kubectl describe ingress my-release-milvus

Name:             my-release-milvus
Labels:           app.kubernetes.io/instance=my-release
Namespace:        default
Ingress Class:    <none>
Default backend:  my-release-milvus:19530 (
  cert terminates milvus.mysite.be
  Host                 Path  Backends
  ----                 ----  --------
                       /   my-release-milvus:19530 (
Annotations:           kubernetes.io/ingress.class: nginx
                       meta.helm.sh/release-name: my-release
                       meta.helm.sh/release-namespace: default
                       nginx.ingress.kubernetes.io/backend-protocol: GRPC
                       nginx.ingress.kubernetes.io/listen-ports-ssl: [19530]
                       nginx.ingress.kubernetes.io/proxy-body-size: 4m
                       nginx.ingress.kubernetes.io/ssl-redirect: true
Events:                <none>

haorenfsa commented 5 months ago

@indyvanmol I'm quite sure your nginx ingress was not installed or configured correctly. Please try following this doc's instruction on installation https://milvus.io/docs/ingress.md

indyvanmol commented 5 months ago

@haorenfsa I installed Nginx as described in the documentation and it indeed works, for which I am thankful. However, I suggest that a section be added to the documentation detailing the specific configuration required for Nginx. This is because I installed Nginx using the manifest files, not via Helm, so it’s primarily about configuration. However, it’s not clear to me what specific configuration is needed. Regardless, I appreciate the help and am pleased that it works. I find it interesting to know what kind of settings are expected of a proxy globally.

haorenfsa commented 4 months ago

@indyvanmol We'll add a section about nginx later. Thank you very much for the suggestion!
