SeldonIO / seldon-core

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
https://www.seldon.io/tech/products/core/

Triton inference server metrics are not supported #5279

Open antonaleks opened 8 months ago

antonaleks commented 8 months ago

Describe the bug

I cannot expose Triton metrics from my deployment: I added the ports description to the Pod.v1 spec and use the Triton implementation, but the metrics port is not recognized.

Triton server serves metrics only on the /metrics endpoint, not on /prometheus. Maybe I can change the MLSERVER_METRICS_ENDPOINT env var?
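For reference, Triton serves metrics in plain Prometheus text format on its metrics port. A minimal sketch for checking a scraped body for Triton metric families, assuming the standard `nv_`-prefixed metric names (the sample body below is illustrative, not captured from this cluster):

```python
# Sketch: parse a Prometheus text-format body (as returned by
# Triton on :8002/metrics) and collect the metric family names.
def metric_families(body: str) -> set:
    families = set()
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        # a sample line looks like: name{labels} value
        name = line.split("{", 1)[0].split(" ", 1)[0]
        families.add(name)
    return families

# Illustrative body; real Triton output has many more families.
sample = """# HELP nv_inference_request_success Number of successful inference requests
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="multi",version="1"} 42
"""
print(metric_families(sample))
```

Port-forwarding to the Triton container and scraping /metrics should yield families like this, while /prometheus returns nothing from Triton itself.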

To reproduce

  1. define the model
    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: multi
      namespace: seldon-triton
    spec:
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - name: multi
              image: nvcr.io/nvidia/tritonserver:23.10-py3
              args:
              - /opt/tritonserver/bin/tritonserver
              - '--grpc-port=9500'
              - '--http-port=9000'
              - '--metrics-port=8002'
              - '--model-repository=/mnt/models'
              ports:
              - name: grpc
                containerPort: 9500
                protocol: TCP
              - name: http
                containerPort: 9000
                protocol: TCP
              - name: metrics
                containerPort: 8002
                protocol: TCP
        graph:
          implementation: TRITON_SERVER
          logger:
            mode: all
          modelUri: gs://seldon-models/triton/multi
          name: multi
          type: MODEL
        name: default
        replicas: 1
      protocol: v2
  2. apply it
  3. check the seldon-core CRD
    
    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: >
          {"apiVersion":"machinelearning.seldon.io/v1","kind":"SeldonDeployment","metadata":{"annotations":{},"name":"multi","namespace":"seldon-triton"},"spec":{"predictors":[{"componentSpecs":[{"spec":{"containers":[{"args":["/opt/tritonserver/bin/tritonserver","--grpc-port=9500","--http-port=9000","--metrics-port=8002","--model-repository=/mnt/models"],"image":"nvcr.io/nvidia/tritonserver:23.10-py3","name":"multi","ports":[{"containerPort":9500,"name":"grpc","protocol":"TCP"},{"containerPort":9000,"name":"http","protocol":"TCP"},{"containerPort":8002,"name":"metrics","protocol":"TCP"}]}]}}],"graph":{"implementation":"TRITON_SERVER","logger":{"mode":"all"},"modelUri":"gs://seldon-models/triton/multi","name":"multi","type":"MODEL"},"name":"default","replicas":1}],"protocol":"v2"}}
      creationTimestamp: '2024-02-01T13:12:54Z'
      generation: 2
      managedFields:
      - apiVersion: machinelearning.seldon.io/v1
        fieldsType: FieldsV1
        fieldsV1:
          f:metadata:
            f:annotations:
              .: {}
              f:kubectl.kubernetes.io/last-applied-configuration: {}
          f:spec:
            .: {}
            f:predictors: {}
            f:protocol: {}
        manager: kubectl-client-side-apply
        operation: Update
        time: '2024-02-01T13:19:41Z'
      - apiVersion: machinelearning.seldon.io/v1
        fieldsType: FieldsV1
        fieldsV1:
          f:status:
            .: {}
            f:address:
              .: {}
              f:url: {}
            f:conditions: {}
            f:deploymentStatus:
              .: {}
              f:multi-default-0-multi:
                .: {}
                f:availableReplicas: {}
                f:replicas: {}
            f:replicas: {}
            f:serviceStatus:
              .: {}
              f:multi-default:
                .: {}
                f:grpcEndpoint: {}
                f:httpEndpoint: {}
                f:svcName: {}
              f:multi-default-multi:
                .: {}
                f:grpcEndpoint: {}
                f:httpEndpoint: {}
                f:svcName: {}
            f:state: {}
        manager: manager
        operation: Update
        subresource: status
        time: '2024-02-01T13:19:41Z'
      name: multi
      namespace: seldon-triton
      resourceVersion: '366300'
      uid: bb7eb90f-82d3-44aa-ba56-9c720382aa6d
      selfLink: >-
        /apis/machinelearning.seldon.io/v1/namespaces/seldon-triton/seldondeployments/multi
    status:
      address:
        url: >-
          http://multi-default.seldon-triton.svc.cluster.local:8000/v2/models/multi/infer
      conditions:
      - lastTransitionTime: '2024-02-01T13:13:20Z'
        reason: No Ambassador Mappaings defined
        status: 'True'
        type: AmbassadorMappingsReady
      - lastTransitionTime: '2024-02-01T13:13:20Z'
        message: Deployment has minimum availability.
        reason: MinimumReplicasAvailable
        status: 'True'
        type: DeploymentsReady
      - lastTransitionTime: '2024-02-01T13:12:54Z'
        reason: No HPAs defined
        status: 'True'
        type: HpasReady
      - lastTransitionTime: '2024-02-01T13:12:54Z'
        reason: No KEDA resources defined
        status: 'True'
        type: KedaReady
      - lastTransitionTime: '2024-02-01T13:12:54Z'
        reason: No PDBs defined
        status: 'True'
        type: PdbsReady
      - lastTransitionTime: '2024-02-01T13:19:41Z'
        reason: Not all services created
        status: 'False'
        type: Ready
      - lastTransitionTime: '2024-02-01T13:19:41Z'
        reason: Not all services created
        status: 'False'
        type: ServicesReady
      - lastTransitionTime: '2024-02-01T13:13:20Z'
        reason: All VirtualServices created
        status: 'True'
        type: istioVirtualServicesReady
      deploymentStatus:
        multi-default-0-multi:
          availableReplicas: 1
          replicas: 2
      replicas: 2
      serviceStatus:
        multi-default:
          grpcEndpoint: multi-default.seldon-triton:5001
          httpEndpoint: multi-default.seldon-triton:8000
          svcName: multi-default
        multi-default-multi:
          grpcEndpoint: multi-default-multi.seldon-triton:9500
          httpEndpoint: multi-default-multi.seldon-triton:9000
          svcName: multi-default-multi
      state: Creating
    spec:
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - args:
              - /opt/tritonserver/bin/tritonserver
              - '--grpc-port=9500'
              - '--http-port=9000'
              - '--metrics-port=8002'
              - '--model-repository=/mnt/models'
              image: nvcr.io/nvidia/tritonserver:23.10-py3
              name: multi
              ports:
              - containerPort: 9500
                name: grpc
                protocol: TCP
              - containerPort: 9000
                name: http
                protocol: TCP
              - containerPort: 8002
                name: metrics
                protocol: TCP
        graph:
          implementation: TRITON_SERVER
          logger:
            mode: all
          modelUri: gs://seldon-models/triton/multi
          name: multi
          type: MODEL
        name: default
        replicas: 1
      protocol: v2

4. check the seldon-core Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: multi-default-0-multi
  namespace: seldon-triton
  uid: e4db224a-8ad9-4b74-a0b4-7f1fa94a1f38
  resourceVersion: '364092'
  generation: 1
  creationTimestamp: '2024-02-01T13:12:54Z'
  labels:
    app: multi-default-0-multi
    app.kubernetes.io/managed-by: seldon-core
    fluentd: 'true'
    seldon-app: multi-default
    seldon-app-svc-multi: multi-default-multi
    seldon-deployment-contains-svcorch: 'true'
    seldon-deployment-id: multi
    seldon.io/model: 'true'
    version: default
  annotations:
    deployment.kubernetes.io/revision: '1'
    seldon.io/last-applied: '[base64-encoded blob omitted for brevity]'
  ownerReferences:
    - apiVersion: machinelearning.seldon.io/v1
      kind: SeldonDeployment
      name: multi
      uid: bb7eb90f-82d3-44aa-ba56-9c720382aa6d
      controller: true
      blockOwnerDeletion: true
  selfLink: /apis/apps/v1/namespaces/seldon-triton/deployments/multi-default-0-multi
status:
  observedGeneration: 1
  replicas: 1
  updatedReplicas: 1
  unavailableReplicas: 1
  conditions:
    - type: Available
      status: 'False'
      lastUpdateTime: '2024-02-01T13:12:54Z'
      lastTransitionTime: '2024-02-01T13:12:54Z'
      reason: MinimumReplicasUnavailable
      message: Deployment does not have minimum availability.
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2024-02-01T13:12:54Z'
      lastTransitionTime: '2024-02-01T13:12:54Z'
      reason: ReplicaSetUpdated
      message: ReplicaSet "multi-default-0-multi-5b68b6494d" is progressing.
spec:
  replicas: 1
  selector:
    matchLabels:
      seldon-app: multi-default
      seldon-app-svc-multi: multi-default-multi
      seldon-deployment-id: multi
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: multi-default-0-multi
        app.kubernetes.io/managed-by: seldon-core
        fluentd: 'true'
        seldon-app: multi-default
        seldon-app-svc-multi: multi-default-multi
        seldon-deployment-id: multi
        seldon.io/model: 'true'
        version: default
      annotations:
        prometheus.io/path: /prometheus
        prometheus.io/scrape: 'true'
    spec:
      volumes:
        - name: seldon-podinfo
          downwardAPI:
            items:
              - path: annotations
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.annotations
            defaultMode: 420
        - name: multi-provision-location
          emptyDir: {}
      initContainers:
        - name: multi-model-initializer
          image: seldonio/rclone-storage-initializer:1.17.1
          args:
            - gs://seldon-models/triton/multi
            - /mnt/models
          resources:
            limits:
              cpu: '1'
              memory: 1Gi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - name: multi-provision-location
              mountPath: /mnt/models
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      containers:
        - name: multi
          image: nvcr.io/nvidia/tritonserver:23.10-py3
          args:
            - /opt/tritonserver/bin/tritonserver
            - '--grpc-port=9500'
            - '--http-port=9000'
            - '--metrics-port=7000'
            - '--model-repository=/mnt/models'
          ports:
            - name: grpc
              containerPort: 9500
              protocol: TCP
            - name: http
              containerPort: 9000
              protocol: TCP
          env:
            - name: PREDICTIVE_UNIT_SERVICE_PORT
              value: '9000'
            - name: PREDICTIVE_UNIT_HTTP_SERVICE_PORT
              value: '9000'
            - name: MLSERVER_HTTP_PORT
              value: '9000'
            - name: PREDICTIVE_UNIT_GRPC_SERVICE_PORT
              value: '9500'
            - name: MLSERVER_GRPC_PORT
              value: '9500'
            - name: MLSERVER_MODEL_URI
              value: /mnt/models
            - name: PREDICTIVE_UNIT_ID
              value: multi
            - name: MLSERVER_MODEL_NAME
              value: multi
            - name: PREDICTIVE_UNIT_IMAGE
              value: nvcr.io/nvidia/tritonserver:23.10-py3
            - name: PREDICTOR_ID
              value: default
            - name: PREDICTOR_LABELS
              value: '{"version":"default"}'
            - name: SELDON_DEPLOYMENT_ID
              value: multi
            - name: SELDON_EXECUTOR_ENABLED
              value: 'true'
            - name: PREDICTIVE_UNIT_METRICS_SERVICE_PORT
              value: '6000'
            - name: PREDICTIVE_UNIT_METRICS_ENDPOINT
              value: /prometheus
            - name: MLSERVER_METRICS_PORT
              value: '6000'
            - name: MLSERVER_METRICS_ENDPOINT
              value: /prometheus
          resources: {}
          volumeMounts:
            - name: seldon-podinfo
              mountPath: /etc/podinfo
            - name: multi-provision-location
              readOnly: true
              mountPath: /mnt/models
          livenessProbe:
            tcpSocket:
              port: 9000
            initialDelaySeconds: 60
            timeoutSeconds: 1
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            tcpSocket:
              port: 9000
            initialDelaySeconds: 20
            timeoutSeconds: 1
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - '-c'
                  - /bin/sleep 10
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
        - name: seldon-container-engine
          image: docker.io/seldonio/seldon-core-executor:1.17.1
          args:
            - '--sdep'
            - multi
            - '--namespace'
            - seldon-triton
            - '--predictor'
            - default
            - '--http_port'
            - '8000'
            - '--grpc_port'
            - '5001'
            - '--protocol'
            - v2
            - '--transport'
            - rest
            - '--prometheus_path'
            - /prometheus
            - '--server_type'
            - rpc
            - '--log_work_buffer_size'
            - '10000'
            - '--log_write_timeout_ms'
            - '2000'
            - '--full_health_checks=false'
          ports:
            - name: http
              containerPort: 8000
              protocol: TCP
            - name: metrics
              containerPort: 8000
              protocol: TCP
            - name: grpc
              containerPort: 5001
              protocol: TCP
          env:
            - name: ENGINE_PREDICTOR
              value: '[base64-encoded predictor spec omitted for brevity]'
            - name: REQUEST_LOGGER_DEFAULT_ENDPOINT
              value: http://default-broker
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 500m
              memory: 512Mi
          volumeMounts:
            - name: seldon-podinfo
              mountPath: /etc/podinfo
          livenessProbe:
            httpGet:
              path: /live
              port: 8000
              scheme: HTTP
            initialDelaySeconds: 20
            timeoutSeconds: 60
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
              scheme: HTTP
            initialDelaySeconds: 20
            timeoutSeconds: 60
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsUser: 8888
            allowPrivilegeEscalation: false
      restartPolicy: Always
      terminationGracePeriodSeconds: 20
      dnsPolicy: ClusterFirst
      securityContext:
        runAsUser: 8888
      schedulerName: default-scheduler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
```

There is no metrics endpoint in the Deployment!
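The mismatch can be seen mechanically by diffing the ports requested in the SeldonDeployment's componentSpecs against the ports the operator rendered into the Deployment. A small sketch; the data mirrors the manifests in this issue, and the helper name is mine:

```python
# Sketch: find requested container ports that the operator dropped
# when rendering the Deployment's pod template.
def missing_ports(requested, rendered):
    rendered_names = {p["name"] for p in rendered}
    return [p for p in requested if p["name"] not in rendered_names]

# Ports from the SeldonDeployment spec in this issue.
requested = [
    {"name": "grpc", "containerPort": 9500},
    {"name": "http", "containerPort": 9000},
    {"name": "metrics", "containerPort": 8002},
]
# Ports the operator rendered into the Deployment's triton container.
rendered = [
    {"name": "grpc", "containerPort": 9500},
    {"name": "http", "containerPort": 9000},
]
print(missing_ports(requested, rendered))
# → [{'name': 'metrics', 'containerPort': 8002}]
```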

Expected behaviour

The Deployment should expose a metrics endpoint on port 8002.

Environment

antonaleks commented 8 months ago

Looks like https://github.com/SeldonIO/seldon-core/issues/5166

antonaleks commented 8 months ago

I worked around the problem with a custom SeldonDeployment that does not use the TRITON_SERVER implementation:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: multi
  namespace: seldon-triton
spec:
  predictors:
  - annotations:
      seldon.io/no-engine: "false"
#      prometheus.io/scrape: 'true'
#      prometheus.io/path: '/metrics'
#      prometheus.io/port: '6000'
#      seldon.io/engine-metrics-prometheus-path: "/metrics"
#      seldon.io/engine-metrics-prometheus-port: "6000"
    componentSpecs:
    - spec:
        containers:
          - name: multi
            image: nvcr.io/nvidia/tritonserver:23.10-py3
            args:
              - /opt/tritonserver/bin/tritonserver
              - '--grpc-port=9500'
              - '--http-port=9000'
              - '--metrics-port=6000'
              - '--model-repository=/mnt/models'
            ports:
            - name: grpc
              containerPort: 9500
              protocol: TCP
            - name: http
              containerPort: 9000
              protocol: TCP
            - name: triton-metrics
              containerPort: 6000
              protocol: TCP
            resources:
              limits:
                nvidia.com/gpu: 1
            securityContext:
              capabilities:
                add: [ "SYS_ADMIN" ] # for DCGM
    graph:
      logger:
        mode: all
      modelUri: gs://seldon-models/triton/multi
      name: multi
    name: default
    replicas: 1
  protocol: v2

But the question about the TRITON_SERVER implementation is still open.

antonaleks commented 8 months ago

Maybe this line is worth a look: https://github.com/SeldonIO/seldon-core/blob/8e1d98d03f15a70808a8035c110b443c15e28a96/operator/controllers/seldondeployment_prepackaged_servers.go#L239C1-L240C1
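For anyone digging into that code path: the operator passes its rendered predictor spec to the executor as base64-encoded JSON in the ENGINE_PREDICTOR env var, so decoding that value shows exactly what the operator configured. A sketch with a synthetic sample value, not the blob from this issue:

```python
import base64
import json

def decode_engine_predictor(value: str) -> dict:
    """Decode the base64-encoded JSON stored in ENGINE_PREDICTOR."""
    return json.loads(base64.b64decode(value))

# Synthetic value shaped like the operator's output in this issue.
sample_spec = {"name": "default",
               "graph": {"name": "multi", "implementation": "TRITON_SERVER"}}
encoded = base64.b64encode(json.dumps(sample_spec).encode()).decode()

print(decode_engine_predictor(encoded)["graph"]["implementation"])  # → TRITON_SERVER
```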