opendatahub-io-contrib / ai-on-openshift

AI-on-OpenShift website source code
https://ai-on-openshift.io
GNU General Public License v3.0

Minio with self-signed certificates won't work OOTB with Data Science Pipelines #60

Open pgodowski opened 8 months ago

pgodowski commented 8 months ago

If one follows the OSAI Fraud Detection tutorial and decides to use the local Minio setup, following https://github.com/opendatahub-io-contrib/ai-on-openshift/blob/main/docs/tools-and-applications/minio/minio.md , then creating the Data Science Pipeline fails because of a TLS connection issue with the Object Store:

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: pipelines-definition
  namespace: fraud-detection
...
spec:
  apiServer:
    stripEOF: true
    dbConfigConMaxLifetimeSec: 120
    applyTektonCustomResource: true
    deploy: true
    enableSamplePipeline: false
    autoUpdatePipelineDefaultVersion: true
    archiveLogs: false
    terminateStatus: Cancelled
    enableOauth: true
    trackArtifacts: true
    collectMetrics: true
    injectDefaultScript: true
  database:
    disableHealthCheck: false
    mariaDB:
      deploy: true
      pipelineDBName: mlpipeline
      pvcSize: 10Gi
      username: mlpipeline
  mlmd:
    deploy: false
  objectStorage:
    disableHealthCheck: false
    externalStorage:
      bucket: pipeline-artifacts
      host: minio-api-fraud-detection.apps.ocp-ai.<dns-name-redacted>
      port: ''
      s3CredentialsSecret:
        accessKey: AWS_ACCESS_KEY_ID
        secretKey: AWS_SECRET_ACCESS_KEY
        secretName: aws-connection-pipeline-artifacts
      scheme: https
  persistenceAgent:
    deploy: true
    numWorkers: 2
  scheduledWorkflow:
    cronScheduleTimezone: UTC
    deploy: true
status:
  conditions:
    - lastTransitionTime: '2024-01-03T08:04:54Z'
      message: Database connectivity successfully verified
      observedGeneration: 2
      reason: DatabaseAvailable
      status: 'True'
      type: DatabaseAvailable
    - lastTransitionTime: '2024-01-03T08:04:33Z'
      message: Could not connect to Object Store
      observedGeneration: 2
      reason: ObjectStoreAvailable
      status: 'False'
      type: ObjectStoreAvailable

and the error reported by data-science-pipelines-operator-controller-manager in the namespace redhat-ods-applications:

2024-01-03T08:04:54Z    ERROR   Encountered x509 UnknownAuthorityError when connecting to ObjectStore.
If using an tls S3 connection with  self-signed certs, you may specify a custom CABundle to mount on the DSP API Server
via the DSPA cr under the spec.cABundle field. If you have already provided a CABundle, verify the validity of the provided CABundle.   
{"namespace": "fraud-detection", "dspa_name": "pipelines-definition", 
"error": "Get \"https://minio-api-fraud-detection.apps.ocp-ai.<dns-name-redacted>/pipeline-artifacts/?location=\":
 x509: certificate signed by unknown authority"}
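
A quick way to confirm that the route is presenting a certificate the operator cannot verify (a diagnostic sketch; it assumes openssl is available and reuses the redacted hostname from the DSPA spec above):

# inspect the certificate served by the Minio route
openssl s_client -connect minio-api-fraud-detection.apps.ocp-ai.<dns-name-redacted>:443 </dev/null 2>/dev/null | openssl x509 -noout -issuer -subject

If the issuer is the cluster's self-signed ingress CA rather than a well-known CA, the DSP API server cannot validate the connection without an explicit CA bundle.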

Once I solve this issue myself, I will contribute a PR to the Minio setup instructions (https://github.com/opendatahub-io-contrib/ai-on-openshift/blob/main/docs/tools-and-applications/minio/minio.md) to provide a hint on where to put the OCP CA bundle reference.

pgodowski commented 8 months ago

What worked (and it's a bit ugly):

# extract the certificate presented by the default ingress router (use base64 -D on macOS, -d on Linux)
oc get secret router-certs-default -n openshift-ingress -o jsonpath="{.data['tls\.crt']}" | base64 -d > ocp-router.crt
# bundle it into a ConfigMap in the DSPA's namespace; ocp-api.crt is the cluster API CA, extracted separately
oc create configmap minio-certs --from-file=ocp-api.crt=ocp-api.crt --from-file=ocp-router.crt=ocp-router.crt

and then patch the DataSciencePipelinesApplication:

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: pipelines-definition
spec:
  apiServer:
    cABundle:              <---- HERE
      configMapKey: ocp-router.crt
      configMapName: minio-certs
...
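
The same change can be applied non-interactively; a sketch, assuming the dspa short name is registered by the operator's CRD and the DSPA lives in the fraud-detection namespace:

# merge the cABundle reference into the existing DSPA
oc patch dspa pipelines-definition -n fraud-detection --type merge \
  -p '{"spec":{"apiServer":{"cABundle":{"configMapKey":"ocp-router.crt","configMapName":"minio-certs"}}}}'
# then watch the ObjectStoreAvailable condition flip to True
oc get dspa pipelines-definition -n fraud-detection -o jsonpath='{.status.conditions}'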
guimou commented 8 months ago

Yeah, Pipelines, as well as other components, currently have some issues with self-signed certificates. This is a known issue and is being worked on. Another solution is not to use the Route to access Minio, but to go directly through the Service in http mode, so that all the traffic stays purely internal to the cluster.
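
In that setup, the externalStorage section points at the in-cluster Service instead of the Route; a sketch, assuming the Minio Service from the linked instructions is named minio-service and listens on port 9000 (adjust both to your deployment):

spec:
  objectStorage:
    externalStorage:
      bucket: pipeline-artifacts
      host: minio-service.fraud-detection.svc.cluster.local
      port: '9000'
      scheme: http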

pgodowski commented 8 months ago

Thanks for your feedback. Are you saying that even if I added the cABundle as in https://github.com/opendatahub-io-contrib/ai-on-openshift/issues/60#issuecomment-1875090002, Pipelines won't work anyway?

guimou commented 8 months ago

Oh no, if you have tested it as you said and it worked, then it works. What I meant is that the solution that will finally be implemented may be this one, or a slightly different one. IIRC, the team is looking at defining/uploading certificates from a central point, which would then be applied to all components. So the cABundle directive, as you used it, will surely be there, as there are not a thousand different methods available, but it may or may not come from a ConfigMap.
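
A sketch of what such a central point could look like: a cluster-wide CA bundle declared once on the operator's initialization resource and propagated to all components (the resource kind and field names below are an assumption based on later operator releases, not something this thread confirms):

apiVersion: dscinitialization.opendatahub.io/v1
kind: DSCInitialization
metadata:
  name: default-dsci
spec:
  trustedCABundle:
    # assumed field names; verify against your operator version
    managementState: Managed
    customCABundle: |
      -----BEGIN CERTIFICATE-----
      ...router/API CA certificate data...
      -----END CERTIFICATE-----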