strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

Add ability to pass secrets into Kafka Connect build container #4603

Open adrianisk opened 3 years ago

adrianisk commented 3 years ago

Is your feature request related to a problem? Please describe.

One of our Kafka Connect plugins is hosted in a private Maven repository (Artifactory). As far as I can tell, there is currently no way to pass a secret to the build container, as an environment variable or otherwise, for use when making requests to our private Maven repo.

Describe the solution you'd like

Either:

  1. Add ability to set environment variables from secrets in the build container (maybe add an additional buildExternalConfiguration section to the KafkaConnectSpec?). Users could then specify plugin urls in the format

    https://username:$PASSWORD@MY_PRIVATE_REPO/some_plugin.tar.gz

    This approach relies on curl converting the credentials in the URL to an Authorization header, meaning it won't work for custom headers.

  2. Add an authHeader field to the DownloadableArtifact spec, for example:

    plugins:
      - name: debezium-postgres-connector
        artifacts:
          - type: tgz
            url: https://MY_PRIVATE_REPO/some_plugin.tar.gz
            authHeader:
                secretName: my-private-repo-auth-secret
                header: my-private-repo-header

    Users could then set the auth header secret to values such as Authorization: Basic abc12345=, Authorization: token 5199831f4dd3b79e7c5b7e0ebe75d67aa66e79d4, or X-JFrog-Art-Api:ABcdEF. The auth headers would be injected as environment variables into the build container, and the KafkaConnectDockerfile class would then add -H $ENVIRONMENT_VARIABLE to the generated curl request where appropriate.
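To make option 2 concrete, here is a minimal sketch of the download step the generated Dockerfile might run. It is not Strimzi's actual implementation: the function name download_artifact, the AUTH_HEADER environment variable, and the CURL override (only there so the command can be dry-run) are all hypothetical names chosen for illustration.

```shell
# Hypothetical sketch: download one artifact, adding -H only when a header is set.
# AUTH_HEADER is an assumed env var that would be populated from the Secret.
# CURL is an assumed override so the command line can be inspected without network access.
download_artifact() {
  dl_url=$1
  dl_output=$2
  # Start from the flags already used for plugin downloads in this thread.
  set -- -f -L --create-dirs --output "$dl_output"
  if [ -n "${AUTH_HEADER:-}" ]; then
    # Pass the header as a single argument so spaces in it are preserved.
    set -- "$@" -H "$AUTH_HEADER"
  fi
  set -- "$@" "$dl_url"
  "${CURL:-curl}" "$@"
}
```

With AUTH_HEADER unset, the function falls back to the unauthenticated curl call, so existing public artifacts would keep working.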

Describe alternatives you've considered

I tried finding a way to do this with the templates section of the KafkaConnect spec, but couldn't find a way to set an environment variable based on a secret (it doesn't seem to be supported in the code).

Additional context

If you think the auth header feature sounds like a good approach, I'd be happy to contribute. Let me know and I'll send a PR.

scholzj commented 3 years ago

I think this needs to be thought through - we do not want to end up implementing N different authentications for 10 different people. So whatever we do, it needs to be something that works well for all use-cases. We also need to be able to test it in some way. Can Maven be used to pull the plugin JARs if you have them in a Maven repository? That might give us a single way to deal with it.

adrianisk commented 3 years ago

we do not want to end up implementing N different authentications for 10 different people

Yep, makes sense

it needs to be something that works well for all use-cases

need to be able to test it in some way

Both of those reasons are why I thought the auth header approach would be a good fit. The current implementation for downloading plugins (using curl) means both HTTP/HTTPS and FTP/SFTP are supported. Both of those protocols allow for authentication via headers, and allowing the user to set the full header in a secret means Strimzi doesn't need to know the details, it's just passing the headers along. The responsibility for setting the header correctly falls on the user when they create the secret, so testing is pretty simple as we just need to verify the headers are being set correctly in the curl request. We do probably need to allow for a list of headers instead of just a single header though...

This approach would cover:

  - Basic authentication (username/password): Authorization: Basic AXVubzpwQDU1dzByYM==
  - Tokens: Authorization: Bearer <token>
  - API keys: X-API-KEY: abcdef12345 or X-JFrog-Art-Api:ABcdEF

meaning Artifactory/Nexus, releases on private GitHub repos, Bitbucket, Dropbox, etc. would be supported.

I'm not hugely familiar with the Java ecosystem, so Maven might actually be the way to go here, but this seems like a quick/simple fix that will cover the majority of use cases for now?

adrianisk commented 3 years ago

@scholzj We've hit the need for something like this a few times now. If I sent a PR for the auth header implementation, would you accept it? It's a pretty simple solution that I think would cover a lot of use cases.

scholzj commented 3 years ago

I think we need to think this through so that it is implemented properly. There is very little room to change the CRDs later. So I think we need to first put together a list of all the different mechanisms we might need to support and then design the API accordingly to make sure it is future-proof.

scholzj commented 3 years ago

I had some thoughts on how this could be done. I think the API might look something like this:

plugins:
  - name: debezium-postgres-connector
    artifacts:
      - type: tgz
        url: https://MY_PRIVATE_REPO/some_plugin.tar.gz
        authentication:
          type: httpHeader
          header:
            valueFrom:
              secretKeyRef:
                name: authSecret
                key: authHeader

Where the value in the secret would be something like Authorization: Basic AXVubzpwQDU1dzByYM==.

Alternatively, I guess we could also split the header into two parts:

plugins:
  - name: debezium-postgres-connector
    artifacts:
      - type: tgz
        url: https://MY_PRIVATE_REPO/some_plugin.tar.gz
        authentication:
          type: httpHeader
          header: Authorization
          token:
            valueFrom:
              secretKeyRef:
                name: authSecret
                key: authHeader

Where the value in the secret would be basically just something like Basic AXVubzpwQDU1dzByYM== and the full header will be stitched together in the code.
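As an illustration of the "stitched together" variant, the sketch below joins a header name from the resource with a value read from the secret. The function name stitch_header is hypothetical, and the single-line check is an assumption about what validation might be wanted, not something proposed in this thread.

```shell
# Hypothetical sketch: build "Name: value" from the two parts of the split API.
# $1: header name taken from the resource, $2: value read from the Secret.
stitch_header() {
  sh_newline='
'
  # Refuse multi-line input so a secret value cannot smuggle in extra headers.
  case "$1$2" in
    *"$sh_newline"*) echo "header must be a single line" >&2; return 1 ;;
  esac
  printf '%s: %s\n' "$1" "$2"
}
```

The result could then be handed to curl as one argument, for example -H "$(stitch_header Authorization "$TOKEN")", where TOKEN is again an assumed variable name.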

This should follow the usual Kube and Strimzi API designs and be easily extensible for additional types of authentication if needed. One thing I'm not sure about is whether we might in some cases need multiple headers at the same time, in which case we might need an array, e.g.:

plugins:
  - name: debezium-postgres-connector
    artifacts:
      - type: tgz
        url: https://MY_PRIVATE_REPO/some_plugin.tar.gz
        authentication:
          type: httpHeader
          headers:
            - header:
                valueFrom:
                  secretKeyRef:
                    name: authSecret
                    key: authHeader

I have not yet figured out how to implement it in the build. I guess we can mount the secrets as env vars and use them in the generated Dockerfile as build arguments. But we will need to double-check that it works with both the plain Kubernetes and the OCP build implementations.
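For the array variant, a rough sketch of how several headers could reach curl, assuming each secret value is exposed to the build as a numbered environment variable. The AUTH_HEADER_0, AUTH_HEADER_1, ... naming, the function name, and the CURL override for dry runs are hypothetical; how the variables would actually be wired into the Kubernetes and OCP builds is exactly the open question above.

```shell
# Hypothetical sketch: add one -H flag per AUTH_HEADER_<n> variable, starting at 0
# and stopping at the first index that is unset or empty.
download_with_headers() {
  dl_url=$1
  dl_output=$2
  set -- -f -L --create-dirs --output "$dl_output"
  dl_i=0
  while :; do
    # dl_i is always a number generated here, so the eval only reads a variable.
    eval "dl_value=\${AUTH_HEADER_${dl_i}:-}"
    [ -n "$dl_value" ] || break
    set -- "$@" -H "$dl_value"
    dl_i=$((dl_i + 1))
  done
  set -- "$@" "$dl_url"
  "${CURL:-curl}" "$@"
}
```

Numbered variables keep each header a separate argument, which avoids having to pick a delimiter that could never appear inside a header value.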

adrianisk commented 3 years ago

Awesome! Yeah, that's in line with what I was thinking as well. I also agree that someone might need multiple headers, so an array makes sense to me. I'm happy to implement this. How about I take a look at the code sometime today or Monday and write up a quick summary of how I think it could be implemented, so you can double-check that it makes sense before I start?

scholzj commented 3 years ago

Sounds like a plan. If you need any help, feel free to let me know here or on our Slack.

scholzj commented 2 years ago

Triaged on 12.4.2022: This feature makes sense. It is not completely trivial to cover all the use-cases (different types of authentications for HTTP(S) downloads, different HTTP Headers, certificates for servers with certificates signed by non-public CAs, Maven credentials for Maven artifacts, etc.), but we should have some support for the credentials / secrets for the build.

itwolf81 commented 1 year ago

I have the same issue as discussed above, but with a GitLab Maven repository where I host a custom Kafka connector. GitLab requires an access token to be passed with each request URL built by this Strimzi code: (link)

GitLab accepts the access token either as a header or as a query parameter, e.g.:

curl -f -L --create-dirs --output /tmp/a001/test.jar "https://gitlab.notino.com/api/v4/projects/1376/packages/maven/com/notino/finance/kafka/connect/finance-kafka-connect-email/0.1.2/finance-kafka-connect-email-0.1.2.jar?private_token=[ACCESS_TOKEN]"

So the current Kafka Connect configuration for a Maven build doesn't work:

    plugins:
      - name: kafka-email-source-connector
        artifacts:
          - type: maven
            repository: https://gitlab.notino.com/api/v4/projects/1376/packages/maven
            group: com.notino.finance.kafka.connect
            artifact: finance-kafka-connect-email
            version: 0.1.2
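For reference, the mapping from these Maven coordinates to the URL in the curl command above follows the standard Maven repository layout. The helper below is only an illustrative sketch with a hypothetical name (maven_jar_url); it is not the Strimzi code linked above.

```shell
# Hypothetical sketch: derive the JAR URL from repository, group, artifact and version.
# Standard Maven layout: <repo>/<group with dots as slashes>/<artifact>/<version>/<artifact>-<version>.jar
maven_jar_url() {
  mv_repo=${1%/}                                   # drop one trailing slash, if any
  mv_group_path=$(printf '%s' "$2" | tr '.' '/')   # com.example.app -> com/example/app
  printf '%s/%s/%s/%s/%s-%s.jar\n' "$mv_repo" "$mv_group_path" "$3" "$4" "$3" "$4"
}
```

Called with the repository, group, artifact and version from the configuration above, this yields the same URL as the curl example, minus the private_token query parameter. The header alternative would then be a matter of adding something like -H "PRIVATE-TOKEN: $GITLAB_TOKEN" to the download; GITLAB_TOKEN is an assumed variable name, and the header name should be checked against GitLab's documentation for the token type in use.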

bogdatov commented 4 months ago

Was something like this implemented? Really need to use private repo to host my jars

scholzj commented 4 months ago

No, this was not implemented. It is still open.