arjun180 opened 3 years ago
You need to add the STS JAR to the S3 connector folder in your plugin path.
Hi @oscerd - thank you. A clarifying question:
Essentially, you need to add the STS JAR (https://search.maven.org/artifact/software.amazon.awssdk/sts) to your zipped connector, or to the exploded folder once it has been unzipped into the plugin path directory.
Hi @oscerd - sorry for the delay. I added the STS JAR. Now my Kafka Connect YAML looks like this:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  namespace: my-kafka
  name: my-connect-cluster
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  template:
    serviceAccount:
      metadata:
        annotations:
          eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxxxxx:role/my-kafka-sa
  replicas: 1
  bootstrapServers: my-kafka-dev.com:9093
  tls:
    trustedCertificates:
      - secretName: my-kafka-secret
        certificate: my_server_com.crt
  authentication:
    type: oauth
    tokenEndpointUri: <token>
    clientId: <client_id>
    clientSecret:
      key: secret
      secretName: my-oauth-secret
    scope: kafka
  config:
    group.id: my-connect-cluster-cluster
    offset.storage.topic: my-connect-cluster-offsets
    config.storage.topic: my-connect-cluster-configs
    status.storage.topic: my-connect-cluster-status
    key.converter: org.apache.kafka.connect.json.JsonConverter
    value.converter: org.apache.kafka.connect.json.JsonConverter
    key.converter.schemas.enable: true
    value.converter.schemas.enable: true
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1
  build:
    output:
      type: docker
      image: my-kafka-connect:latest
      pushSecret: <secret>
    plugins:
      - name: aws2-s3-kafka-connect
        artifacts:
          - type: tgz
            url: https://repo.maven.apache.org/maven2/org/apache/camel/kafkaconnector/camel-aws2-s3-kafka-connector/0.10.1/camel-aws2-s3-kafka-connector-0.10.1-package.tar.gz
          - type: jar
            url: https://repo1.maven.org/maven2/software/amazon/awssdk/sts/2.17.51/sts-2.17.51.jar
After this I added the AWS S3 source connector with the same configs as above, and I got the following error again:
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to load credentials from any of the providers in the chain AwsCredentialsProviderChain(credentialsProviders=[SystemPropertyCredentialsProvider(), EnvironmentVariableCredentialsProvider(), WebIdentityTokenCredentialsProvider(), ProfileCredentialsProvider(), ContainerCredentialsProvider(), InstanceProfileCredentialsProvider()]) : [SystemPropertyCredentialsProvider(): Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId)., EnvironmentVariableCredentialsProvider(): Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId)., WebIdentityTokenCredentialsProvider(): To use web identity tokens, the 'sts' service module must be on the class path., ProfileCredentialsProvider(): Profile file contained no credentials for profile 'default': ProfileFile(profiles=[]), ContainerCredentialsProvider(): Cannot fetch credentials from container - neither AWS_CONTAINER_CREDENTIALS_FULL_URI or AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variables are set., InstanceProfileCredentialsProvider(): Unable to load credentials from service endpoint.]
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98)
at software.amazon.awssdk.auth.credentials.AwsCredentialsProviderChain.resolveCredentials(AwsCredentialsProviderChain.java:112)
at software.amazon.awssdk.auth.credentials.internal.LazyAwsCredentialsProvider.resolveCredentials(LazyAwsCredentialsProvider.java:45)
at software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider.resolveCredentials(DefaultCredentialsProvider.java:104)
at software.amazon.awssdk.awscore.client.handler.AwsClientHandlerUtils.createExecutionContext(AwsClientHandlerUtils.java:79)
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.createExecutionContext(AwsSyncClientHandler.java:68)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:97)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:167)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:94)
at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:55)
at software.amazon.awssdk.services.s3.DefaultS3Client.headBucket(DefaultS3Client.java:4861)
at org.apache.camel.component.aws2.s3.AWS2S3Endpoint.doStart(AWS2S3Endpoint.java:98)
at org.apache.camel.support.service.BaseService.start(BaseService.java:115)
at org.apache.camel.support.service.ServiceHelper.startService(ServiceHelper.java:113)
at org.apache.camel.impl.engine.RouteService.doWarmUp(RouteService.java:186)
at org.apache.camel.impl.engine.RouteService.warmUp(RouteService.java:121)
Any pointers on this?
I don't think you can do it that way. You need to build a tar.gz connector that is self-contained, with the STS JAR in the same folder, so that in the plugin path you'll have a directory for the S3 source connector containing all the connector JARs plus the STS JAR. The way you're doing it, I don't think the folder will be the same.
Thanks @oscerd - Is there a recommended way of going about merging the JAR files?
I can merge the connector tar.gz file and the STS JAR locally. However, as far as I understand, the artifacts section in the Kafka Connect YAML can only take URLs, and I can't push my locally merged JAR files to the Maven repository.
I can exec into my running Kafka Connect pod, navigate to the connector folder and run curl -O https://repo1.maven.org/maven2/software/amazon/awssdk/sts/2.17.51/sts-2.17.51.jar. However, I run into permission denied issues.
My suggestion is to push the connector under a custom Maven Central groupId if there is no other way, but maybe @scholzj can suggest another approach.
By the way, before going with this approach, maybe you should try building a local connector and running Kafka Connect locally.
> I don't think you can do it that way. You need to build a tar.gz connector that is self-contained, with the STS JAR in the same folder, so that in the plugin path you'll have a directory for the S3 source connector containing all the connector JARs plus the STS JAR. The way you're doing it, I don't think the folder will be the same.
My understanding (but it might be wrong) is that it should not need to be in the exact same directory. AFAIU, each connector has its own top-level directory and all JARs from this directory and all its subdirectories should be part of the same classpath. So the separate download as used in the example above should IMHO work, but I never tried this exact combination.
> I can merge the connector tar.gz file and the STS JAR locally. However, as far as I understand, the artifacts section in the Kafka Connect YAML can only take URLs, and I can't push my locally merged JAR files to the Maven repository.
To download the archive with the connector, it does not need to be in a Maven Central repository. You can prepare your own archive and just put it on any website where it can be downloaded with a link. You can also build the container image manually => if nothing else, that should help to debug the issue (the placement of the JAR etc.).
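As a hedged sketch of this suggestion (assuming you repackage the connector tar.gz locally with the STS JAR inside and host the archive anywhere reachable over HTTP; the URL below is a placeholder), the build section could then reference that single self-contained archive:

build:
  output:
    type: docker
    image: my-kafka-connect:latest
    pushSecret: <secret>
  plugins:
    - name: aws2-s3-kafka-connect
      artifacts:
        # Self-contained archive prepared locally (all connector JARs plus sts-2.17.51.jar),
        # uploaded to any HTTP location you control.
        - type: tgz
          url: https://my-artifact-host.example.com/camel-aws2-s3-kafka-connector-0.10.1-with-sts.tar.gz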
> I don't think you can do it that way. You need to build a tar.gz connector that is self-contained, with the STS JAR in the same folder, so that in the plugin path you'll have a directory for the S3 source connector containing all the connector JARs plus the STS JAR. The way you're doing it, I don't think the folder will be the same.

> My understanding (but it might be wrong) is that it should not need to be in the exact same directory. AFAIU, each connector has its own top-level directory and all JARs from this directory and all its subdirectories should be part of the same classpath. So the separate download as used in the example above should IMHO work, but I never tried this exact combination.
In the past, at least locally, I had trouble making it work when I didn't have the needed JARs in the same exploded directory representing a single connector. That's why I was suggesting this. I'm not sure if it would be the same with Strimzi, but I think the first test I would do would be one based on a local Kafka instance, to check whether the setup works, and then eventually move the configuration to Strimzi.
Thank you @oscerd and @scholzj for your comments. Based on those, I managed to create an AWS Kafka S3 source connector with the STS JAR. I used this link for setup: https://strimzi.io/blog/2020/05/07/camel-kafka-connectors/.
The JAR files I currently have:
LICENSE.txt camel-main-3.10.0.jar netty-buffer-4.1.54.Final.jar
NOTICE.txt camel-management-api-3.10.0.jar netty-codec-4.1.54.Final.jar
README.adoc camel-seda-3.10.0.jar netty-codec-http-4.1.54.Final.jar
annotations-13.0.jar camel-support-3.10.0.jar netty-codec-http2-4.1.54.Final.jar
annotations-2.16.62.jar camel-util-3.10.0.jar netty-common-4.1.54.Final.jar
apache-client-2.16.62.jar commons-codec-1.15.jar netty-handler-4.1.54.Final.jar
apicurio-registry-common-1.3.2.Final.jar commons-compress-1.20.jar netty-nio-client-2.16.62.jar
apicurio-registry-rest-client-1.3.2.Final.jar commons-logging-1.2.jar netty-reactive-streams-2.0.5.jar
apicurio-registry-utils-converter-1.3.2.Final.jar connect-json-2.6.0.jar netty-reactive-streams-http-2.0.5.jar
apicurio-registry-utils-serde-1.3.2.Final.jar converter-jackson-2.9.0.jar netty-resolver-4.1.54.Final.jar
arns-2.16.62.jar eventstream-1.0.1.jar netty-transport-4.1.54.Final.jar
auth-2.16.62.jar http-client-spi-2.16.62.jar netty-transport-native-epoll-4.1.54.Final-linux-x86_64.jar
avro-1.10.2.jar httpclient-4.5.13.jar netty-transport-native-unix-common-4.1.54.Final.jar
aws-core-2.16.62.jar httpcore-4.4.14.jar okhttp-4.8.1.jar
aws-query-protocol-2.16.62.jar jackson-annotations-2.12.3.jar okio-2.7.0.jar
aws-xml-protocol-2.16.62.jar jackson-core-2.12.3.jar profiles-2.16.62.jar
camel-api-3.10.0.jar jackson-databind-2.12.3.jar protobuf-java-3.13.0.jar
camel-aws2-s3-3.10.0.jar jackson-dataformat-avro-2.11.3.jar protocol-core-2.16.62.jar
camel-aws2-s3-kafka-connector-0.10.1.jar jackson-datatype-jdk8-2.11.3.jar reactive-streams-1.0.3.jar
camel-base-3.10.0.jar javax.annotation-api-1.3.2.jar regions-2.16.62.jar
camel-base-engine-3.10.0.jar jboss-jaxrs-api_2.1_spec-2.0.1.Final.jar retrofit-2.9.0.jar
camel-core-engine-3.10.0.jar jctools-core-3.3.0.jar s3-2.16.62.jar
camel-core-languages-3.10.0.jar kafka-clients-2.8.0.jar sdk-core-2.16.62.jar
camel-core-model-3.10.0.jar kotlin-reflect-1.3.20.jar slf4j-api-1.7.30.jar
camel-core-processor-3.10.0.jar kotlin-stdlib-1.3.20.jar snappy-java-1.1.8.1.jar
camel-core-reifier-3.10.0.jar kotlin-stdlib-common-1.3.20.jar sts-2.17.51.jar
camel-direct-3.10.0.jar lz4-java-1.7.1.jar utils-2.16.62.jar
camel-jackson-3.10.0.jar medeia-validator-core-1.1.1.jar zstd-jni-1.4.9-1.jar
camel-kafka-3.10.0.jar medeia-validator-jackson-1.1.1.jar
camel-kafka-connector-0.10.1.jar metrics-spi-2.16.62.jar
This includes the sts-2.17.51 JAR as well. My Kafka Connect configuration looks like this:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  namespace: my-kafka
  name: my-dev-kafka-connect-cluster
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  image: my-kafka-connect:latest
  template:
    serviceAccount:
      metadata:
        annotations:
          eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxxxxx:role/my-kafka-sa
  replicas: 1
  bootstrapServers: my-kafka-dev.com:9093
  tls:
    trustedCertificates:
      - secretName: my-kafka-secret
        certificate: my_server_com.crt
  authentication:
    type: oauth
    tokenEndpointUri: <token>
    clientId: <client_id>
    clientSecret:
      key: secret
      secretName: my-oauth-secret
    scope: kafka
  config:
    group.id: my-connect-cluster-cluster
    offset.storage.topic: my-connect-cluster-offsets
    config.storage.topic: my-connect-cluster-configs
    status.storage.topic: my-connect-cluster-status
    key.converter: org.apache.kafka.connect.json.JsonConverter
    value.converter: org.apache.kafka.connect.json.JsonConverter
    key.converter.schemas.enable: true
    value.converter.schemas.enable: true
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1
Once I started up the Kafka connector, I got the same error:
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to load credentials from any of the providers in the chain AwsCredentialsProviderChain(credentialsProviders=[SystemPropertyCredentialsProvider(), EnvironmentVariableCredentialsProvider(), WebIdentityTokenCredentialsProvider(), ProfileCredentialsProvider(), ContainerCredentialsProvider(), InstanceProfileCredentialsProvider()]) : [SystemPropertyCredentialsProvider(): Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId)., EnvironmentVariableCredentialsProvider(): Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId)., WebIdentityTokenCredentialsProvider(): Not authorized to perform sts:AssumeRoleWithWebIdentity (Service: Sts, Status Code: 403, Request ID: 739494d8-e385-4d3f-88b5-583aedff9252, Extended Request ID: null), ProfileCredentialsProvider(): Profile file contained no credentials for profile 'default': ProfileFile(profiles=[]), ContainerCredentialsProvider(): Cannot fetch credentials from container - neither AWS_CONTAINER_CREDENTIALS_FULL_URI or AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variables are set., InstanceProfileCredentialsProvider(): Unable to load credentials from service endpoint.]
For the useDefaultCredentialsProvider option, use the camel.source.endpoint option and not the camel.component one. Also try with the latest release.
By the way, the error seems to suggest you cannot assume the role, so the error is different. Check your settings in AWS.
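As an illustration only (the connector class, bucket, topic and region below are assumptions, not values from this thread), a minimal KafkaConnector sketch with the option set at the endpoint level rather than the component level might look like this:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: my-s3-source-connector
  namespace: my-kafka
  labels:
    strimzi.io/cluster: my-connect-cluster
spec:
  class: org.apache.camel.kafkaconnector.aws2s3.CamelAws2s3SourceConnector
  tasksMax: 1
  config:
    topics: my-topic
    # The bucket is part of the endpoint path for the camel-aws2-s3 connector.
    camel.source.path.bucketNameOrArn: my-bucket
    # Endpoint-level option, as suggested above, instead of camel.component.aws2-s3.*
    camel.source.endpoint.useDefaultCredentialsProvider: true
    camel.source.endpoint.region: us-east-1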
Thanks @oscerd - I get the error before I set up the Kafka connector YAML.
> For the useDefaultCredentialsProvider option, use the camel.source.endpoint option and not the camel.component one. Also try with the latest release.
I don't get what you mean
If you deploy the connector, the error can only be related to the startup of the AWS S3 client. So it's definitely a connector exception.
I think the error on the Kafka Connect pod is confusing. It says Not authorized to perform sts:AssumeRoleWithWebIdentity, which means the AWS role being used on the Kafka Connect pod is unable to assume the role to carry out the operations.
However, we did two more things.
We logged onto the Kafka Connect pod to check the environment variables:
AWS_DEFAULT_REGION=<region>
AWS_WEB_IDENTITY_TOKEN_FILE=/eks.amazonaws.com/serviceaccount/token
AWS_REGION=<region>
AWS_ROLE_ARN=arn:aws:iam::xxxxxxxxxxxx:role/my-kafka-sa
We also checked the trust relationship for the AWS role above; it includes "Action": "sts:AssumeRoleWithWebIdentity".
I was wondering what exactly happens when a Kafka Connect pod is spun up: does it not use the AWS credentials within the pod? The error seems to suggest it does not.
@scholzj @oscerd
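For reference, and only as a hedged illustration with placeholder identifiers: an IRSA trust relationship for such a role typically restricts the token's sub claim to a specific namespace and service account, and Strimzi creates the Connect service account as <cluster-name>-connect, so a mismatch there is a common cause of this 403. Shown in YAML for readability; IAM stores the policy as JSON.

Version: "2012-10-17"
Statement:
  - Effect: Allow
    Principal:
      # OIDC provider of the EKS cluster (placeholder values)
      Federated: arn:aws:iam::xxxxxxxxxxxx:oidc-provider/oidc.eks.<region>.amazonaws.com/id/<OIDC_ID>
    Action: sts:AssumeRoleWithWebIdentity
    Condition:
      StringEquals:
        # Must match the namespace and service account the Connect pod actually runs under,
        # e.g. the Strimzi-created <cluster-name>-connect service account.
        "oidc.eks.<region>.amazonaws.com/id/<OIDC_ID>:sub": system:serviceaccount:my-kafka:my-dev-kafka-connect-cluster-connect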
It has been some time since I last used a connector connecting to AWS. But at that time I configured the AWS credentials directly in the connector configuration (see the full example here: https://github.com/scholzj/demo-devconfcz-2021-apache-kafka-as-a-monitoring-data-pipeline/blob/master/09-s3-connector.yaml):
camel.component.aws2-s3.accessKey: ${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws.access-key}
camel.component.aws2-s3.secretKey: ${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws.secret-key}
camel.component.aws2-s3.region: ${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws.region}
Today, you would also be able to do the same through the EnvVar Configuration Provider and environment variables. I do not remember anymore whether I tried to pass the options directly as the AWS environment variables without configuring them in the connector, so I don't know if that worked for me or not.
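For completeness, a minimal sketch of the Strimzi side that makes the ${file:...} placeholders above resolvable, assuming the credentials live in a Secret named aws-credentials that contains an aws-credentials.properties file:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster
spec:
  # ... rest of the spec as in the earlier examples ...
  config:
    # Enable Kafka's FileConfigProvider so ${file:...} references can be resolved.
    config.providers: file
    config.providers.file.class: org.apache.kafka.common.config.provider.FileConfigProvider
  externalConfiguration:
    volumes:
      # Mounted by Strimzi under /opt/kafka/external-configuration/aws-credentials/
      - name: aws-credentials
        secret:
          secretName: aws-credentials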
Thanks @oscerd and @scholzj. After some thought, we've decided to use keys instead of the service account role. The process of using the service account role was too complicated and we couldn't get it to work.
A couple of quick questions before I close out this issue:
We're planning on setting up a script to rotate keys in the Kafka Connect pod. During the process of rotating the keys, what happens to the Kafka Connect pod? Will it just restart?
I was planning on creating a feature request (or documentation request/example) for using service accounts instead of the default EKS role for Kafka connectors. What are your thoughts on this?
> We're planning on setting up a script to rotate keys in the Kafka Connect pod. During the process of rotating the keys, what happens to the Kafka Connect pod? Will it just restart?
That really depends on how you add it, but I would expect that you might need to roll it yourself. Maybe that could be something to improve in the future.
> I was planning on creating a feature request (or documentation request/example) for using service accounts instead of the default EKS role for Kafka connectors. What are your thoughts on this?
Maybe you would need to elaborate a bit more on how exactly you mean it?
> Maybe you would need to elaborate a bit more on how exactly you mean it?
I asked my colleague @rlonberg to elaborate on the practice of using service accounts instead of the EKS node role. His response is below:
The default EKS role (AWS instance metadata) for worker nodes is over-privileged, and it's not a good security practice to have pods inherit this node role, for a couple of reasons. It breaks privilege isolation between namespaces: if we give the node role permission to write to an S3 bucket for kafka-connect, then all pods outside of the kafka-connect namespace can access this S3 bucket as well.
The EKS default node role needs a few AWS policies to operate that are dangerous for pods to inherit. The EKS node role has the ability to list/describe VPCs, subnets and security groups across the entire AWS account. It can describe any EC2 instance in the AWS account (AMIs, private IPs, attached disks). It can nuke all network interfaces in the entire account, which would bring all instances in the account offline and constitute a highly effective denial-of-service attack vector. It can also enumerate and pull any Docker image in the account's ECR and query all image vulnerabilities with ecr:DescribeImageScanFindings across the entire account.
It's a better practice to instead assign service accounts to all EKS deployments using least privilege IAM role associations designed specifically for the deployment use-case. Why would a kafka-connect CRD need to do all the above instead of only having the privileges to write to a specific s3 endpoint?
This is already the officially recommended practice from AWS: https://docs.aws.amazon.com/eks/latest/userguide/best-practices-security.html#restrict-ec2-credential-access -- they recommend that users restrict access to the instance metadata service when provisioning their EKS cluster and use service accounts instead.
Here's another writeup on the problem with using the default node role. It goes into detail on why this practice should be avoided: https://blog.christophetd.fr/privilege-escalation-in-aws-elastic-kubernetes-service-eks-by-compromising-the-instance-role-of-worker-nodes/
So, I'm not sure what needs to change so that you can use it. Connect has its own SA already. So you can just assign the IAM role to it and use it. AFAIK there are users doing that with Strimzi.
> Thanks @oscerd and @scholzj. After some thought, we've decided to use keys instead of the service account role. The process of using the service account role was too complicated and we couldn't get it to work.
> A couple of quick questions before I close out this issue:
> - We're planning on setting up a script to rotate keys in the Kafka Connect pod. During the process of rotating the keys, what happens to the Kafka Connect pod? Will it just restart?
I don't think so, you'll need to redeploy it.
> - I was planning on creating a feature request (or documentation request/example) for using service accounts instead of the default EKS role for Kafka connectors. What are your thoughts on this?
I think it would be a good addition to the documentation. Adding this stuff to the connector code is almost impossible, because it depends on the Camel component. It could also be a guest blog post for the Apache Camel blog.
@oscerd I'm seeing a similar issue. I was trying to find the source code where the AWS clients are initialized, with respect to reading credentials, to better understand how the credentials are loaded. Currently my connector pod has a web identity token file and an AWS IAM role that it can use to get tokens based on an STS assume-role, but I get an error which says Unable to load credentials / no creds found. I'm not exactly sure what is happening, and I'm wondering whether I can look into the source code where these credentials are obtained and the clients are created, to better understand what I'm doing wrong.
I also see the similar issue https://github.com/apache/camel-kafka-connector/issues/282 and I'm wondering how it was fixed. Do you have examples of writing custom S3 clients?
The component used by the connector is here: https://github.com/apache/camel/tree/camel-3.11.x/components/camel-aws/camel-aws2-s3
You can look at the code there.
I've tried to put the STS JAR in the path and it's still not detected. I still get the "To use web identity tokens, the 'sts' service module must be on the class path." error :S
FROM confluentinc/cp-server-connect:7.3.0
ADD camel-aws-sns-fifo-sink-kafka-connector-3.19.0-package.tar.gz /usr/share/confluent-hub-components
ADD https://repo1.maven.org/maven2/software/amazon/awssdk/sts/2.18.40/sts-2.18.40.jar /usr/share/confluent-hub-components/camel-aws-sns-fifo-sink-kafka-connector
ADD https://repo1.maven.org/maven2/software/amazon/awssdk/sts/2.18.40/sts-2.18.40.jar /usr/share/java
I have an AWS S3 source connector with the following configurations:
We are trying to get all the pods in our Kafka ecosystem to use a specific web identity token file based on a custom IAM role. The idea is to add IAM credentials to each of the CRDs deployed by the operator in EKS (in this case, Kafka Connect). I do realize that camel.component.aws2-s3.useDefaultCredentialsProvider: true has the connector use the default EKS node role, but we'd want it to use serviceAccountName: my-kafka-sa. We did configure the above but got the following error when running kubectl describe kafkaconnector.
We checked the Kafka Connect resource:
How could we get the connectors to use the specified IAM credentials instead of the default EKS node role?