Open brettcave opened 3 years ago
Same issue with the snowplow/scala-stream-collector-kinesis:2.5.0 docker image. I'm getting the issue regardless of whether I specify `iam` or `default`. Maybe snowplow is just using an old version of the AWS SDK? It should be fixed in version 2.10.11 of Java SDK v2 but I'm not sure which version they are using.
> The only thing I can think of is that the volume mount for the token / secret completes after the service starts

I don't think that's the cause. I tested by deploying the container with an overridden entrypoint:
command: ["sleep"]
args: ['3600']
I exec'd into the pod, made sure the env vars were set properly, and manually executed the command, yet I got the same issue.
Did some more investigation and found out that this issue should have been fixed already by https://github.com/snowplow/stream-collector/pull/170. I double-checked and confirmed that the sts jar file is in the docker image and it's specified in the classpath too. I also confirmed it has the necessary SDK version (1.12.128). It's really weird that this issue is still happening, @istreeter @mmathias01 any ideas?
Hi @caleb15 I have been testing this out today and with:
aws {
  accessKey = default
  secretKey = default
}
... and an OIDC IAM role ARN attached to the pod service-account I could successfully write out to the target stream. If you use the `iam` inputs it indeed does not work, but the `DefaultAWSCredentialsProviderChain` with v2.5.0 does work as far as I can tell.
Are you certain that the ServiceAccount you have attached to the service is correctly configured and attached?
Weird, this time default works. I could've sworn I tested it before and default didn't work :|
Sorry about that, thanks for testing!
Just ran into this issue again when recreating the pod, even though I have it set to `default`. Same issue with snowplow kinesis enrichment. I'll try making my own pod with sudo rights and awscli so I can look further into it.
Never mind, it turns out I got the error because the trust relationship in the IAM role referenced the service account's old name. I had renamed the service account but didn't realize I would also need to update the IAM role, because I thought all you needed was the ARN reference. You need to make sure both that the ARN annotation on the service account matches the role and that the service account name matches the name in the role's trust relationship.
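The mismatch described above can be checked mechanically. Below is a minimal Python sketch, assuming the trust policy has already been fetched as JSON (for example from `aws iam get-role`) and follows the usual IRSA shape, where a condition key ending in `:sub` holds `system:serviceaccount:<namespace>:<name>`. The function name, the OIDC provider string, and the service account names are hypothetical.

```python
def trust_policy_allows(policy, namespace, service_account):
    """Return True if any statement's `sub` condition names this service account."""
    expected = f"system:serviceaccount:{namespace}:{service_account}"
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # IAM allows a single statement object instead of a list.
        statements = [statements]
    for statement in statements:
        # Conditions are nested as {operator: {key: value-or-list}}.
        for operator_block in statement.get("Condition", {}).values():
            for key, value in operator_block.items():
                if not key.endswith(":sub"):
                    continue
                values = value if isinstance(value, list) else [value]
                # Exact match only; wildcard (StringLike) patterns are not expanded.
                if expected in values:
                    return True
    return False


# Hypothetical policy that still references the old service account name.
policy = {
    "Statement": [{
        "Effect": "Allow",
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
            "StringEquals": {
                "oidc.eks.example/id/EXAMPLE:sub":
                    "system:serviceaccount:default:collector-old"
            }
        },
    }]
}

print(trust_policy_allows(policy, "default", "collector-old"))
print(trust_policy_allows(policy, "default", "collector-new"))
```

With the renamed service account (`collector-new` here), the check fails until the trust relationship is updated, which is the situation described in this comment.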
Some more debugging tips: When your IAM role is working you should be able to run `aws sts get-caller-identity` and get something like the following:
{
"UserId": "<censored>:botocore-session-<censored>",
"Account": "<censored>",
"Arn": "arn:aws:sts::<censored>:assumed-role/<role-name>/botocore-session-<censored>"
}
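To compare that output against the role you expect, the role name can be pulled out of the `Arn` field. This is a small Python sketch, assuming the ARN has the assumed-role shape shown above; the function name, account ID, and role names are made up for illustration.

```python
def assumed_role_name(arn):
    """Return the role name from an STS assumed-role ARN, or None otherwise."""
    # Expected shape: arn:<partition>:sts::<account>:assumed-role/<role-name>/<session>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "sts":
        return None
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "assumed-role":
        return None
    return resource[1]


# Hypothetical values; substitute the Arn from get-caller-identity.
arn = "arn:aws:sts::123456789012:assumed-role/collector-role/botocore-session-1"
print(assumed_role_name(arn) == "collector-role")
```

If the name that comes back is the node's instance role rather than the role annotated on the service account, the pod is falling back to the EC2 instance profile.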
You should also make sure the serviceAccount name in the pod spec matches the name of the ServiceAccount that carries the role annotation:
kf get pod/<podname> -o yaml | grep serviceAccount
Note the container doesn't have root privileges so you can't install awscli. I made my own container with root privileges and used that instead. However, I realized there's a far easier way: just set `runAsUser` in the securityContext to 0 (root). That way you can install whatever packages you need to debug.
Is this still an active issue? I was looking for info about k8s and the scala collector and stumbled upon this, and it's unclear why the issue is still 'open'
When using snowplow stream collector scala (version 2.4.1) in kubernetes in AWS, the authentication does not work as expected.
I have tried the following steps to get it working:
snowplow/scala-stream-collector-kinesis for my own container image that has AWS CLI installed, and am able to validate that I am assuming the role correctly (`aws sts get-caller-identity`)

After deploying the collector, I see the following errors:
So I can see that the role being assumed by the service is the IAM Instance Profile of the underlying node (which is restricted), and not the IAM Role for Service account.
I have variations on the `aws` snippet in the configmap:
and
However, based on https://github.com/snowplow/stream-collector/blob/master/kinesis/src/main/scala/com.snowplowanalytics.snowplow.collectors.scalastream/sinks/KinesisSink.scala#L432, I believe the first is the one that should be used, as it would trigger the `DefaultAWSCredentialsProviderChain()`, and with the SDK being one that supports IRSA, it should pick up the Web Identity token as a higher priority (3) than the EC2 instance profile (6). https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html
But for some reason it is not, and I'm not sure why.
I have also tried adding the following to the pod security context, as I have seen issues with accessing tokens before:
However, I don't think this is related, because the container I used for testing was able to access the token and assume the IRSA role by default, and not the IAM instance profile of the node.
Edit: from the scala collector container:
I have tried some variations on the security contexts, e.g. to map the token group
The only thing I can think of is that the volume mount for the token / secret completes after the service starts, but I'm not sure if this is possible or what the best way to validate it is.
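One way to narrow down the token-access question is to check, from inside the container and as the same user the collector runs as, that the web identity inputs are present and readable. The following is a rough Python sketch, assuming the two environment variables that IRSA injects (`AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`); the function name is hypothetical, and the collector image may not ship Python, so this may need to run from a debug container with the same service account and security context.

```python
import os


def check_irsa_inputs(env=None):
    """Return a list of problems with the web identity inputs; empty means none found."""
    env = os.environ if env is None else env
    problems = []
    for name in ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_path:
        try:
            # Opening the file exercises the same permissions the SDK would need.
            with open(token_path) as handle:
                if not handle.read().strip():
                    problems.append(f"{token_path} is empty")
        except OSError as error:
            problems.append(f"cannot read {token_path}: {error}")
    return problems
```

Calling `check_irsa_inputs()` with no arguments inspects the current process environment. An empty list only shows that the inputs are in place at the time of the check, not that the SDK actually used them.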