Closed dannycranmer closed 1 year ago
Have we tested that this works to resolve the specific failure mode specified here? https://github.com/aws/aws-msk-iam-auth/issues/36#issuecomment-1070563135
If so, we might want to mention it in the overview
Updated description wrt testing
Issue #, if available:
https://github.com/aws/aws-msk-iam-auth/issues/36
Description of changes:
This change fixes the classloader issue experienced in this issue. I have introduced a new
SaslClientFactory
calledClassLoaderAwareIAMSaslClientFactory
registered under theAWS_MSK_IAM
mechanism. The existingIAMSaslClientFactory
has been moved to a new mechanism nameAWS_MSK_IAM.<classloader>
. EachIAMLoginModule
will register both mechanisms, we will end up with oneClassLoaderAwareIAMSaslClientFactory
per JVM andn
copies ofIAMSaslClientFactory
,n
equals the number of ClassLoaders used.ClassLoaderAwareIAMSaslClientFactory
picks the correctIAMSaslClientFactory
based on the classloader of theCallbackHandler
For single classloader environments, there will be 1
ClassLoaderAwareIAMSaslClientFactory
and 1IAMSaslClientFactory
. We will incur a negligible performance hit delegating from one to the other.Testing
I have tested against MSK using a Flink cluster running on EC2 box. I verified this issue resolves issue #36. I am waiting for additional test feedback from other stakeholders, and am hoping that this can be verified against MSK e2e tests.
Concerns:
Since we register
n
copies ofIAMSaslClientFactory
there is an opportunity for this to infinitely grow and leak, for example if a Flink job is continuously failing over. However, in this situation the job is already broken and this tradeoff might be acceptable against the current state, where multi classloader environments are not supported.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.