awslabs / amazon-kinesis-producer

Amazon Kinesis Producer Library
Apache License 2.0

(bug) Segfault when trying to connect to Kinesis via EKS IRSA in 0.15.9 #558

Open joshuabaird opened 8 months ago

joshuabaird commented 8 months ago

I'm running the producer in an EKS pod that uses IRSA (IAM Roles for Service Accounts) to get access to Kinesis.

The producer segfaults with this error:

Description: Segmentation fault
Signal: SIGSEGV Code: (1) Address: 18446744073709551608

The initial theory was that the producer was swallowing an error caused by a failure to authenticate to the Kinesis endpoint. I have verified that authorization is correct by running the AWS CLI in the container (which uses the EKS IRSA credential chain), and I don't see any "access denied" Kinesis API calls in CloudTrail. I have also confirmed that there is no network traffic from the container to Kinesis or CloudWatch.

I also bumped producer logging up to TRACE, but don't see anything helpful.

Any ideas on how to further troubleshoot this?

Thanks!

joshuabaird commented 8 months ago

After some debugging and tips from this Stack Overflow post, it seems there is a bug in v0.15.9 that causes the segfault. Reverting to v0.15.8 fixes the issue.

I'm using this to build the KPL config:

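    import com.amazonaws.auth.WebIdentityTokenCredentialsProvider;
    import com.amazonaws.services.kinesis.producer.KinesisProducerConfiguration;
    import java.util.UUID;

    // Credentials come from the IRSA-injected role ARN and web identity token file.
    private static WebIdentityTokenCredentialsProvider webIdentityTokenCredentialsProvider =
        WebIdentityTokenCredentialsProvider.builder()
            .roleArn(System.getenv("AWS_ROLE_ARN"))
            .roleSessionName(UUID.randomUUID().toString())
            .webIdentityTokenFile(System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"))
            .build();

    // Aggregation is disabled and logging is set to trace for debugging.
    private static final KinesisProducerConfiguration kplConfig =
        new KinesisProducerConfiguration()
            .setAggregationEnabled(false)
            .setRegion("us-east-1")
            .setLogLevel("trace")
            .setCredentialsProvider(webIdentityTokenCredentialsProvider);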
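The config above reads `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` through `System.getenv`, which returns `null` when a variable is missing. For anyone who wants to rule that out before suspecting the library, here is a minimal pre-flight sketch. The class name `IrsaPreflight` and its `check` method are hypothetical, not part of the KPL or the AWS SDK, and this only checks the environment; it does nothing about the segfault itself.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of the KPL or AWS SDK): checks that the
// two IRSA environment variables used above are set and the token file is readable.
class IrsaPreflight {

    // Returns a list of human-readable problems; an empty list means both checks passed.
    static List<String> check(Map<String, String> env) {
        List<String> problems = new ArrayList<>();

        String roleArn = env.get("AWS_ROLE_ARN");
        if (roleArn == null || roleArn.isEmpty()) {
            problems.add("AWS_ROLE_ARN is not set");
        }

        String tokenFile = env.get("AWS_WEB_IDENTITY_TOKEN_FILE");
        if (tokenFile == null || tokenFile.isEmpty()) {
            problems.add("AWS_WEB_IDENTITY_TOKEN_FILE is not set");
        } else if (!Files.isReadable(Paths.get(tokenFile))) {
            problems.add("token file is not readable: " + tokenFile);
        }

        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = check(System.getenv());
        if (problems.isEmpty()) {
            System.out.println("IRSA environment looks complete");
        } else {
            problems.forEach(p -> System.err.println("IRSA check failed: " + p));
        }
    }
}
```

Running `main` inside the pod before constructing the producer would show whether the variables are present. In this case they were (the AWS CLI worked with the same credential chain), which is consistent with the problem being in 0.15.9 rather than in the configuration.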

I see that 0.15.10 is available (not sure whether it fixes the issue), but it hasn't been released to Maven Central yet. It does look like the only fix in 0.15.10 addresses something that was introduced in 0.15.9, though.

I have opened AWS case #170656583201515 and provided debug/trace logs.

SigmaQ commented 8 months ago

I'm having the same problem. Thanks for the details.

elrob commented 8 months ago

Same problem here.

yinghong commented 8 months ago

Yes, we ran into the same issue. Pinning to 0.15.8 for now. Thanks!

pcolazurdo commented 8 months ago

Same issue here (not IRSA related, though), running on Amazon Linux 2023 with OpenJDK 21.0.2 (2024-01-16).

madorb commented 3 months ago

Love the amazing comms from AWS on this! :-\

Is this expected to be fixed in 0.15.10?