Retry Mechanism for Log4j once the application is deployed

kdgregory / log4j-aws-appenders

Appenders for Log4J 1.2.x, Log4J 2.x, and Logback that write to AWS destinations.

Apache License 2.0


Hi, Background: I am deploying an application on Anypoint Platform that connects to AWS CloudWatch. I have set up a static IP to connect to CloudWatch, and the credentials are tied to that static IP as part of authorization. The problem is that the CloudHub worker (an EC2 instance) deploys the application on a random IP and only assigns the static IP to the worker once the deployment is complete. Note: the Log4J configuration is complete and the dependencies have been added to the POM.

I am facing the error shown below. For the AWS credentials, the second part of authorization is the source IP. Since the deployment is done on a random IP, the connection fails because that IP is not the same as the static IP. I was wondering whether there is a retry mechanism that would try to establish the connection with AWS again after deployment, once the static IP is in place. Could we add a plugin to Log4J, or maybe pass some parameters at runtime? Any help would be great! Thank you.

2021-12-04 12:20:45,831 com-kdgregory-aws-logwriter-log4j2-cloudwatch-1 ERROR unable to configure log group/stream com.amazonaws.services.logs.model.AWSLogsException: User: arn:aws:iam::887617765038:user/mulesoft-anypoint is not authorized to perform: logs:DescribeLogGroups on resource: arn:aws:logs:eu-west-1:887617765038:log-group::log-stream: (Service: AWSLogs; Status Code: 400; Error Code: AccessDeniedException; Request ID: d31b7b74-15f9-4e0c-ac10-de7f7e4edab5; Proxy: null) at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1828)

@SantoshHazari

The only "internal" solution that I can think of is to use an application-defined client factory, which would attempt some operation on the client and then sleep/retry if it fails.
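As a rough illustration of that sleep/retry idea, here is a minimal sketch in plain Java. The class name `RetryingClientFactory`, the method `createWithRetry`, and its parameters are hypothetical and not part of this library. The `builder` stands in for whatever creates the AWS client, and the `probe` stands in for a cheap call (for example, describing log groups) that fails while the worker still has the wrong IP.

```java
import java.util.concurrent.Callable;
import java.util.function.Consumer;

// Hypothetical helper, not part of log4j-aws-appenders.
final class RetryingClientFactory {

    /**
     * Builds a client, runs a probe operation against it, and retries after a
     * fixed sleep if either step throws. Rethrows the last failure once the
     * attempts are exhausted.
     */
    static <T> T createWithRetry(Callable<T> builder, Consumer<T> probe,
                                 int maxAttempts, long sleepMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                T client = builder.call();
                probe.accept(client);   // assumed to throw while access is still denied
                return client;
            } catch (Exception ex) {
                lastFailure = ex;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);   // blocks the calling thread
                }
            }
        }
        throw lastFailure;
    }
}
```

Note that the sleep blocks whichever thread calls the factory, so the attempt count and sleep time bound how long startup can be held up.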

I don't like this solution, however, because I believe it will delay startup for the entire logging framework -- or Log4J will decide that the appender is unusable because it's taking too long to start.

I suppose you could also create a loop at the start of your program that verifies the IP address and then re-initializes the logging framework.
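A minimal sketch of such a loop, using only the Java standard library, might look like the following. The names `StaticIpGate` and `awaitIpThenRun` are hypothetical. How the current public IP is discovered is left to the `currentIp` supplier, and the step that re-initializes logging is passed in as a `Runnable`, since both depend on the environment.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Log4J or log4j-aws-appenders.
final class StaticIpGate {

    /**
     * Polls currentIp until it reports expectedIp, then runs onReady once.
     * Returns false if the expected IP was never seen within maxChecks polls.
     */
    static boolean awaitIpThenRun(Supplier<String> currentIp, String expectedIp,
                                  int maxChecks, long sleepMillis, Runnable onReady)
            throws InterruptedException {
        for (int check = 0; check < maxChecks; check++) {
            if (expectedIp.equals(currentIp.get())) {
                onReady.run();   // e.g. re-initialize the logging framework here
                return true;
            }
            if (check < maxChecks - 1) {
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }
}
```

With Log4J 2, the `onReady` action could plausibly be a call to `reconfigure()` on the current `LoggerContext`. Treat that as an assumption and verify it against the Log4J version in use.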

However, I think an "external" solution is best: don't start your application until the EC2 instance is fully configured with static IP. This seems to be the best approach, as there may be other components that will be affected by not having a stable IP address.

Some other external solutions:

Use an API endpoint for CloudWatch Logs. This does incur a per-gigabyte charge for data transfer, but means that the logger is no longer dependent on the state of the IP. See the clientEndpoint configuration parameter.
Put your application behind a NAT. I'm assuming that your application uses static IPs because some service whitelists those IPs. I worked with a client a few years ago that did something similar, and was constantly running into issues with elastic IPs not being available during scaling events. We switched to putting our servers behind a NAT, with the NAT's IPs whitelisted, and the problem was solved.


Retry Mechanism for Log4j once the application is deployed #149