aws / aws-logging-dotnet

.NET Libraries for integrating Amazon CloudWatch Logs with popular .NET logging libraries
Apache License 2.0
299 stars 133 forks source link

AWSSDK Logger cannot fetch credentials from instance profile after upgrading to AWSSDK.Core 3.7.x #235

Closed kirillowen closed 10 months ago

kirillowen commented 10 months ago

Describe the bug

We have a windows service that is running on a Windows EC2 instance that is failing to start after upgrading the AWSSDK.Core and AWSSDK.Logger packages. The service uses Topshelf to install into Windows services, Log4Net to write logs and AWSSDK.CloudWatchLogs to push the logs to CloudWatch. After upgrading the packages the service is no longer able to install or start without the required credentials to sign into CloudWatch. If the <appender-ref ref="AWS" /> is removed from Log4Net.config file then the service installs and runs fine, logging to file/ console only.

I've upload a repo here to reproduce the issue: https://github.com/kirillowen/awssdk-log4net-repro

The log4net appender config for AWS:

    <appender name="AWS" type="AWS.Logger.Log4net.AWSAppender,AWS.Logger.Log4net">
        <LogGroup>AWSSDK.Log4Net.Repro.Test</LogGroup>
        <Region>eu-west-1</Region>
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%date [%-3thread] %-5level [%-20M] - %message%newline" />
        </layout>
        <LibraryLogFileName>C:\Services\awslog.txt</LibraryLogFileName>
    </appender>

Package versions:

  <package id="AWS.Logger.Core" version="3.2.1" targetFramework="net462" />
  <package id="AWS.Logger.Log4net" version="3.4.1" targetFramework="net462" />
  <package id="AWSSDK.CloudWatchLogs" version="3.7.302.11" targetFramework="net462" />
  <package id="AWSSDK.Core" version="3.7.300.28" targetFramework="net462" />

*Note: when updating packages to the latest versions, the issue can still be reproduced.

EC2 instance IAM role permissions for accessing logs:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "logs:Create*",
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogGroups",
                "logs:PutRetentionPolicy",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:logs:*:*:*",
                "arn:aws:s3:::*"
            ],
            "Effect": "Allow"
        }
    ]
}

Expected Behavior

The service, installs, starts and runs as before the package update.

Current Behavior

When installing the service through Topshelf command C:\Service\Service.exe install the install freezes and then times out.

Reproduction Steps

See repository here with minimal service to test: https://github.com/kirillowen/awssdk-log4net-repro

  1. Build Service
  2. Copy to target machine (EC2, Windows 2019 Datacenter)
  3. Run the install with .\AWSSDK.Log4Net.Repro.Service.exe install - the install will freeze
  4. Remove <appender-ref ref="AWS" /> from Log4Net.config and run the install command again - the service will install
  5. Start service with .\AWSSDK.Log4Net.Repro.Service.exe start - service will start and log to log-file.txt.
  6. Stop the service with .\AWSSDK.Log4Net.Repro.Service.exe stop
  7. Re-add AWS AWS config to Log4Net.config,
  8. Get IAM credentials for EC2 instance with:
    Invoke-WebRequest -Uri http://169.254.169.254/latest/meta-data/iam/security-credentials/<instance-profile-role>
  9. Populate the AWSAccessKey, AWSSecretKey, AWSSessionToken in AppSettings.config. The service will start and run, but throws exception to awslog.txt:
    Log Entry : 
    1/11/2024 9:50:56 AM
    :
    :Amazon.CloudWatchLogs.AmazonCloudWatchLogsException: The security token included in the request is invalid. ---> Amazon.Runtime.Internal.HttpErrorResponseException: The remote server returned an error: (400) Bad Request. ---> System.Net.WebException: The remote server returned an error: (400) Bad Request.
    at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
    at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.HttpRequest.<GetResponseAsync>d__19.MoveNext()
    --- End of inner exception stack trace ---
    at Amazon.Runtime.Internal.HttpRequest.<GetResponseAsync>d__19.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.HttpHandler`1.<InvokeAsync>d__9`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.Unmarshaller.<InvokeAsync>d__3`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.ErrorHandler.<InvokeAsync>d__5`1.MoveNext()
    --- End of inner exception stack trace ---
    at Amazon.Runtime.Internal.HttpErrorResponseExceptionHandler.HandleExceptionStream(IRequestContext requestContext, IWebResponseData httpErrorResponse, HttpErrorResponseException exception, Stream responseStream)
    at Amazon.Runtime.Internal.HttpErrorResponseExceptionHandler.<HandleExceptionAsync>d__2.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.ExceptionHandler`1.<HandleAsync>d__6.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.ErrorHandler.<ProcessExceptionAsync>d__8.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.ErrorHandler.<InvokeAsync>d__5`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.CallbackHandler.<InvokeAsync>d__9`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.Signer.<InvokeAsync>d__1`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.EndpointDiscoveryHandler.<InvokeAsync>d__2`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at Amazon.Runtime.Internal.EndpointDiscoveryHandler.<InvokeAsync>d__2`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.CredentialsRetriever.<InvokeAsync>d__7`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.RetryHandler.<InvokeAsync>d__10`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at Amazon.Runtime.Internal.RetryHandler.<InvokeAsync>d__10`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.CallbackHandler.<InvokeAsync>d__9`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.CallbackHandler.<InvokeAsync>d__9`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Amazon.Runtime.Internal.ErrorCallbackHandler.<InvokeAsync>d__5`1.MoveNext()

Possible Solution

No response

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

Targeted .NET Platform

.NET Framework 4.6.2

Operating System and version

Windows Server 2019 Datacenter; Version 1809; OS build 17763.5206

ashishdhingra commented 10 months ago

@kirillowen Good morning. Thanks for opening the issue. The error The security token included in the request is invalid. indicates that the session token for AWS credentials has expired. In your example, you appear to configure keys AWSAccessKey, AWSSecretKey and AWSSessionToken in AppSettings.config. In Configuration Files Reference for AWS SDK for .NET, these keys are not mentioned (also refer AWSConfigs API reference page). These documentation pages do mention support for keys like AWSProfileName, AWSProfilesLocation, AWSRegion, etc.

Could you please share the below information:

Thanks, Ashish

kirillowen commented 10 months ago

Hi Ashish, please see responses below:


The previous versions of the packages were:

The use of configuration keys in AppSettings.config was used to show that the keys are being read from the configuration file but not being being read from EC2 Instance Profile/ IAM Role as they have been in previous package version. The use of theys in AppSettings.config is referenced in this AWS article - https://aws.amazon.com/blogs/developer/access-key-management-for-net-applications-part-1/

Removing AWSSessionToken and using new credentials from the imds request, allows the app to start but still fails with the same "Bad Request" error. The credentials generated are set to expire at 2024-01-11T21:27:43 so that may not be the issue.

In the link you provided "In Configuration Files Reference for AWS SDK for .NET " the last option in "Credential search order" is "8. Amazon EC2 instance metadata." which is what I assume the previous version of package was using, as profile was never set on the instance, however this now appears to fail or never reach IMDS to retrieve the credentials.


No, this has not been required in the past as the credentials resolved from the EC2 Instance profile. aws configure list output on the instance shows:

PS C:\Services\awssdk-log4net> aws configure list
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                <not set>             None    None
access_key     ****************MEPK         iam-role
secret_key     ****************ensj         iam-role
    region                eu-west-1             imds

Does the profile now need to be set for the service to be able to read the credentials from the instance profile?

I've also just tried to add the aws configSection to App.config however the app is still freezing on start/ install. App.config now looks like this:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <configSections>
        <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler,log4net" />
                <section name="aws" type="Amazon.AWSSection, AWSSDK.Core"/>
    </configSections>
        <aws region="eu-west-1">
           <logging logTo="Log4Net" logResponses="Always" logMetrics="true" />
        </aws>
    <appSettings file="AppSettings.config" />
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="log4net" publicKeyToken="669e0ddf0bb1aa2a" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-2.0.15.0" newVersion="2.0.15.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>

Yes, the account runs as local system account and has full access to the folder and file.

ashishdhingra commented 10 months ago

@kirillowen Thanks for your inputs. The link https://aws.amazon.com/blogs/developer/access-key-management-for-net-applications-part-1/ you shared is dated back in 2013, which is quite old and could be referencing v1 SDK. I do not see the mentioned keys supported in v3.5 or current version of SDK.

Thanks, Ashish

kirillowen commented 10 months ago

Understood, the keys have been removed from AppSettings.config, repo updated https://github.com/kirillowen/awssdk-log4net-repro

Added the options you suggested, and got the following output from .\AWSSDK.Log4Net.Repro.Service.exe install command install-log.txt - service installed successfully, however starting the service still freezes.

IMDS v2 is set to "Optional" in the EC2 console, image and I have verified that it can be used with:

$TOKEN= (Invoke-WebRequest -Method PUT -Uri "http://169.254.169.254/latest/api/token" -Headers @{"X-aws-ec2-metadata-token-ttl-seconds" = "21600"} -UseBasicParsing) | Select-Object -ExpandProperty Content 
$instanceId= Invoke-WebRequest -Uri "http://169.254.169.254/latest/meta-data/instance-id" -Headers @{"X-aws-ec2-metadata-token" = $TOKEN} -UseBasicParsing | Select-Object -ExpandProperty Content
$instanceId

image

The "default" profile is not set, confirmed with AWS CLI on the instance:

PS C:\Services\awssdk-log4net> aws configure list
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                <not set>             None    None
access_key     ****************L5MF         iam-role
secret_key     ****************LpsB         iam-role
    region                eu-west-1             imds
ashishdhingra commented 10 months ago

@kirillowen In install-log.txt, I could see bunch of OperationCanceledException exceptions while invoking operations CreateLogStreamRequest service operation. I also see error Exception caught when writing error log to fileSystem.IO.DirectoryNotFoundException: Could not find a part of the path 'C:\temp\awslog.txt'.. Could you create this directory C:\temp manually?

While starting the service, you mentioned that it freezes. Do you see any error in System Diagnostics, i.e., in event viewer?

Thanks, Ashish

kirillowen commented 10 months ago

@ashishdhingra I've created the directory, however the start still freezes. After uninstalling the app and removing from the instance, and trying to install from a fresh build the install now freezes again and there is no log output to C:\temp. No, there are no errors in Event Viewer.

kirillowen commented 10 months ago

To update the thread, I have further tested the repro repository on a fresh instance of the custom AMI, another custom AMI for another service and a new instance using Windows_Server-2019-English-Full-Base-2024.01.10 AMI. From all the tests, the new version of the packages still freezes the install of the service. Restricting the instance to use IMDSv2 only, does not impact the install process and the install still hangs.

@ashishdhingra - do you have any other suggestions to get around this issue; Anything further I can assist with to verify that that the issue exists within the package?

ashishdhingra commented 10 months ago

@ashishdhingra I've created the directory, however the start still freezes. After uninstalling the app and removing from the instance, and trying to install from a fresh build the install now freezes again and there is no log output to C:\temp. No, there are no errors in Event Viewer.

@kirillowen After you created the said directory, what error did you see in the logs? If same error, did you grant account which is configured for Windows service necessary read/write permissions to directory.

Do you see any error in System Diagnostics, i.e., in event viewer

Also are you able to capture detailed verbose logs after enabling the same via below code:

Amazon.AWSConfigs.LoggingConfig.LogResponses = Amazon.ResponseLoggingOption.Always;
Amazon.AWSConfigs.LoggingConfig.LogTo = Amazon.LoggingOptions.SystemDiagnostics;
Amazon.AWSConfigs.AddTraceListener("Amazon", new System.Diagnostics.ConsoleTraceListener());

Verbose logs will capture everything, including how credentials are resolved.

I can try to reproduce the issue at my end. However, kindly note that we cannot use 3rd party tool Topshelf to reproduce the issue, would use only Microsoft provided tools to install and start Windows service.

Thanks, Ashish

kirillowen commented 10 months ago

@ashishdhingra thank you for your suggestions.

After creating the directory for aws logs, no errors were being written to the log file and nothing is being captured in System Diagnostics/ Event Viewer. I have added the LoggingConfig code to the replication repository and was not able to see any logs from this either when output was set to Console or SystemDiagnostics.

I've also tried to replicate the issue without using Topshelf as the installer and was able to see logs being sent to CloudWatch which seems to point to Topshelf being the blocker in this instance.

We are currently reviewing options to remove the Topshelf dependency and use an alternative logging library (Serilog) in the short term to proceed with the AWSSDK upgrade.

Thanks for your support and suggestions!

github-actions[bot] commented 10 months ago

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.