Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[BUG] Azure Kubernetes .NET Core 5.0 Unknown Error 258 #3910

Open canerdikkollu opened 12 months ago

canerdikkollu commented 12 months ago

Describe the bug

We are running a .NET Core 5.0 application in an Azure Kubernetes Service cluster. The application connects to an Azure SQL Managed Instance via a private endpoint.

We are receiving the following error:

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. 
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258 
    at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) 
    at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) 
    at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) 
    at Microsoft.Data.SqlClient.SqlDataReader.TryConsumeMetaData() 
    at Microsoft.Data.SqlClient.SqlDataReader.get_MetaData() 
    at Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted) 
    at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) 
    at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method) 
    at Microsoft.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior) 
    at Microsoft.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior) 
    at System.Data.Common.DbCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)

To Reproduce

We have investigated this error extensively but have not been able to determine its cause or a solution. Observations so far:

  1. There are SNAT port exhaustion warnings in the AKS cluster.
  2. We also see Lost carrier and Gained carrier messages in the syslog.
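
For readers weighing the SNAT warnings, here is a rough sketch of the port budget arithmetic behind them. It is not taken from this issue. It assumes the figures I recall from the Azure Load Balancer and AKS documentation: 64,000 SNAT ports per outbound frontend IP, a default per-node allocation that shrinks as the backend pool grows, and manual allocations that must be multiples of 8. The function names and the node and IP counts are hypothetical, since the cluster's actual size is not stated here.

```python
# Sketch only: constants reflect Azure documentation as I understand it;
# verify them against the current docs before relying on the numbers.
PORTS_PER_FRONTEND_IP = 64_000

# (maximum backend pool size, default SNAT ports allocated per node)
DEFAULT_ALLOCATION = [
    (50, 1024),
    (100, 512),
    (200, 256),
    (400, 128),
    (800, 64),
    (1000, 32),
]


def default_ports_per_node(node_count: int) -> int:
    """Return the assumed default SNAT port allocation for a pool of this size."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    for max_nodes, ports in DEFAULT_ALLOCATION:
        if node_count <= max_nodes:
            return ports
    raise ValueError("pool size is beyond the assumed allocation table")


def max_ports_per_node(node_count: int, outbound_ip_count: int) -> int:
    """Largest manual per-node allocation that fits the total port budget.

    The result is rounded down to a multiple of 8 (assumed requirement).
    """
    if node_count < 1 or outbound_ip_count < 1:
        raise ValueError("counts must be at least 1")
    raw = PORTS_PER_FRONTEND_IP * outbound_ip_count // node_count
    return raw - raw % 8


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes behind a single outbound IP.
    print(default_ports_per_node(10))
    print(max_ports_per_node(10, 1))
```

Under these assumptions, a hypothetical 10-node cluster with one outbound IP would get 1,024 ports per node by default but could be given up to 6,400, which is one reason an application that opens many short-lived outbound connections can exhaust the default allocation.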

Expected behavior

Connections from AKS to SQL should succeed every time. 🫠

Screenshots

[Screenshot: snat_port_exhaustion]

[Screenshot: lost_carrier]

Environment (please complete the following information):

Additional context

  1. Execution Timeout Expired Error (258, ReadSniSyncOverAsync) #647
  2. Intermittent Unknown error 258 with no obvious cause #1530
  3. SQL Server DbCommand Timeout with .Net Core container under load
  4. Azure Kubernetes .NET Core App to Azure SQL Database Intermittent Error 258

Vandersteen commented 5 months ago

Did you resolve this issue yet? We are encountering this kind of issue as well (same story from .NET Core 2.2 up to 8.0).

AllenWen-at-Azure commented 5 days ago

@canerdikkollu, are you still seeing this issue? Regarding the SNAT port exhaustion, please refer to https://learn.microsoft.com/en-us/azure/aks/load-balancer-standard#troubleshooting-snat.