dotnet / SqlClient

Microsoft.Data.SqlClient provides database connectivity to SQL Server for .NET applications.
MIT License
846 stars 281 forks source link

Integrated security on Linux requires a port number or protocol prefix in the connection string #627

Closed ltrzesniewski closed 4 years ago

ltrzesniewski commented 4 years ago

Describe the bug

When trying to use SqlClient on Linux with integrated security, I noticed that the following connection string:

Would result in the following error when trying to connect:

Cannot authenticate using Kerberos. Ensure Kerberos has been initialized on the client with 'kinit' and a Service Principal Name has been registered for the SQL Server to allow Kerberos authentication. ErrorCode=InternalError, Exception=Interop+NetSecurityNative+GssApiException: GSSAPI operation failed with error - Unspecified GSS failure. Minor code may provide more information (Server not found in Kerberos database).

This connection string works fine on Windows.

The intriguing part is that the following connection strings work just fine on Linux as well:

The machine is correctly configured and attached to the Active Directory with Kerberos. The user has a valid TGT.

After a quick look at the code, I guess the following condition may need to be updated:

https://github.com/dotnet/SqlClient/blob/ead806e87a15be9f54c99db2ebe8918025c20fc9/src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SNI/SNIProxy.cs#L336

Here's the full stack trace:

Microsoft.Data.SqlClient.SqlException (0x80131904): Cannot authenticate using Kerberos. Ensure Kerberos has been initialized on the client with 'kinit' and a Service Principal Name has been registered for the SQL Server to allow Kerberos authentication.
ErrorCode=InternalError, Exception=Interop+NetSecurityNative+GssApiException: GSSAPI operation failed with error - Unspecified GSS failure.  Minor code may provide more information (Server not found in Kerberos database).
   at System.Net.Security.NegotiateStreamPal.GssInitSecurityContext(SafeGssContextHandle& context, SafeGssCredHandle credential, Boolean isNtlm, SafeGssNameHandle targetName, GssFlags inFlags, Byte[] buffer, Byte[]& outputBuffer, UInt32& outFlags, Int32& isNtlmUsed)
   at System.Net.Security.NegotiateStreamPal.EstablishSecurityContext(SafeFreeNegoCredentials credential, SafeDeleteContext& context, String targetName, ContextFlagsPal inFlags, SecurityBuffer inputBuffer, SecurityBuffer outputBuffer, ContextFlagsPal& outFlags)
   at Microsoft.Data.SqlClient.SNI.SNIProxy.GenSspiClientContext(SspiClientContextStatus sspiClientContextStatus, Byte[] receivedBuff, Byte[]& sendBuff, Byte[] serverName)
   at Microsoft.Data.SqlClient.SNI.TdsParserStateObjectManaged.GenerateSspiClientContext(Byte[] receivedBuff, UInt32 receivedLength, Byte[]& sendBuff, UInt32& sendLength, Byte[] _sniSpnBuffer)
   at Microsoft.Data.SqlClient.TdsParser.SNISSPIData(Byte[] receivedBuff, UInt32 receivedLength, Byte[]& sendBuff, UInt32& sendLength)
   at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.SSPIError(String error, String procedure)
   at Microsoft.Data.SqlClient.TdsParser.SNISSPIData(Byte[] receivedBuff, UInt32 receivedLength, Byte[]& sendBuff, UInt32& sendLength)
   at Microsoft.Data.SqlClient.TdsParser.SSPIData(Byte[] receivedBuff, UInt32 receivedLength, Byte[]& sendBuff, UInt32& sendLength)
   at Microsoft.Data.SqlClient.TdsParser.TdsLogin(SqlLogin rec, FeatureExtension requestedFeatures, SessionData recoverySessionData, Nullable`1 fedAuthFeatureExtensionData)
   at Microsoft.Data.SqlClient.SqlInternalConnectionTds.Login(ServerInfo server, TimeoutTimer timeout, String newPassword, SecureString newSecurePassword)
   at Microsoft.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean ignoreSniOpenTimeout, TimeoutTimer timeout, Boolean withFailover)
   at Microsoft.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString connectionOptions, SqlCredential credential, TimeoutTimer timeout)
   at Microsoft.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance)
   at Microsoft.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData, Boolean applyTransientFaultHandling, String accessToken, DbConnectionPool pool)
   at Microsoft.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions)
   at Microsoft.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnectionPool pool, DbConnection owningObject, DbConnectionOptions options, DbConnectionPoolKey poolKey, DbConnectionOptions userOptions)
   at Microsoft.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
   at Microsoft.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
   at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
   at Microsoft.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
   at Microsoft.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
   at Microsoft.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry, SqlConnectionOverrides overrides)
   at Microsoft.Data.SqlClient.SqlConnection.Open(SqlConnectionOverrides overrides)
   at Microsoft.Data.SqlClient.SqlConnection.Open()

To reproduce

This can be reproduced with a simple test program run on Linux (just replace SQL_TEST with a valid SQL Server host name):

using System;
using Microsoft.Data.SqlClient;

namespace SqlTest
{
    internal static class Program
    {
        private static int Main()
        {
            try
            {
                using var connection = new SqlConnection("Data Source=SQL_TEST;Integrated Security=true");
                connection.Open();

                using var command = new SqlCommand("SELECT GETDATE();", connection);
                var date = (DateTime)command.ExecuteScalar();

                Console.WriteLine("OK");
                Console.WriteLine(date);
                return 0;
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex);
                return 1;
            }
        }
    }
}

Expected behavior

I expect the Data Source=SQL_TEST;Integrated Security=true connection string to work exactly the same way as Data Source=SQL_TEST,1433;Integrated Security=true on Linux.

Further technical details

Microsoft.Data.SqlClient version: 2.0.0 .NET target: .NET Core 3.1 SQL Server version: SQL Server 2019 Operating system: Ubuntu 18.04 LTS

cheenamalhotra commented 4 years ago

Hi @ltrzesniewski

I do see this documentation: https://docs.microsoft.com/en-us/previous-versions/system-center/configuration-manager-2007/bb632649(v=technet.10)?redirectedfrom=MSDN#required-communication-protocols

But I'm not sure if that would mean there's no support for Named Pipes for Kerberos in driver.

@saurabh500 Are you aware of any such limitation for Named Pipes where Kerberos is not supported? Or should we attempt to fix this?

ltrzesniewski commented 4 years ago

Hi @cheenamalhotra,

I'm trying to connect with TCP here, not with named pipes.

cheenamalhotra commented 4 years ago

The intriguing part is that the following connection strings work just fine on Linux as well: Data Source=SQL_TEST,1433;Integrated Security=true Data Source=tcp:SQL_TEST;Integrated Security=true

This is TCP, default connection mechanism is Named Pipes for driver.

cheenamalhotra commented 4 years ago

Also - If you disable Named Pipes & Shared Memory on your target server, then driver will default to TCP.

ltrzesniewski commented 4 years ago

Oh OK, I thought that the driver first tried to connect with TCP and then tried named pipes if TCP failed.

I guess I'd still expect the driver to try TCP if named pipes fail or to only try TCP if Kerberos is not supposed to work with named pipes.

ltrzesniewski commented 4 years ago

I checked with the server admin, and it turns out that named pipes are actually disabled on this server.

I also made a quick check with sqlcmd -S np:SQL_TEST -E on Windows, and I get the following error:

Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : Named Pipes Provider: Could not open a connection to SQL Server [2]. .
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : Login timeout expired.
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online..

Which confirms that named pipes are disabled.

I made sure that sqlcmd -S tcp:SQL_TEST -E connects fine. Actually, sqlcmd -S SQL_TEST -E (with out the tcp: part) also works fine both from the Windows and Linux machines.

cheenamalhotra commented 4 years ago

There's also Shared Memory protocol which is chosen default if you're connecting to SQL Server on the same machine. You can try to disable that to ensure you connect only with TCP.

But as you said sqlcmd works and SqlClient does not, that does not look right then. We will investigate and get back to you.

ltrzesniewski commented 4 years ago

Thanks!

That SQL Server is on a different machine, so I did not ask the server admin if the shared memory protocol is enabled or not.

saurabh500 commented 4 years ago

We do rely on the port number to create the SPN to authenticate to failing which we create a default SPN.

I quickly browsed the code at https://github.com/dotnet/SqlClient/blob/master/src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SNI/SNIProxy.cs#L344

With Kerberos Auth (which is the only kind of integrated auth supported on Linux in the driver) we construct the SPN of the server using the connectivity information in the following format.

MSSQLSVC/hostname_with_reverse_dnsLookup:port

If no port is specified then we try MSSQLSVC/hostname_with_reverse_dnsLookup

For SQL running on default port, we expect that the SQL Server admin has registered the SPN MSSQLSVC/hostname_with_reverse_dnsLookup in the AD along with MSSqlSvc/hostname_with_reverse_dnsLookup:1433 Typically for default port we expect the following two SPN formats.

MSSQLSVC/hostname_with_reverse_dnsLookup MSSQLSVC/hostname_with_reverse_dnsLookup:port

On Windows this is not a problem because the Kerberos auth fails and the driver falls back to NTLM and there is no SPN in the picture there. Hence it seems more seamless, but for Kerberos we do depend on the SPN for SQL to be registered correctly.

ltrzesniewski commented 4 years ago

Hi @saurabh500,

The server was initially registered with SPNs which all included the port number - the admin told me this was the best practice.

Then we tried adding SPNs without the port, but the SqlClient didn't seem to connect better with this change. We only tested that quickly though, so we may have missed something.

I'll show your message to the server admin on Monday, and I guess we'll double-check if the SPN without the port number was registered properly, and also if we are really inadvertently using NTLM on Windows.

saurabh500 commented 4 years ago

Hello @ltrzesniewski

I'll show your message to the server admin on Monday, and I guess we'll double-check if the SPN without the port number was registered properly, and also if we are really inadvertently using NTLM on Windows.

Sounds good and it will be interesting to know more about this experiment.

Then we tried adding SPNs without the port, but the SqlClient didn't seem to connect better with this change. We only tested that quickly though, so we may have missed something.

I have observed that it takes sometimes for the new SPN registration to propagate, but I don't know what causes it. How long did you wait to try integrated authentication after you registered the SPN without the port number?

I do agree that an SPN without a port number may not be desirable.

@cheenamalhotra It might be interesting to see the behavior of sqlcmd on Linux. What it's behavior with Integrated Authentication is, when it comes to default port specification (i.e without 1433 and with it) Since sqlcmd uses the ODBC driver, this would let us know what the behavior of the ODBC driver is.

ltrzesniewski commented 4 years ago

How long did you wait to try integrated authentication after you registered the SPN without the port number?

Yes, we were aware there's a propagation time, and we waited a few minutes after the change, but maybe that wasn't enough. We'll double check that.

ltrzesniewski commented 4 years ago

Here are the additional test results.

Adding SPNs without the port number:

The other day, we have added the following SPNs without the port number:

MSSQLSvc/SQL_TEST
MSSQLSvc/SQL_TEST.<fqdn>

That didn't work. But the SQL_TEST name is actually a CNAME, not the real machine name returned by reverse DNS.

So today, we added the SPNs with MachineName (the real host name the reverse DNS lookup returns), to get the following in addition to the SPNs with the port number:

MSSQLSvc/SQL_TEST
MSSQLSvc/SQL_TEST.<fqdn>
MSSQLSvc/MachineName
MSSQLSvc/MachineName.<fqdn>

That SPN config allows SqlClient to connect fine from Linux when using the Data Source=SQL_TEST;Integrated Security=true connection string.

I can see the MSSQLSvc/MachineName.<fqdn>@<REALM> entry in the klist output (without a port number).

NTLM on Windows:

We ran the following query on our servers, where Windows clients are connected using connection strings with Data Source=CNAME;Integrated Security=true:

SELECT auth_scheme FROM sys.dm_exec_connections

None of the connected clients used NTLM (except for a single shared-memory connection). All SPNs configured on these servers include the port number.

Summary

To sum up, if we have a SQL Server with port numbers in all of its SPNs, and a connection string in the form Data Source=CNAME;Integrated Security=true:

What is your recommendation as to the SPN configuration? Are we supposed to add SPNs without the port number, or should SqlClient be able to connect with this configuration?

cheenamalhotra commented 4 years ago

Hi @saurabh500

It might be interesting to see the behavior of sqlcmd on Linux. What it's behavior with Integrated Authentication is, when it comes to default port specification (i.e without 1433 and with it) Since sqlcmd uses the ODBC driver, this would let us know what the behavior of the ODBC driver is.

I checked with ODBC team internally, and they default to 1433 if no port is specified.

If no port is specified, we should probably try: MSSQLSVC/hostname_with_reverse_dnsLookup:1433

@ltrzesniewski This is probably a sqlclient bug, since sqlcmd is able to connect.

cheenamalhotra commented 4 years ago

@ltrzesniewski

Could you test again with this driver: Microsoft.Data.SqlClient.2.1.0-dev-i627.zip (Included fix to default to 1433 port)

<PackageReference Include="Microsoft.Data.SqlClient" Version="2.1.0-dev-i627" />

ltrzesniewski commented 4 years ago

Thanks @cheenamalhotra !

I can confirm that this version works fine on Linux, and connects to a server with the port number in its SPN, when using a connection string like Data Source=CNAME;Integrated Security=true.

Will this change be backported to System.Data.SqlClient as well?

cheenamalhotra commented 4 years ago

Hi @ltrzesniewski

Thanks for confirming. It depends on the impact of the issue and the bar is very high for backporting fixing to S.D.S since it's now in LTS servicing mode. Are you/your organization dependent on this issue and is it a blocker?

ltrzesniewski commented 4 years ago

No worries, it's not a blocker since adding tcp: to the data source in the connection string works around the issue.

Thanks!

saurabh500 commented 4 years ago

@cheenamalhotra i agree with bug evaluation.