microsoft / go-mssqldb

Microsoft SQL server driver written in go language
BSD 3-Clause "New" or "Revised" License
284 stars 65 forks source link

String output parameter value is cut off #161

Closed El-76 closed 8 months ago

El-76 commented 12 months ago

Describe the bug

The driver has to specify parameter maximum length TDS field along with its type when calling stored procedure. Right now for string parameters the driver sets NVarchar as type and provided value length as max length (see types.go). While its acceptable for input parameters it doesn't work correctly for output parameters - if the output parameter value written by the stored procedure is longer than initial one the written value appears to be cut off.

To Reproduce

Create Test stored procedure that accepts POut NVARCHAR output parameter and writes its value back with 'Value' suffix added.

Call the procedure using standard way of specifying output parameters: sql.Named("POut", sql.Out{Dest: &sql.NullString{String: "Test", Valid: true}, In: true})

"Test" value will be returned.

Expected behavior

"TestValue" value is returned.

Further technical details

More to follow.

Additional context None.

shueybubbles commented 12 months ago

Does it work if you use mssql.NVarCharMax instead of string ?

shueybubbles commented 12 months ago

FWIW we need to overhaul how the driver handles strings in order to work with collations too (eg #129) I'd be interested in hearing from developers how we can define string parameter types that allow for encoding of more data like collation and input/output buffer sizes. Nothing that's "Go idiomatic" comes to my mind, given the current sql/driver model. I don't think the shared driver model has solved more basic issues like a common decimal type or common guid type, has it?

El-76 commented 12 months ago

Yes, it works with mssql.NVarCharMax instead of string.

But the problem is that now I have a function that accepts sql.NullString in order to support NULL values.

If I have to use NVarCharMax then I wont be able to use single parameter type, but do smth like interface{} then cast to either NullString or NVarCharMax.

Btw, what if the driver always sets max length (8000 or anything reasonable for NVARCHAR?) for variable length parameters for OUT parameters only?

What disadvantages you see in such approach except potential performance issues maybe?

shueybubbles commented 11 months ago

For OUT parameters using 8k is probably fine for most cases and could be a reasonable short term mitigation. Long term, I think we need to come up something full featured like SqlClient has for SqlParameter, at least for strings. I don't know how else we can support the breadth of SQL features like collations and Always Encrypted. #46 could use some community input.

El-76 commented 11 months ago

Well, thank you!

If I file a PR that implements logic above (sets predefined length for var length OUT params), will it have any chances to be merged?

If yes, could you please provide some hints on how to implement if any? Its quite straightforward change, but it can be done in 100 different ways and if you have some future plans to correspond with - it would be great.

El-76 commented 11 months ago

In fact, I suggest just stealing the code from JDBC driver:

    void writeRPCStringUnicode(String sName, String sValue, boolean bOut,
            SQLCollation collation) throws SQLServerException {
        boolean bValueNull = (sValue == null);
        int nValueLen = bValueNull ? 0 : (2 * sValue.length());
        // Textual RPC requires a collation. If none is provided, as is the case when
        // the SSType is non-textual, then use the database collation by default.
        if (null == collation)
            collation = con.getDatabaseCollation();

        /*
         * Use PLP encoding if either OUT params were specified or if the user query exceeds
         * DataTypes.SHORT_VARTYPE_MAX_BYTES
         */
        if (nValueLen > DataTypes.SHORT_VARTYPE_MAX_BYTES || bOut) {
            writeRPCNameValType(sName, bOut, TDSType.NVARCHAR);

effectively applying the following changes to types.go (pseudocode, needs passing out parameter to the function):

diff --git a/types.go b/types.go
index cf9640e..bf73b28 100644
--- a/types.go
+++ b/types.go
@@ -220,7 +220,7 @@ func writeVarLen(w io.Writer, ti *typeInfo) (err error) {
                typeNVarChar, typeNChar, typeXml, typeUdt:

                // short len types
-               if ti.Size > 8000 || ti.Size == 0 {
+               if ti.Size > 8000 || ti.Size == 0 || out {
                        if err = binary.Write(w, binary.LittleEndian, uint16(0xffff)); err != nil {
                                return
                        }
shueybubbles commented 11 months ago

I only see a handful calls to writeTypeInfo that would need to be updated to pass the new out parameter so it seems like a reasonably scoped and safe fix. The PR would need to include a test for the scenario as well as for the Always Encrypted variant for the encrypted out parameter.

El-76 commented 11 months ago

Well, when I run the following stored procedure (col8 is a test table column from alwaysencrypted_test.go):

CREATE PROCEDURE mssqlAep371
  @p1 NVARCHAR(30) OUTPUT
AS
  DECLARE @newp1 nvarchar(30)
  SET @newp1 = @p1
  SET @p1 = (SELECT TOP 1 col8 FROM mssqlAe371)
  UPDATE mssqlAe371 SET col8 = @newp1

using the following code:

p1 := "test"
_, _ = conn.Exec(procName, sql.Named("p1", sql.Out{ Dest: &p1 }))

I receve the following errors:

tds_test.go:387: 2023-11-06 22:21:20.4647471 +0300 MSK m=+2.955818601 [token.go:1110]: got ERROR 206 Operand type clash: nvarchar(30) encrypted with (encryption_type = 'RANDOMIZED', encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 'mssqlCek371', column_encryption_key_database_name = 'test') is incompatible with nvarchar(4) encrypted with (encryption_type = 'RANDOMIZED', encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 'mssqlCek371', column_encryption_key_database_name = 'test')
tds_test.go:387: 2023-11-06 22:21:20.4647471 +0300 MSK m=+2.955818601 [token.go:1110]: got ERROR 33514 Incorrect parameter encryption metadata was received from the client. The error occurred during the invocation of the batch and therefore the client can refresh the parameter encryption metadata by calling sp_describe_parameter_encryption and retry.

The DB complains about parameter length mismatch and the problem is I don't know how to fix it even in theory - the client is not aware of the required length (if I set type to nvarchar(max) as for unencrypted case the error just reoccurs).

Could you please advise, I'm likely missing something...

shueybubbles commented 11 months ago

what is the query text being sent to sp_describe_parameter_encryption ?

shueybubbles commented 11 months ago

My approach to figuring out what the driver should be doing for AE is to try to replicate the scenario in SSMS with auto-parameterization turned on. I use Tools/Options/Output Window to turn on sqlclient traces, and an XEvent profiler session to see all queries SqlClient sends under the covers.

EG if you connect to the same database using SSMS with AE enabled and do something like this in the query window:

declare @p1 nvarchar(4) = "test"
mssqlAep371 @p1 output

Does it work? And what is the query that SSMS sends with sp_describe_parameter_encryption?

El-76 commented 11 months ago

what is the query text being sent to sp_describe_parameter_encryption ?

tds_test.go:387: 2023-11-06 22:21:20.4485868 +0300 MSK m=+2.939658301 [mssql.go:528]: sp_describe_parameter_encryption
tds_test.go:387: 2023-11-06 22:21:20.4491029 +0300 MSK m=+2.940174401 [mssql.go:533]:   @tsql   EXEC [mssqlAep371] @p1=@p1 OUTPUT
tds_test.go:387: 2023-11-06 22:21:20.4491029 +0300 MSK m=+2.940174401 [mssql.go:533]:   @params @p1 nvarchar(4) output

I'll answer later regarding SSMS, need some time...

El-76 commented 11 months ago

Btw, while I am trying SSMS... There are two thoughts regarding the subject:

  1. SSMS will always send type parameters that correspond to the declared vars, e.g.:
DECLARE @p1 nvarchar(30) = 'test'

EXEC mssqlAep371 @p1 = @p1 OUTPUT

I guess the declared type size will be sent.

  1. If we take a look at https://stackoverflow.com/questions/74916169/sql-server-always-encrypted-with-the-jdbc-not-working , its clear that the client has to send exact type sizes.

As a (preliminary) conclusion - it's impossible to implement AE NVARCHAR parameter using current sql library model since args do not contain size info at all (the size is either deduced by the driver using actual value provided or set to its max which is acceptable for non-AE cases , but not acceptable for AE).

shueybubbles commented 11 months ago

yeah that's about what I expected. There are 2 important aspects of strings that the Go sql/driver model doesn't address.

  1. Character encoding/collations/code pages. See https://github.com/microsoft/go-mssqldb/discussions/46
  2. Sizing for OUTPUT buffers.

I'd love to provide a solution but I don't know what developers want the solution to look like. The most straightforward approach is to create a whole new data type like SqlParameter in .Net. I'd prefer that data type be defined in sql package, though, not be go-mssqldb specific.

A shortcut solution for 2 could be to accept a slice of runes as the parameter instead of string. Then it would use cap(p) as the data type length instead of len(p). It might be something to use generics for to support output buffers of any type.

El-76 commented 11 months ago

Answering to your request above regarding SSMS (the procedure name is slightly different since its generated):

  1. It doesn't work, the same error:
    Msg 206, Level 16, State 2, Procedure mssqlAep19, Line 0 [Batch Start Line 0]
    Operand type clash: nvarchar(30) encrypted with (encryption_type = 'RANDOMIZED', encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 'mssqlCek19', column_encryption_key_database_name = 'test') is incompatible with nvarchar(4) encrypted with (encryption_type = 'RANDOMIZED', encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 'mssqlCek19', column_encryption_key_database_name = 'test')
  2. The following query in sp_describe_parameter_encryption is sent:
    exec sp_describe_parameter_encryption N'DECLARE @p1 AS NVARCHAR (4) = @pfb88c3a484014c92a94c9a0406960bac;
    EXECUTE mssqlAep19 @p1 = @p1 OUTPUT;
    ',N'@pfb88c3a484014c92a94c9a0406960bac nvarchar(4)'
El-76 commented 11 months ago

In fact Im cowardly suggesting leaving AE support for output parameters out of the issue scope since it requires much more work including reviewing library model (comparing just to passing a bool flag through) :)

Regarding using cap() as a type length - Im not authority here, but I personally think its not ideal solution since:

  1. Its not explicit
  2. If it (suddenly!) appears we have to pass more than one parameter for type (precision? scale?) to have exact match then cap() obviously won't be an option

Please let me know if I could file a PR fixing OUT parameter w/o AE support.

shueybubbles commented 11 months ago

sure, as long as AE doesn't get any worse than it is now. It's a relatively new feature in the driver so I don't think people are yet clamoring for fixes in it.

El-76 commented 9 months ago

I am sorry for off topic, but I found nothing regarding how to get proper access rights to create branches in the repo to file a PR... Could you please help? Or do I have to fork?

shueybubbles commented 9 months ago

you'll need to fork