Open aersam opened 3 years ago
Hi @aersamkull
SQL Server's support for Collation is limited to only these types and doesn't apply to all datatypes: BIGCHARTYPE, BIGVARCHARTYPE, TEXTTYPE, NTEXTTYPE, NCHARTYPE, or NVARCHARTYPE as documented here: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-tds/cbe9c510-eae6-4b1f-9893-a098944d430a
Due to this, the driver needs to make decisions internally to backfill applicable LCID, from locale or collation information available from server. It's possible because of this limited support, collation information is not exposed, as it's also not exposed by any other client driver for SQL Server.
Also would like to add, if not explored already, you can gather collation information using T-SQL and design your own application logic to handle specific cases: https://docs.microsoft.com/en-us/sql/relational-databases/collations/view-collation-information
Yes, it's available only to text columns and I do not expect it to return anything other then null or an exception for non-text columns.
I am not surprised that no other SQL Server client exposes this yet, as I expect this one to be the best driver ;) and also, it certainly is a low-level and rarely used feature. But in combination with #1208 has a lot of performance potential and as you can see in the Stackoverflow post is important enough for people to use reflection.
Db Metadata are really not an option for Stored Procedures. Just a short sample for a worst-cast proc:
CREATE PROC dbo.tester
@Param varchar(100)
aS
BEGIN
IF @Param='1'
BEGIN
cREATE TABLE #testTable(t varchar(100) COLLATE SQL_Latin1_General_CP1_CI_AS , #t2 varchar(100) collate Latin1_General_CI_AS)
SELECT * FROM #testTable
END
ELSE IF @PAram='2'
bEGIN
EXEC ('SELECT "sf" from "tester2"') AT [SOMESERVER};
END
ELSE IF @Param='1'
BEGIN
cREATE TABLE #testTable2(t varchar(100) /* Whats the tempdb collation? */ , #t2 varchar(100) collate Latin1_General_CI_AS)
SELECT * FROM #testTable2
END
END
You can't know the collation here until you execute it. The only way to get collation is within SqlDataReader itself
@aersamkull it's important to note that this feature would only be useful in the following scenario:
A dynamic database programming where perf would be important enough to avoid an extra encoding conversion seems like a very unlikely scenario. So IMHO this proposal would be a micro-optimization for an edge case.
@roji : It's really late so I don't have time to comment in as much detail as I would like, so I will add more tomorrow. But for the moment I will say that your first point (re: #1208) might be valid, but I don't believe your second point (re: column collation is usually known) is:
COLLATE
Also, the LCID
and ComparisonStyle
info should already be there and just need to be marked public
. At least they are in the original SqlClient (I haven't checked to see if that code was duplicated here).
I hope this info helps, and again, I will add more details here and to #1208 later today / tomorrow.
Applications don't always dictate the collation of all columns and/or the DB itself. This has nothing to do with generic/dynamic DB programming.
Apart from that, applications don't always dictate the DB schema/setup, but in my experience they do in most cases.
Even if the desire for this info were limited to generic/dynamic DB programming, that doesn't make it any less valid than the thousands of other properties and methods available in framework libraries that many people never use.
I never said it isn't valid, but prioritization of engineering resources time (and even conversation time!) must take into account the feature's value.
This is more accurately an infrequent case more than an edge case.
I'd personally estimate it at extremely infrequent, given that the need of reading strings as raw bytes has only been requested now, by one user, in the history of SqlClient, and the actual gains from avoiding the decoding is very unlikely to matter in all but the most high-perf applications. #1208 is a micro-optimization, and this issue is about making that micro-optimization usable in a wider range of scenarios.
The use case for the information is very narrow. The information is already present in the query data for any recordset so we have it internally we just don't expose a way to get to it, adding a readonly property to get the LCID or equivalent form doesn't seem like a big problem.
At least they are in the original SqlClient (I haven't checked to see if that code was duplicated here).
I couldn't see them in the public api surface but I might have missed them. Logically they should be on SqlDataReader directly or on the GetColumnSchema result somewhere.
Just had a quick look at the source and it seems the Collation is an object with flags in the protocol, not really a string. If this is done I think one had to expose a new more user friendly immutable object, basically just this: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-tds/3d29e8dc-218a-42c6-9ba4-947ebca9fd7e . With Encoding and SqlCompare Options added for simplicity
See also src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParserHelperClasses.cs#L83
Is your feature request related to a problem? Please describe.
It is not currently possible to get the collation of a column out of a SqlDataReader. This was helpful as a followup of #1208 which allows getting raw bytes of a column and a collation would help interpret it. Also see https://stackoverflow.com/questions/35190858/how-to-get-the-collation-of-string-fields-in-a-result-set-via-sqldatareader
Describe the solution you'd like
I'd like some method to get this metadata. I am currently not sure which is the best way to do it, maybe extend
GetColumnSchema
or introduce aGetCollation
method on SqlDataReader.Describe alternatives you've considered
Use database metadata Does not work with stored procedures. Though there is
sp_describe_first_result_set
, this is quite limited; eg Temporary tables do not work.Use some config from the user This is an option, but wrong config would lead to bugs that are very hard to find (as it may work if using characters from 0-127 only). Also, it's duplicate information as SqlClient need the Collation anyway.