microsoft / sql-server-samples

Azure Data SQL Samples - Official Microsoft GitHub Repository containing code samples for SQL Server, Azure SQL, Azure Synapse, and Azure SQL Edge

BINARY_CHECKSUM() #1340

Open Aly071194 opened 2 months ago

Aly071194 commented 2 months ago

Hi,

Good day!

I'm currently developing a counterpart script in Snowflake for SQL Server's BINARY_CHECKSUM. Could you please explain how the BINARY_CHECKSUM() function works? Are there any internal functions involved? Additionally, is it possible to use a mathematical formula as an alternative to BINARY_CHECKSUM?

Thank you.

kfrural commented 1 month ago

Functionality of BINARY_CHECKSUM()

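Calculates a binary checksum value (an int) over a row of a table or over a list of expressions
BINARY_CHECKSUM(*) returns the same value for a row as long as the row isn't modified later
Satisfies the basic hash function property, with one caveat:
    Same output for inputs that have the same type and byte representation
    Output usually changes if the input changes, but this is not guaranteed (collisions are possible)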

Key Characteristics

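Ignores columns of noncomparable data types (cursor, image, ntext, text, xml) in its computation
Supports varbinary(max) of any length, but only up to 255 characters of nvarchar(max)
Differs from CHECKSUM() in that it works on the byte representation of the values, so strings that differ only in case can produce different BINARY_CHECKSUM() values even when CHECKSUM() treats them as equal under a case-insensitive collation
Collisions happen occasionally, so use it for change detection only if the application can tolerate an occasionally missed change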
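
To put a rough number on "occasional collisions": BINARY_CHECKSUM() returns a 32-bit int, so even an ideal 32-bit hash would collide sooner than one might expect. The sketch below uses the standard birthday approximation and assumes a uniformly distributed output, which is an idealization and not a documented property of BINARY_CHECKSUM(); real collision rates could be higher. The function name `collision_probability` is made up for this illustration.

```python
import math


def collision_probability(n_rows: int, bits: int = 32) -> float:
    """Approximate chance that at least two of n_rows distinct inputs
    share a checksum, ASSUMING an ideal, uniformly distributed hash
    with the given number of output bits (birthday approximation)."""
    space = 2 ** bits                    # number of possible checksum values
    pairs = n_rows * (n_rows - 1) / 2    # number of distinct row pairs
    return 1.0 - math.exp(-pairs / space)


if __name__ == "__main__":
    for n in (1_000, 77_163, 1_000_000):
        print(f"{n:>9} rows -> {collision_probability(n):.4f}")
```

Under that assumption, a table of roughly 77,000 distinct rows already has about an even chance of containing two rows with the same 32-bit checksum. For a single row update, the idealized chance of a missed change is much smaller (about 1 in 2^32), which is why the function can still be acceptable when an occasional miss is tolerable.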

Alternatives

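HASHBYTES() (for example with MD5 or SHA2_256) is far less likely to return the same result for two different inputs, at the cost of more computation
In Snowflake, standard hash functions such as SHA2() (256-bit digest by default), MD5() or HASH() can be used to detect changes, but none of them reproduces the values returned by BINARY_CHECKSUM(); Microsoft does not document the internal algorithm, so there is no official formula to port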
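
If the goal is to compare rows between SQL Server and Snowflake rather than to reproduce the exact BINARY_CHECKSUM() numbers, one option is to compute a standard digest the same way on both sides. The Python sketch below illustrates the idea and one pitfall: the same text hashes differently depending on its byte encoding. As far as I know, HASHBYTES() over an nvarchar value hashes its UTF-16LE bytes, while Snowflake strings are UTF-8, so the inputs need to be normalized first. `row_fingerprint` and its length-prefix layout are hypothetical choices for this example, not something either product defines.

```python
import hashlib


def row_fingerprint(values, encoding="utf-8"):
    """Return a SHA-256 hex digest for a row (hypothetical helper).

    Each value is converted to text, encoded, and length-prefixed so that
    ("ab", "c") and ("a", "bc") do not hash to the same digest.
    None (NULL) gets its own marker byte.
    """
    h = hashlib.sha256()
    for v in values:
        if v is None:
            h.update(b"\x00")                      # NULL marker
        else:
            data = str(v).encode(encoding)
            h.update(b"\x01" + len(data).to_bytes(4, "big") + data)
    return h.hexdigest()


if __name__ == "__main__":
    row = ["abc", 42, None]
    print(row_fingerprint(row))                        # UTF-8 bytes
    print(row_fingerprint(row, encoding="utf-16-le"))  # different digest
```

The two printed digests differ even though the row is the same, which is the kind of mismatch to expect when hashing nvarchar data in SQL Server and the equivalent string in Snowflake without agreeing on an encoding and a column separator scheme first.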