microsoft / mssql-django

The Microsoft Django backend for SQL Server provides a connectivity layer for Django on SQL Server or Azure SQL DB.
Other
338 stars 112 forks source link

UUID queries are made with NCHAR #406

Open nils-van-zuijlen opened 1 month ago

nils-van-zuijlen commented 1 month ago

There are some features which are not supported yet. Please check the Limitations first to see if your bug is listed.

Software versions

Table schema and Model

class Foobar(models.Model):
    reference = models.UUIDField(_("reference"), default=uuid.uuid4, primary_key=True)

Database Connection Settings

DATABASES = {
    "default": {
        "ENGINE": "mssql",
        "NAME": os.getenv("DB_NAME"),
        "USER": os.getenv("DB_USER"),
        "PASSWORD": os.getenv("DB_PASSWORD"),
        "HOST": os.getenv("DB_HOST", "127.0.0.1"),
        "PORT": int(os.getenv("DB_PORT", "1433")),
        "OPTIONS": {
            # This database doesn't have any triggers so can use return
            # rows from bulk insert feature
            "return_rows_bulk_insert": True
        },
    }
}

Problem description and steps to reproduce

The UUID column gets created using a CHAR(24) column, and gets a corresponding primary key index, also in CHAR(24).

When executing a query filtering on that column, using Foobar.objects.get(reference=uuid.UUID(...)), the uuid gets passed as an NCHAR to the database. This makes MSSQL do a full index scan, converting all index values to NCHAR to compare them.

Expected behavior and actual behavior

I would expect the UUID to be passed as CHAR values, in order to have parameters match the index and column types without conversion.

Any other details that can be helpful

The workaround we found for our most used queries is to do them manually and insert the uuid directly into the query string, taking care to ensure only valid UUIDs can be inserted at that point to avoid SQL injection. But that approach is not scalable to all queries.

nils-van-zuijlen commented 1 month ago

This is caused by https://github.com/django/django/blob/c5d196a65264136ee6795356871a29f3d22ec52f/django/db/models/fields/__init__.py#L2674

As can be tested with the following query:

cursor.execute(
    "SELECT CONVERT(nvarchar(max), SQL_VARIANT_PROPERTY(%s, 'BaseType'))",
    (uuid.uuid4().hex,)
).fetchall()

[('nvarchar',)]

This can be fixed using native fields, as in #134, because the types will then match between the column and the query:

cursor.execute(
    "SELECT CONVERT(nvarchar(max), SQL_VARIANT_PROPERTY(%s, 'BaseType'))",
    (uuid.uuid4(),)
).fetchall()

[('uniqueidentifier',)]

mShan0 commented 1 month ago

Thanks for bringing this up. We will look into adding this.