directus / directus

The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.
https://directus.io
Other
27.38k stars 3.82k forks source link

Slow schema introspection query with Azure SQL #19486

Closed KonstantinSolv closed 2 months ago

KonstantinSolv commented 1 year ago

Describe the Bug

We are self hosting with Docker v10.5.2 on Azure. We use Web App and Azure SQL (Elastic Pools). Both services have sufficient resources and we are not stressing any limits.

I checked the database and we have 1010 entries in dbo.directus_fields.

After every login:

image

It takes around 12 seconds to GET fields:

image

After the initial wait everything is blazing fast. But every login takes 12 sec.

After debugging it became clear that this issue is caused due to everytime querying schema introspection:

023-08-22T09:06:27.701684086Z: [INFO] [09:06:27.176] TRACE: [11070.976ms] select

2023-08-22T09:06:27.701724787Z: [INFO] [o].[name] AS [table],

2023-08-22T09:06:27.701731587Z: [INFO] [c].[name] AS [name],

2023-08-22T09:06:27.701736587Z: [INFO] [t].[name] AS [data_type],

2023-08-22T09:06:27.701741887Z: [INFO] [c].[max_length] AS [max_length],

2023-08-22T09:06:27.701746788Z: [INFO] [c].[precision] AS [numeric_precision],

2023-08-22T09:06:27.701751488Z: [INFO] [c].[scale] AS [numeric_scale],

2023-08-22T09:06:27.701756188Z: [INFO] CASE WHEN [c].[is_nullable] = 0 THEN

2023-08-22T09:06:27.701761088Z: [INFO] 'NO'

2023-08-22T09:06:27.701765488Z: [INFO] ELSE

2023-08-22T09:06:27.701770088Z: [INFO] 'YES'

2023-08-22T09:06:27.701774788Z: [INFO] END AS [is_nullable],

2023-08-22T09:06:27.701779689Z: [INFO] object_definition ([c].[default_object_id]) AS [default_value],

2023-08-22T09:06:27.701784689Z: [INFO] [i].[is_primary_key],

2023-08-22T09:06:27.701789489Z: [INFO] [i].[is_unique],

2023-08-22T09:06:27.701794089Z: [INFO] CASE [c].[is_identity]

2023-08-22T09:06:27.701798589Z: [INFO] WHEN 1 THEN

2023-08-22T09:06:27.701802989Z: [INFO] 'YES'

2023-08-22T09:06:27.701807389Z: [INFO] ELSE

2023-08-22T09:06:27.701811789Z: [INFO] 'NO'

2023-08-22T09:06:27.701816190Z: [INFO] END AS [has_auto_increment],

2023-08-22T09:06:27.701821090Z: [INFO] OBJECT_NAME ([fk].[referenced_object_id]) AS [foreign_key_table],

2023-08-22T09:06:27.701825990Z: [INFO] COL_NAME ([fk].[referenced_object_id],

2023-08-22T09:06:27.701830790Z: [INFO] [fk].[referenced_column_id]) AS [foreign_key_column],

2023-08-22T09:06:27.701835690Z: [INFO] [cc].[is_computed] as [is_generated],

2023-08-22T09:06:27.701852791Z: [INFO] [cc].[definition] as [generation_expression] from [directus].[sys].[columns] [c] JOIN [sys].[types] [t] ON [c].[user_type_id] = [t].[user_type_id] JOIN [sys].[tables] [o] ON [o].[object_id] = [c].[object_id] JOIN [sys].[schemas] [s] ON [s].[schema_id] = [o].[schema_id] LEFT JOIN [sys].[computed_columns] AS [cc] ON [cc].[object_id] = [c].[object_id] AND [cc].[column_id] = [c].[column_id] LEFT JOIN [sys].[foreign_key_columns] AS [fk] ON [fk].[parent_object_id] = [c].[object_id] AND [fk].[parent_column_id] = [c].[column_id] LEFT JOIN (

2023-08-22T09:06:27.701859291Z: [INFO] SELECT

2023-08-22T09:06:27.701863691Z: [INFO] [ic].[object_id],

2023-08-22T09:06:27.701868491Z: [INFO] [ic].[column_id],

2023-08-22T09:06:27.701873091Z: [INFO] [ix].[is_unique],

2023-08-22T09:06:27.701877791Z: [INFO] [ix].[is_primary_key],

2023-08-22T09:06:27.701882492Z: [INFO] MAX([ic].[index_column_id]) OVER(partition by [ic].[index_id], [ic].[object_id]) AS index_column_count,

2023-08-22T09:06:27.701887392Z: [INFO] ROW_NUMBER() OVER (

2023-08-22T09:06:27.701891992Z: [INFO] PARTITION BY [ic].[object_id], [ic].[column_id]

2023-08-22T09:06:27.701896992Z: [INFO] ORDER BY [ix].[is_primary_key] DESC, [ix].[is_unique] DESC

2023-08-22T09:06:27.701901692Z: [INFO] ) AS index_priority

2023-08-22T09:06:27.701906492Z: [INFO] FROM

2023-08-22T09:06:27.701911192Z: [INFO] [sys].[index_columns] [ic]

2023-08-22T09:06:27.701915992Z: [INFO] JOIN [sys].[indexes] AS [ix] ON [ix].[object_id] = [ic].[object_id]

2023-08-22T09:06:27.701921693Z: [INFO] AND [ix].[index_id] = [ic].[index_id]

2023-08-22T09:06:27.701926593Z: [INFO] ) AS [i]

2023-08-22T09:06:27.701931293Z: [INFO] ON [i].[object_id] = [c].[object_id]

2023-08-22T09:06:27.701935893Z: [INFO] AND [i].[column_id] = [c].[column_id]

2023-08-22T09:06:27.701940593Z: [INFO] AND ISNULL([i].[index_column_count], 1) = 1

2023-08-22T09:06:27.701945393Z: [INFO] AND ISNULL([i].[index_priority], 1) = 1 where [s].[name] = @p0 [dbo]

To Reproduce

  1. Login
  2. Wait 12 seconds
  3. Then everything ok

Directus Version

v10.5.2

Hosting Strategy

Self-Hosted (Docker Image)

br41nslug commented 1 year ago

Could you set LOG_LEVEL=trace to see if there are longrunning SQL queries slowing things down on the server?

KonstantinSolv commented 1 year ago

I believe I can't use Profiler and set LOG_LEVEL=trace because its a Azure SQL database.

I see the load peeking to 25% on the SQL server tough during this login/fields getting period.

br41nslug commented 1 year ago

What flavor of SQL did they use for Azrue SQL? Or better put which driver are you using to connect to it? as it is not a natively supported vendor for directus. Why would you not be able to set the LOG_LEVEL=trace Directus environment variable?

KonstantinSolv commented 1 year ago

Did some debugging with help of br41nslug and we found that the following query basically takes all the time. It's schema introspection from [directus].[sys].[columns].

I understood that Directus could not do anything about it as its caused by Azure SQL cloud database.

2023-08-22T09:06:27.701684086Z: [INFO]  [09:06:27.176] TRACE: [11070.976ms] select 

2023-08-22T09:06:27.701724787Z: [INFO]          [o].[name] AS [table],

2023-08-22T09:06:27.701731587Z: [INFO]          [c].[name] AS [name],

2023-08-22T09:06:27.701736587Z: [INFO]          [t].[name] AS [data_type],

2023-08-22T09:06:27.701741887Z: [INFO]          [c].[max_length] AS [max_length],

2023-08-22T09:06:27.701746788Z: [INFO]          [c].[precision] AS [numeric_precision],

2023-08-22T09:06:27.701751488Z: [INFO]          [c].[scale] AS [numeric_scale],

2023-08-22T09:06:27.701756188Z: [INFO]          CASE WHEN [c].[is_nullable] = 0 THEN

2023-08-22T09:06:27.701761088Z: [INFO]            'NO'

2023-08-22T09:06:27.701765488Z: [INFO]          ELSE

2023-08-22T09:06:27.701770088Z: [INFO]            'YES'

2023-08-22T09:06:27.701774788Z: [INFO]          END AS [is_nullable],

2023-08-22T09:06:27.701779689Z: [INFO]          object_definition ([c].[default_object_id]) AS [default_value],

2023-08-22T09:06:27.701784689Z: [INFO]          [i].[is_primary_key],

2023-08-22T09:06:27.701789489Z: [INFO]          [i].[is_unique],

2023-08-22T09:06:27.701794089Z: [INFO]          CASE [c].[is_identity]

2023-08-22T09:06:27.701798589Z: [INFO]          WHEN 1 THEN

2023-08-22T09:06:27.701802989Z: [INFO]            'YES'

2023-08-22T09:06:27.701807389Z: [INFO]          ELSE

2023-08-22T09:06:27.701811789Z: [INFO]            'NO'

2023-08-22T09:06:27.701816190Z: [INFO]          END AS [has_auto_increment],

2023-08-22T09:06:27.701821090Z: [INFO]          OBJECT_NAME ([fk].[referenced_object_id]) AS [foreign_key_table],

2023-08-22T09:06:27.701825990Z: [INFO]          COL_NAME ([fk].[referenced_object_id],

2023-08-22T09:06:27.701830790Z: [INFO]            [fk].[referenced_column_id]) AS [foreign_key_column],

2023-08-22T09:06:27.701835690Z: [INFO]          [cc].[is_computed] as [is_generated],

2023-08-22T09:06:27.701852791Z: [INFO]          [cc].[definition] as [generation_expression] from [directus].[sys].[columns] [c] JOIN [sys].[types] [t] ON [c].[user_type_id] = [t].[user_type_id] JOIN [sys].[tables] [o] ON [o].[object_id] = [c].[object_id] JOIN [sys].[schemas] [s] ON [s].[schema_id] = [o].[schema_id] LEFT JOIN [sys].[computed_columns] AS [cc] ON [cc].[object_id] = [c].[object_id] AND [cc].[column_id] = [c].[column_id] LEFT JOIN [sys].[foreign_key_columns] AS [fk] ON [fk].[parent_object_id] = [c].[object_id] AND [fk].[parent_column_id] = [c].[column_id] LEFT JOIN (

2023-08-22T09:06:27.701859291Z: [INFO]            SELECT

2023-08-22T09:06:27.701863691Z: [INFO]              [ic].[object_id],

2023-08-22T09:06:27.701868491Z: [INFO]              [ic].[column_id],

2023-08-22T09:06:27.701873091Z: [INFO]              [ix].[is_unique],

2023-08-22T09:06:27.701877791Z: [INFO]              [ix].[is_primary_key],

2023-08-22T09:06:27.701882492Z: [INFO]              MAX([ic].[index_column_id]) OVER(partition by [ic].[index_id], [ic].[object_id]) AS index_column_count,

2023-08-22T09:06:27.701887392Z: [INFO]              ROW_NUMBER() OVER (

2023-08-22T09:06:27.701891992Z: [INFO]                PARTITION BY [ic].[object_id], [ic].[column_id]

2023-08-22T09:06:27.701896992Z: [INFO]                ORDER BY [ix].[is_primary_key] DESC, [ix].[is_unique] DESC

2023-08-22T09:06:27.701901692Z: [INFO]              ) AS index_priority

2023-08-22T09:06:27.701906492Z: [INFO]            FROM

2023-08-22T09:06:27.701911192Z: [INFO]              [sys].[index_columns] [ic]

2023-08-22T09:06:27.701915992Z: [INFO]            JOIN [sys].[indexes] AS [ix] ON [ix].[object_id] = [ic].[object_id]

2023-08-22T09:06:27.701921693Z: [INFO]              AND [ix].[index_id] = [ic].[index_id]

2023-08-22T09:06:27.701926593Z: [INFO]          ) AS [i]

2023-08-22T09:06:27.701931293Z: [INFO]          ON [i].[object_id] = [c].[object_id]

2023-08-22T09:06:27.701935893Z: [INFO]          AND [i].[column_id] = [c].[column_id]

2023-08-22T09:06:27.701940593Z: [INFO]          AND ISNULL([i].[index_column_count], 1) = 1

2023-08-22T09:06:27.701945393Z: [INFO]          AND ISNULL([i].[index_priority], 1) = 1 where [s].[name] = @p0 [dbo]
DanielBiegler commented 1 year ago

According to Microsoft Docs Azure SQL seems to be using the Microsoft SQL Server database engine. I'd love to see a comparison between different engines doing the same query to determine if its an issue specific to MS SQL or in general after enough collections/fields.

@KonstantinSolv so you have roughly 1000 fields across how many collections? We'd need to make a comparable snapshot to investigate further. Thanks so much for your work looking into the specific query!

wewerak47 commented 7 months ago

We have the same issue. The schema query is taking as much as 15s to load. We are using Azure SQL Standard S1 20 DTU. We have around 640 fields/columns across 73 collections/tables including the Directus fields and collections. I restored the database, just for some testing and in the restored database, there is totally different execution plan and it completes in a few miliseconds. I attached the plans, so you can check it out. Directus_ExecPlan.zip

pmorin17 commented 5 months ago

We're also having an issue similar to this. We're self-hosting Directus with the docker image on Azure with a MSSQL database S1 10 DTU. After the login, the request "/fields" takes about 30 seconds. The same thing also happens everytime we create or edit a field.

Restoring the database helps fixing this issue, but only for a brief moment.

We have 58 collections with a total of 454 fields. Only our development environments seems to be affected for now.

This issue is making it difficult for us to work with Directus.

alexvdvalk commented 4 months ago

When this occurs, have you got a Redis cache setup on your Directus project?

wewerak47 commented 4 months ago

When this occurs, have you got a Redis cache setup on your Directus project?

We do not use Redis. or any other caching.

boring-joey commented 4 months ago

We currently have around 30 collections and are experiencing the same issue in our development environment. We are building towards a commercial license website but are encountering this problem whenever we edit the schema for each field we add or modify.

We researched this issue and found the same query in the Azure Portal under long-running queries: image

When we run this query in Azure Data Studio, it consistently takes around 11 seconds to complete. We understand that the query retrieves schema information from system tables, limiting our ability to optimize with indexes. However, upon splitting the query into parts, we identified the following segment as the primary cause of the delay:

SELECT [ic].[object_id], [ic].[column_id], [ix].[is_unique], [ix].[is_primary_key], MAX([ic].[index_column_id]) OVER (PARTITION BY [ic].[index_id], [ic].[object_id]) AS [index_column_count], ROW_NUMBER() OVER ( PARTITION BY [ic].[object_id], [ic].[column_id] ORDER BY [ix].[is_primary_key] DESC, [ix].[is_unique] DESC ) AS [index_priority] FROM [sys].[index_columns] [ic] JOIN [sys].[indexes] AS [ix] ON [ix].[object_id] = [ic].[object_id] AND [ix].[index_id] = [ic].[index_id] ) AS [i] ON [i].[object_id] = [c].[object_id] AND [i].[column_id] = [c].[column_id] AND ISNULL([i].[index_column_count], 1) = 1 AND ISNULL([i].[index_priority], 1) = 1

image

When this segment is removed, the query execution time drops significantly to: "Total execution time: 00:00:00.022"

We hope this information helps, as it is currently very challenging to update our schema in Directus.

boring-joey commented 4 months ago

I did some digging and found a better query that handles NULLs inside the LEFT JOIN. The result is identical, but the query now runs in x milliseconds. The pull request needs some additional checks and tests.

jaads commented 3 months ago

The problem still exists