camunda / camunda-bpm-platform

Flexible framework for workflow and decision automation with BPMN and DMN. Integration with Quarkus, Spring, Spring Boot, CDI.
https://camunda.com/
Apache License 2.0
4.12k stars 1.56k forks source link

Support new syntax of pagination in MS SQL server #3075

Closed mojtaba-khallash closed 1 year ago

mojtaba-khallash commented 1 year ago

User Story (Required on creation)

When using bpms server with sql server database in operational environment, received numerous reports of slow speed. After searching the forum, we came across this similar case, which was not done due to lack of follow-up.

Functional Requirements (Required before implementation)

Currently, row_number() used for pagination of results in MS SQL Server.

SELECT *
FROM (
    SELECT      RES.*,
            row_number() over (ORDER BY ID_ ASC) rnk
    FROM        ACT_RU_TASK RES
) SUB
WHERE       SUB.rnk >= 1 
AND     SUB.rnk < 16 
ORDER BY    SUB.rnk;

The executive plan of this query is as follows:

image


The OFFSET FETCH keyword is supported from the 2012 version, which has better performance and less physical reads.

SELECT *
FROM (
    SELECT      *
    FROM        ACT_RU_TASK
    ORDER BY    ID_
    OFFSET 0 ROWS FETCH NEXT 15 ROWS ONLY
) SUB

The executive plan of this query is as follows:

image

Comparisons have been made between these two methods.

  1. Comparing performance for different SQL Server paging methods
  2. SQL Server v.Next (Denali) : Using the OFFSET clause (paging)

Technical Requirements (Required before implementation)

The necessary commands to change the pagination method are in the DbSqlSessionFactory class and in the following variables:

We can change the pagination method by changing these variables

Links

mboskamp commented 1 year ago

Hi @mojtaba-khallash,

Please stick to the template for feature request issues. The format that the template enforces helps us make collect the information necessary to make a decision about a feature.

Specifically, it is unclear what kind of improvements a new pagination syntax for MS SQL would bring. Does this work with all the supported MS SQL versions? Please answer the questions while you fill out the feature request template.

I see you already submitted a PR, thank you very much for that. I will give some feedback in the PR but bear in mind, that we still need to make sure we understand the impact here before we decide if we want to change the syntax.

Thanks! Miklas

mojtaba-khallash commented 1 year ago

Hi @mboskamp

OFFSET-FETCH is a new feature added to MS SQL 2012. I am not an expert in this field, but this has been discussed in stackoverflow.

mboskamp commented 1 year ago

Hi @mojtaba-khallash

Sorry, but we need more information before implementing changes like this. Please apply the issue template and answer the questions. It is unclear how big of an impact the change has on performance. However, it does bring quite some risk to completely change the pagination of MS SQL Server. Before we can implement something like this a proper analysis is mandatory. Unfortunately, we can not invest the time necessary to make that analysis in the scope of a community issue without a good understanding of the possible advantages that users can expect.

If you want to drive this matter forward, we are happy to assist you. In that case, please start by filling out the issue template. :)

Thanks! Miklas

mboskamp commented 1 year ago

Hi @mojtaba-khallash, I haven't heard back from you in a while now. I will close this issue for now. Feel free to open a new one if you want to push this matter further. :)

mojtaba-khallash commented 1 year ago

Hi @mboskamp

I fixed the template issue, but I don't know exactly what else you need?

mboskamp commented 1 year ago

Hi @mojtaba-khallash, great to see that you are still interested in this issue, and thanks for providing more information in the issue description.

I think my previous comment is still valid:

It is unclear how big of an impact the change has on performance.

Can you show some numbers that show how big the performance improvement is? Please attach query plans with and without the change that show the performance improvements.

Cheers, Miklas

mojtaba-khallash commented 1 year ago

Hi @mboskamp

Single Table

I created a table with two columns, primary key and string, and then added two million random records to it.

CREATE Table largeTable
(
   id int identity primary key,
   text_col varchar(20)
);

I considered the following for evaluation:

Each of the scenarios was executed ten times and the average execution time was recorded in milliseconds

Paging syntax first page last page
old 4 447
new 2 228
Paging syntax first page last page
old 6 484
new 1 275
Paging syntax first page last page
old 1397 1799
new 449 1110
Paging syntax first page last page
old 1409 1721
new 1157 1611

Query plans of primary key column

plan-old

plan-new

Query plans of string column

plan-old

plan-new

Multiple Table

I created four tables with two columns, a primary key, and a string that have the following relationships, and then added two million random records to each of them.

image

The following request has been reviewed for testing.

SELECT * FROM (
    SELECT  *, 
        row_number() over (ORDER BY t1.id1 desc) rnk
    FROM t1
    LEFT JOIN t2 ON   t2.id2 = t1.fk2
    LEFT JOIN t3 ON   t3.id3 = t2.fk3
    LEFT JOIN t4 ON   t4.id4 = t2.fk4
) SUB
WHERE SUB.rnk >= 1000001
AND SUB.rnk < 1000011 
ORDER BY SUB.rnk
SELECT * FROM (
    SELECT  *
    FROM t1
    LEFT JOIN t2 ON   t2.id2 = t1.fk2
    LEFT JOIN t3 ON   t3.id3 = t2.fk3
    LEFT JOIN t4 ON   t4.id4 = t2.fk4
    ORDER BY t1.id1 desc 
    OFFSET 1000000 ROWS FETCH NEXT 10 ROWS ONLY
) SUB

Each of the scenarios was executed ten times and the average execution time was recorded in milliseconds.

Paging syntax first page last page
old 4 5473
new 5 2552
Paging syntax first page last page
old 4 6051
new 5 2324
Paging syntax first page last page
old 212 5204
new 208 3050
Paging syntax first page last page
old 614 5422
new 344 2851

Query plans of queries

mboskamp commented 1 year ago

Hi @mojtaba-khallash, Thank you for your answer. Unfortunately, this was not what I was looking for. You proved that OFFSET FETCH performs better over row_number() in an isolated setup which I don't doubt. What is interesting for us is how much would this change affect the Camunda queries in an actual production/test environment?

In #3076 you shared that you changed the queries in your environment and saw the improvements. Can you please share query plans for queries in that scenario? There is an example query in the CAM ticket that you linked from this issue.

Thanks, Miklas

mboskamp commented 1 year ago

@mojtaba-khallash, To reduce the number of issues that are open on our end, I will close this issue for now. This does not mean we are not interested in your proposed changes. When you can provide the information I asked for, feel free to post them here and we can reopen the issue.