yajra / laravel-oci8

Oracle DB driver for Laravel via OCI8
https://yajrabox.com/docs/laravel-oci8
MIT License
834 stars 237 forks source link

feat: improve pagination performance #879

Open yajra opened 2 months ago

yajra commented 2 months ago

Improve pagination performance by using ROW_NUMBER() and handling default orders.

fix: #878, #870 test: pagination improvement SQL changes fix: subSelect where in query fix: pagination when using fromSub

yajra commented 2 months ago

Requesting for review @alexc-jeromes and @mixaster, TIA!

mixaster commented 2 months ago

So, @yajra I implemented this in our app to check, and got some new errors on previously working queries, specifically an "Invalid Identifier" in the outer query on the "tablename.*" column.

Also, the initial query that was giving me trouble still took more than 30 seconds.

mixaster commented 2 months ago

Is there a reason why there's a wrapper query? if I take my regular query and just add something like "FETCH FIRST 20 ROWS ONLY", it comes back in 0.28 seconds on a table of 281K records (in SQL Developer).

Honest question, I really don't know too much about the inner workings of Oracle.

mixaster commented 2 months ago

@yajra I did some experimenting....

I replaced compileRowConstraint with the following:

protected function compileRowConstraint($query)
    {
        if ($query->limit == 1 && is_null($query->offset)) {
            return "fetch next {$query->limit} rows only";
        }

        if ($query->offset && is_null($query->limit)) {
            return "offset {$query->offset} rows";
        }

        return "offset {$query->offset} rows fetch next {$query->limit} rows only";
    }

and then compileTableExpression just does a

return "{$sql} {$constraint}";

It seems to work throughout my app in initial testing. Looking for a sanity check. I could make a PR if you'd like.

yajra commented 2 months ago

Unfortunately, there is a need to support a lower Oracle version down to 10g which is why I can't use fetch syntax.

I considered making this configurable and pagination will depend on the stated Oracle version but I couldn't execute it yet.

yajra commented 2 months ago

So, @yajra I implemented this in our app to check, and got some new errors on previously working queries, specifically an "Invalid Identifier" in the outer query on the "tablename.*" column.

Also, the initial query that was giving me trouble still took more than 30 seconds.

Can you provide a snippet to reproduce the invalid identifier error?

Based on my testing, some of my queries only improved by a few milliseconds. But I will still have to check on a bigger application.

mixaster commented 2 months ago

@yajra This QueryBuilder call:

$studies = QueryBuilder::for(Study::query()->withUserCount())
            ->withCount(['sites'])
            ->paginate(getPerPage(10))
            ->withQueryString();

Produced this SQL:

select "STUDIES".*, (select count(*) from "STUDY_USER" inner join "MODEL_HAS_ROLES" on "STUDY_USER"."USER_ID" = "MODEL_HAS_ROLES"."MODEL_ID" inner join "ROLES" on "ROLES"."ID" = "MODEL_HAS_ROLES"."ROLE_ID" where "ROLES"."NAME" in (*****) and "STUDY_USER"."STATUS" = active and "STUDY_USER"."STUDY_ID" = "STUDIES"."ID") as "STUDY_USER_COUNT", (select count(*) from "SITES" inner join "STUDY_SITE" on "SITES"."ID" = "STUDY_SITE"."SITE_ID" where "STUDIES"."ID" = "STUDY_SITE"."STUDY_ID" and "SITES"."DELETED_AT" is null) as "SITES_COUNT" from (select t1.*, row_number() over (order by "NAME" asc) as rn from (select "STUDIES".*, (select count(*) from "STUDY_USER" inner join "MODEL_HAS_ROLES" on "STUDY_USER"."USER_ID" = "MODEL_HAS_ROLES"."MODEL_ID" inner join "ROLES" on "ROLES"."ID" = "MODEL_HAS_ROLES"."ROLE_ID" where "ROLES"."NAME" in (?, ?, ?, ?, ?) and "STUDY_USER"."STATUS" = ? and "STUDY_USER"."STUDY_ID" = "STUDIES"."ID") as "STUDY_USER_COUNT", (select count(*) from "SITES" inner join "STUDY_SITE" on "SITES"."ID" = "STUDY_SITE"."SITE_ID" where "STUDIES"."ID" = "STUDY_SITE"."STUDY_ID" and "SITES"."DELETED_AT" is null) as "SITES_COUNT" from "STUDIES" where "STUDIES"."DELETED_AT" is null order by "NAME" asc) t1) where rn between 1 and 10

and the error was ORA-00904: "STUDIES": invalid identifier

yajra commented 2 months ago

I adjusted the table alias to use the original table in $query->from to preserve the compatibility in selected columns. Can you please check again @mixaster.

I will also add some tests using Eloquent with relation count when I can.

Thanks!

yajra commented 2 months ago

Initial tests result vs 1.1k records:

Before:

select t2.* from ( select rownum AS "rn", t1.* from (select "ROLES".*, (select count(*) from "USERS" inner join "ROLE_USER" on "USERS"."ID" = "ROLE_USER"."USER_ID" where "ROLES"."ID" = "ROLE_USER"."ROLE_ID") as "USERS_COUNT", (select count(*) from "PERMISSIONS" inner join "PERMISSION_ROLE" on "PERMISSIONS"."ID" = "PERMISSION_ROLE"."PERMISSION_ID" where "ROLES"."ID" = "PERMISSION_ROLE"."ROLE_ID") as "PERMISSIONS_COUNT" from "ROLES" order by "ID" desc) t1 ) t2 where t2."rn" between 1 and 10

After:

select "ROLES".*, (select count(*) from "USERS" inner join "ROLE_USER" on "USERS"."ID" = "ROLE_USER"."USER_ID" where "ROLES"."ID" = "ROLE_USER"."ROLE_ID") as "USERS_COUNT", (select count(*) from "PERMISSIONS" inner join "PERMISSION_ROLE" on "PERMISSIONS"."ID" = "PERMISSION_ROLE"."PERMISSION_ID" where "ROLES"."ID" = "PERMISSION_ROLE"."ROLE_ID") as "PERMISSIONS_COUNT" from (select "ROLES".*, row_number() over (order by "ID" desc) as rn from (select "ROLES".*, (select count(*) from "USERS" inner join "ROLE_USER" on "USERS"."ID" = "ROLE_USER"."USER_ID" where "ROLES"."ID" = "ROLE_USER"."ROLE_ID") as "USERS_COUNT", (select count(*) from "PERMISSIONS" inner join "PERMISSION_ROLE" on "PERMISSIONS"."ID" = "PERMISSION_ROLE"."PERMISSION_ID" where "ROLES"."ID" = "PERMISSION_ROLE"."ROLE_ID") as "PERMISSIONS_COUNT" from "ROLES" order by "ID" desc) "ROLES") "ROLES" where rn between 1 and 10
mixaster commented 2 months ago

I had to wait until we had our dev environment available to test... unfortunately, the pagination was still slow with this version.

This query:

select
  *
from
  (
    select
      "ACTIVITY_LOG".*,
      row_number() over (
        order by
          "CREATED_AT" asc
      ) as rn
    from
      (
        select
          *
        from
          "ACTIVITY_LOG"
        order by
          "CREATED_AT" asc
      ) "ACTIVITY_LOG"
  ) "ACTIVITY_LOG"
where
  rn between 1
  and 20

with 800K rows took 23 seconds.

alexc-jeromes commented 2 months ago

@mixaster do you have indexes on that table? Curious to see if/how it changes things. Our tables were 5M or so.

mixaster commented 2 months ago

I do have indexes. it's the inner select I think that's causing the slowdown.

yajra commented 2 months ago

Oic, we could trim down the query to a single inner select. Something like:

select *
from (select users.*, row_number() over (order by rowid) as rn from USERS) users
where rn between 1 and 10

I will try to implement it.

@mixaster, can you confirm if this trimmed query will improve your use case?

mixaster commented 2 months ago

There's still a full scan, so in my test of the raw query it took over 14 seconds.

yajra commented 1 month ago

Apologies, I accidentally tagged v11.6.1 on this branch.

On the good side, this PR got tested and seemed to break https://github.com/yajra/laravel-oci8/issues/891.