compasjs / compas

Unified backend tooling
https://compasjs.com
MIT License
12 stars 8 forks source link

code-gen(sql): increase performance when the same `viaXxx` entity is used in `where.$or` clauses #1889

Closed dirkdev98 closed 1 year ago

dirkdev98 commented 2 years ago

This may produce better query plans in some cases, but probably breaks the nested the selects that we have currently.

dirkdev98 commented 2 years ago

A more flexible way would be to accept joins: {} in the query builder, allowing the user to use other generated where helpers. And even use that join in a custom orderBy clause. This may however conflict if we ever want to create things like aggregation helpers.

queryPost({
  joins: {
    tags: {
      type: "innerJoin", // default "leftJoin",
      where: {
        isRecommendedTag: true,
      },
      shortName: "joinedTags",
    },
  },
  where: {
     $raw: query`(${tagWhere({
        isPublic: true
      }, "joinedTags."))`
  }
});

join: { type: "inner" } vs where: { via }

queryPost({ joins: { tags: { type: "innerJoin", } } }).exec(sql);
// -> SELECT p.* FROM "post" p INNER JOIN "tags" t ON t.id = p."tag";

queryPost({ where: { viaTags: {} } }).exec(sql);
// -> SELECT p.* FROM "post" p WHERE p."tag" = ANY (select "id" FROM "tag" t)

Both of these cases will be fast with Postgres. There are different performance characteristics when dealing with many more records or when traversing / joining many tables. The most notable change is however that it is now up to the Compas end user to decide which variant to use. I think that this is unavoidable in the long run. The general recommendation would be to never start with joins except when necessary for a custom orderBy or to deduplicate traversals in the where-clause.

/cc @tjonger , @bjarn

bjarn commented 2 years ago

Nice one! I like it.

Maybe the following syntax might be nice, when reading it feels like a sentence: "query post, inner join tags, where is recommended tag" and this of course will be the same for other types of joins.

queryPost({
  innerJoin: {
    tags: {
      where: {
        isRecommendedTag: true,
      },
      shortName: "joinedTags",
    },
  },
  where: {
     $raw: query`(${tagWhere({
        isPublic: true
      }, "joinedTags."))`
  }
});
dirkdev98 commented 2 years ago

Good suggestion. Seems like a plan.

Another thing we may add is detection for duplicate shortName usage. To prevent unexpected results in the cases that Postgres allows it, or to prevent Postgres from returning an error if we can statically check that already.

dirkdev98 commented 2 years ago

The above idea has a main problem; duplicate result sets for oneToMany joins. This is preventable with for example an DISTINCT ON or GROUP BY. However, both prevent us from using a custom order by which in turn breaks one of the requirements: allowing the user to sort on a joined result.

As mentioned above we need something like custom joins for when necessary for a custom orderBy or to deduplicate traversals in the where-clause. Let's see if we can do this another way. I am renaming this issue to be a bit more clear about the issue.