drizzle-team / drizzle-orm

Headless TypeScript ORM with a head. Runs on Node, Bun and Deno. Lives on the Edge and yes, it's a JavaScript ORM too 😅
https://orm.drizzle.team
Apache License 2.0
23.84k stars 592 forks source link

[BUG]: ordering produces extremely high reads and low performance. #1653

Open titong0 opened 10 months ago

titong0 commented 10 months ago

What version of drizzle-orm are you using?

0.29.0

What version of drizzle-kit are you using?

No response

Describe the Bug

with the following schemas:

export const debatesTable = mysqlTable(
  "debates",
  {
    id: v4IDWithDefault("id").primaryKey(),
    name: varchar("name", { length: 255 }).unique().notNull(),
  },
);
export const debatesRelations = relations(debatesTable, ({ one, many }) => ({
  posts: many(postsTable),
}));

export const postsTable = mysqlTable(
  "posts",
  {
    id: v4IDWithDefault("id").primaryKey(),
    authorId: v4Id("author_id").notNull(),
    content: text("content").notNull(),
    debateId: v4Id("debate_id").notNull(),
    createdAt: timestamp("created_at").defaultNow().notNull(),
    likeCount: mediumint("like_count").default(0).notNull(),
  },
  (table) => {
    return {
      debateIdIdx: index("debate_id_idx").on(table.debateId),
      // this is DESC but added manually because drizzle does not support it yet
      createdAtIdx: index("created_at_idx").on(table.createdAt),
      createdAtDebateIdIdx: index("created_at_debate_id_idx").on(
        table.debateId,
        table.createdAt
      ),
    };
  }
);

export const postsRelations = relations(postsTable, ({ one, many }) => ({
  author: one(usersTable, {
    fields: [postsTable.authorId],
    references: [usersTable.id],
  }),
  debate: one(debatesTable, {
    fields: [postsTable.debateId],
    references: [debatesTable.id],
  }),
}));

The problem happens when running the following query:

db.query.debatesTable.findFirst({
    where: (debate, { eq }) => eq(debate.name, debateName),
    with: {
      posts: {
        orderBy: desc(postsTable.createdAt),
        limit: 16,
      },
    },
  });

among other things, there's a part in the raw SQL that looks like this:

FROM
  (SELECT *, row_number() OVER (
      ORDER BY `debatesTable_posts`.`created_at` DESC)
      FROM `posts` `debatesTable_posts`
      WHERE `debatesTable_posts`.`debate_id` = `debatesTable`.`id`
      LIMIT 16) `debatesTable_posts`), json_array()) AS `posts`

I faced extremely high row reads and I contacted planetscale support, and the person helping me pointed out that calling row_number() seemed to make the engine look into every row. I tried to run a modified version of the query that looked like this:

FROM
  (SELECT *
   FROM `posts` `debatesTable_posts`
   WHERE `debatesTable_posts`.`debate_id` = `debatesTable`.`id`
   LIMIT 16 ORDER BY `debatesTable_posts`.`created_at` DESC) `debatesTable_posts`), json_array()) AS `posts`

and I got around 37 row reads, as opposed to the 21.000 I was getting before.

Expected behavior

Less row reads

Environment & setup

Planetscale, drizzle 0.29.0

Angelelz commented 10 months ago

Related to #1249. The use of some window functions are causing some issues