agiletiger / ojotas

The database-first ORM
MIT License
7 stars 4 forks source link

feat: provide performance hints #13

Open nicoabie opened 10 months ago

nicoabie commented 10 months ago

at compile time we could analyze queries against a prod instance and provide useful insights to the user. like the need to create indexes.

more on the next comment

Crain-32 commented 10 months ago

More information regarding the idea.

It would be to include an optimization check into the Compile Time check. My recommendation would be some modifications in the Type Generation to include additional metadata that can be discarded after the compile time check is done. I'd recommend leaving the scope to single column indexes while ironing it out, and then maybe target multi-column indexes. Single Column indexes are very common, so having the DB Library recommend them shouldn't be too bad.

Do note that Postgres states there is no silver bullet for Indexes, in an ideal world we'd probably hook into a production DB, check if some variant of ANALYZE exists, and run it for all the queries. Instead this feature is to suggest good places for developers less familiar with DBA where to look.

A practical example should help showcase the idea. We'll have two tables to support a Customer/Order Relationship.

CREATE TABLE customer (
     id   serial   PRIMARY KEY,
     name text     NOT NULL,
     age  integer  
);
CREATE TABLE order (
     id       serial      PRIMARY KEY,
     customer integer     NOT NULL,
     total    varchar(64) NOT NULL DEFAULT '0.00',
     currency varchar(3)  NOT NULL DEFAULT 'USD',
     FOREIGN KEY (customer)
         REFERENCES customer (id)
);

Now lets add in some queries to reference. (Apologizes for any bad syntax, I rarely just yolo SQL like this)

-- $1 would be a customer ID. Query in general is "Get total order count by customer Id"
SELECT COUNT(order.id) FROM order INNER JOIN customer on order.customer=customer.id WHERE customer.id = $1 

-- $1 is a currency code, Query in general is "Get average customer age by order currency"
SELECT AVG(customer.id) FROM customer JOIN order on order.customer=customer.id WHERE order.currency = $1

Hopefully that is enough to work with, since I can't easily come up with more examples quickly.

In this case, ojotas would recommend an index on the order.customer column. We can see there is no index there, and both of the queries the application would run (in this small example) utilize a JOIN on that column. This points towards an area to review for "free" performance.

Naturally any hint system should have an opt-out feature, since if you do the review and find no reason to utilize the tip, you don't want to see it anymore. I'd recommend the following addition to the ojotasrc.json

{
  "hints": {
      "enabled": boolean,
      "ignore": [
          {
             "table": string,
             "column": string?,
             "hint_type": enum
          }
      ]
   }
}

Where if enabled is true, then ignore factors in. If hints is false, then no hint scanning is done. When hints are enabled, than a declared table, or table/column pair can be excluded from specific hints (Assuming more hints get added.

As an example, lets display a configuration option to disable hint scanning on our order table, and then our order/customer Table Column Pair.

{
  "hints": {
      "enabled": true,
      "ignore": [
          {
             "table": "order",
             "column": null,
             "hint_type": null
          }
      ]
   }
}
{
  "hints": {
      "enabled": true,
      "ignore": [
          {
             "table": "order",
             "column": "customer",
             "hint_type": "INDEX"
          }
      ]
   }
}

column could be adjusted to columns if it is preferred to have a string[] instead.

nicoabie commented 10 months ago

Another thing we could do related to this is that if we are able to analyze al the sql queries, we could suggest the user to create composite indexes that could satisfy many queries instead of having many secondary indexes.

nicoabie commented 10 months ago

Also we could detect cases of index obfuscation

ivanlolivier commented 9 months ago

Another idea, show a warning when using OR in the ON clause of a JOIN

nicoabie commented 9 months ago

Maybe suggest to use union all for each branch

nicoabie commented 9 months ago

video on index obfuscation https://planetscale.com/learn/courses/mysql-for-developers/queries/redundant-and-approximate-conditions?autoplay=1 and how to mitigate it