damilola-dealdey / SQL-Track

0 stars 0 forks source link

Exercise 6: Using EXPLAIN to Optimize SQL Queries #6

Open damilola-dealdey opened 8 years ago

damilola-dealdey commented 8 years ago

https://github.com/damilola-dealdey/SQL-Track/tree/master/Ex6

What do different columns in the output of EXPLAIN mean? What possible values can those have? What is the meaning of those values? We use EXPLAIN to optimize slow SQL queries used in our application. Let’s say we have a comments table in our application that has a foreign key, user_id, referencing to users table. EXPLAINing the query that finds all the comments made by a user gives us following result. mysql> EXPLAIN SELECT * FROM comments WHERE user_id = 41; +-------------+------+---------------+---------+-------+---------+-------------+| select_type | type | key | key_len | ref | rows | Extra |+-------------+------+---------------+---------+-------+---------+-------------+| SIMPLE | ALL | NULL | NULL | NULL | 1002345 | Using where |+-------------+------+---------------+---------+-------+---------+-------------+

mysql> SELECT count(id) FROM comments; +-----------+| count(id) |+-----------+| 1002345 |+-----------+

The value under 'rows' column in the output of EXPLAIN query and SELECT query after it are same. What does it mean? Is the SELECT query optimal? If no, how do we optimize it?

In Rails world, we have something called polymorphic associations (http://guides.rubyonrails.org/association_basics.html#polymorphic-associations). Let’s say in our web application, we can let users comment on photographs and articles. Some of the rows in comments table are represented as following:

mysql> SELECT * FROM comments LIMIT 5; +----+------------------+----------------+---------+| id | commentable_type | commentable_id | user_id |+----+------------------+----------------+---------+| 1 + Article | 1 | 1 |+----+------------------+----------------+---------+| 2 + Photo | 1 | 1 |+----+------------------+----------------+---------+| 3 + Photo | 2 | 2 |+----+------------------+----------------+---------+| 4 + Photo | 2 | 2 |+----+------------------+------------ ----+---------+| 5 + Article | 1 | 2 |+----+------------------+----------------+---------+

When we need to fetch comments of a user on a particular Article or Photo we form a query like:

mysql> EXPLAIN SELECT * FROM comments WHERE commentable_id = 1 AND commentable_type = 'Article' AND user_id = 1; +-------------+------+---------------+---------+-------+---------+-------------+| select_type | type | key | key_len | ref | rows | Extra |+-------------+------+---------------+---------+-------+---------+-------------+| SIMPLE | ALL | NULL | NULL | NULL | 1000025 | Using where |+-------------+------+---------------+---------+-------+---------+-------------+

It seems that we do not have any index on any of the columns. And whole comments table is scanned to fetch those comments. We decide to index columns in comments table to optimize the SELECT query. What column(s) will you index in which order? Ask the exercise creator for a hint if you are confused.

EXPLAIN a SELECT query against one of your databases which employs an INNER JOIN between two tables. What does the output look like? What do the values under different columns mean? Do you get only one row in EXPLAIN's output? Form the same select query in above question using a subquery instead of a JOIN. What does the EXPLAIN output look like now? Which query is better and why?

segunadeleye commented 8 years ago

Let's meet and discuss.