standardebooks / web

The source code for the Standard Ebooks website.
https://standardebooks.org
Creative Commons Zero v1.0 Universal
234 stars 61 forks source link

Search! Implement FilterEbooks via DB #379

Closed colagrosso closed 2 months ago

colagrosso commented 2 months ago

I based this on Library::FilterArtworks(), but this one will have higher traffic, so I'd appreciate the extra attention. I've included sample queries below of the different SQL that different searches and sort orders will generate. If you'd like me to try a certain case, send me an /ebooks link, e.g.,

https://standardebooks.org/ebooks?page=2&query=boy&sort=reading-ease&tags%5B0%5D=fiction

and I'll provide the corresponding query.

You don't have to go through each case below. I think they look reasonable—especially the default—and performance is good on my machine.

Default for /ebooks, no tags, empty query, and sorted by release date

EXPLAIN SELECT distinct e.*                                           
                from Ebooks e                                                         
                where true                                                            
                order by e.EbookCreated desc
                limit 12                                                              
                offset 0;
+------+-------------+-------+-------+---------------+-------------------+---------+------+------+-------+
| id   | select_type | table | type  | possible_keys | key               | key_len | ref  | rows | Extra |
+------+-------------+-------+-------+---------------+-------------------+---------+------+------+-------+
|    1 | SIMPLE      | e     | index | NULL          | index2            | 5       | NULL | 12   |       |
+------+-------------+-------+-------+---------------+-------------------+---------+------+------+-------+

Query for "spain"

EXPLAIN SELECT distinct e.*                                           
                from Ebooks e
                where true AND match(e.IndexableText) against('spain') 
                order by e.EbookCreated desc            
                limit 12        
                offset 0;
+------+-------------+-------+----------+---------------+-----------+---------+------+------+-----------------------------+
| id   | select_type | table | type     | possible_keys | key       | key_len | ref  | rows | Extra                       |
+------+-------------+-------+----------+---------------+-----------+---------+------+------+-----------------------------+
|    1 | SIMPLE      | e     | fulltext | idxSearch     | idxSearch | 0       |      | 1    | Using where; Using filesort |
+------+-------------+-------+----------+---------------+-----------+---------+------+------+-----------------------------+

Empty query, sorted by length, no index on WordCount, but I think that's ok

EXPLAIN SELECT distinct e.*                                           
                from Ebooks e                                                         
                where true                                                            
                order by e.WordCount                                                  
                limit 12
                offset 0;
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+
|    1 | SIMPLE      | e     | ALL  | NULL          | NULL | NULL    | NULL | 531  | Using filesort |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+

"Memoir" tag selected, sort by author name

EXPLAIN SELECT distinct e.*                                           
                from Ebooks e   
                inner join Contributors con using (EbookId)
                inner join EbookTags et using (EbookId)    
                        inner join Tags t using (TagId)      
                where true AND con.MarcRole = "aut" AND t.Name in ('memoir') 
                order by con.SortName, e.EbookCreated desc
                limit 12                                                              
                offset 12;
+------+-------------+-------+--------+----------------+-----------+---------+-------------------------+------+--------------------------------------------------------+
| id   | select_type | table | type   | possible_keys  | key       | key_len | ref                     | rows | Extra                                                  |
+------+-------------+-------+--------+----------------+-----------+---------+-------------------------+------+--------------------------------------------------------+
|    1 | SIMPLE      | t     | ref    | PRIMARY,index1 | index1    | 1022    | const                   | 1    | Using index condition; Using temporary; Using filesort |
|    1 | SIMPLE      | e     | ALL    | PRIMARY        | NULL      | NULL    | NULL                    | 531  | Using join buffer (flat, BNL join)                     |
|    1 | SIMPLE      | con   | ref    | index1         | index1    | 4       | se.e.EbookId            | 1    | Using where                                            |
|    1 | SIMPLE      | et    | eq_ref | idxUnique      | idxUnique | 8       | se.e.EbookId,se.t.TagId | 1    | Using index; Distinct                                  |
+------+-------------+-------+--------+----------------+-----------+---------+-------------------------+------+--------------------------------------------------------+

Multiple tags selected, sort by author name

EXPLAIN SELECT distinct e.*                                           
                from Ebooks e                                                         
                inner join Contributors con using (EbookId)
                inner join EbookTags et using (EbookId)
                        inner join Tags t using (TagId)
                where true AND con.MarcRole = "aut" AND t.Name in ('memoir','nonfiction') 
                order by con.SortName, e.EbookCreated desc  
                limit 12 
                offset 24;
+------+-------------+-------+--------+----------------+-----------+---------+--------------+------+---------------------------------+
| id   | select_type | table | type   | possible_keys  | key       | key_len | ref          | rows | Extra                           |
+------+-------------+-------+--------+----------------+-----------+---------+--------------+------+---------------------------------+
|    1 | SIMPLE      | e     | ALL    | PRIMARY        | NULL      | NULL    | NULL         | 531  | Using temporary; Using filesort |
|    1 | SIMPLE      | et    | ref    | idxUnique      | idxUnique | 4       | se.e.EbookId | 1    | Using index                     |
|    1 | SIMPLE      | t     | eq_ref | PRIMARY,index1 | PRIMARY   | 4       | se.et.TagId  | 1    | Using where                     |
|    1 | SIMPLE      | con   | ref    | index1         | index1    | 4       | se.e.EbookId | 1    | Using where                     |
+------+-------------+-------+--------+----------------+-----------+---------+--------------+------+---------------------------------+

Everything: Multiple tags selected, sort by author name, query for "nature"

EXPLAIN SELECT distinct e.*                                            
                from Ebooks e                                                          
                inner join Contributors con using (EbookId)  
                inner join EbookTags et using (EbookId)
                        inner join Tags t using (TagId)
                where true AND con.MarcRole = "aut" AND t.Name in ('memoir','nonfiction')  AND match(e.IndexableText) against('nature') 
                order by con.SortName, e.EbookCreated desc
                limit 12                                                               
                offset 0;
+------+-------------+-------+----------+-------------------+-----------+---------+--------------+------+----------------------------------------------+
| id   | select_type | table | type     | possible_keys     | key       | key_len | ref          | rows | Extra                                        |
+------+-------------+-------+----------+-------------------+-----------+---------+--------------+------+----------------------------------------------+
|    1 | SIMPLE      | e     | fulltext | PRIMARY,idxSearch | idxSearch | 0       |              | 1    | Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | et    | ref      | idxUnique         | idxUnique | 4       | se.e.EbookId | 1    | Using index                                  |
|    1 | SIMPLE      | t     | eq_ref   | PRIMARY,index1    | PRIMARY   | 4       | se.et.TagId  | 1    | Using where                                  |
|    1 | SIMPLE      | con   | ref      | index1            | index1    | 4       | se.e.EbookId | 1    | Using where                                  |
+------+-------------+-------+----------+-------------------+-----------+---------+--------------+------+----------------------------------------------+