kitaindia / django-tagging

Automatically exported from code.google.com/p/django-tagging
Other
0 stars 0 forks source link

Refactor TaggedItemManager methods to return QuerySets, without performing seperate queries to retrieve tagged object ids #167

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
The attached patch implements this for the get_intersection_by_model and
get_union_for_model methods, works fine in SQLite but it fails on Postgres
with the following type of error, which after googling, won't be fun to
work around:

    ProgrammingError: column "tests_parrot.state" must appear in the GROUP
BY clause or be used in an aggregate function

Original issue reported on code.google.com by jonathan.buchanan on 2 Nov 2008 at 9:56

Attachments:

GoogleCodeExporter commented 8 years ago
This patch is absolutely necessary when you reach datasets of significant size, 
I have tags with as many as 10K 
items, including all those ids doesn't help the query planner.

I have reworked this to not use custom queries and as such I believe it will 
work with any database django 
supports

Original comment by tjfonta...@gmail.com on 29 Apr 2009 at 8:24

GoogleCodeExporter commented 8 years ago
slight bug in the last patch, this is the updated version that works properly

Original comment by tjfonta...@gmail.com on 30 Apr 2009 at 8:20

Attachments:

GoogleCodeExporter commented 8 years ago
I having some problems with this path.

with a Product.tagged.with_any('nulla') I get follow SQL:

SELECT COUNT(id) FROM `products_product` WHERE `products_product`.`id` IN 
(SELECT
DISTINCT U0.`object_id` FROM `tagging_taggeditem` U0 WHERE (U0.`tag_id` IN (61) 
AND
U0.`content_type_id` = 11 ) GROUP BY U0.`object_id` ORDER BY NULL)

This query get about 42 seconds to run in my database, here is number of itens 
in
tagging tables and product table:

mysql> select count(id) from tagging_tag;
+-----------+
| count(id) |
+-----------+
|       277 |
+-----------+
1 row in set (0.00 sec)

mysql> select count(id) from tagging_taggeditem;
+-----------+
| count(id) |
+-----------+
|    137784 |
+-----------+
1 row in set (0.00 sec)

mysql> select count(id) from products_product;
+-----------+
| count(id) |
+-----------+
|     30663 |
+-----------+
1 row in set (0.00 sec)

I'm reverted path to original code and have hight speed back, so we need to fix 
path
before add in app.

Original comment by philipe.rp@gmail.com on 30 Jul 2009 at 5:00

GoogleCodeExporter commented 8 years ago
Using my patches I'm not seeing this problem, I can iterate 14K in about a 
second, and the select statement modified to match my 
setup returns instantly, and according to EXPLAIN takes 250ms, I am using 
postgresql and not mysql but I suspect the real issue here 
is a missing index or in your choice of storage engine.

SELECT COUNT(id) FROM dms_file WHERE dms_file.id IN (SELECT
DISTINCT U0.object_id FROM tagging_taggeditem U0 WHERE (U0.tag_id IN (1) AND
U0.content_type_id = 9 ) GROUP BY U0.object_id);
 count 
-------
 14653
(1 row)

dms=> select count(id) from tagging_tag;
 count 
-------
    20
(1 row)

dms=> select count(id) from tagging_taggeditem;
 count 
-------
 18357
(1 row)

dms=> select count(id) from tagging_taggeditem;
 count 
-------
 18357

dms=> select count(id) from dms_file;
 count 
-------
 17668
(1 row)

Original comment by tjfonta...@gmail.com on 30 Jul 2009 at 6:27

GoogleCodeExporter commented 8 years ago
@tjfontaine I'll check this at home on a clean project and talk here after.
Thx!

Original comment by philipe.rp@gmail.com on 30 Jul 2009 at 7:41