maxtepkeev / architect

A set of tools which enhances ORMs written in Python with more features
Other
391 stars 57 forks source link

How to SELECT from partitions with ORM? #34

Closed coffebar closed 7 years ago

coffebar commented 7 years ago

I have a Django model like this:

@architect.install('partition', type='range', subtype='integer', constraint='100', column='search_id')
class SearchResult(models.Model):
    search = models.ForeignKey(Search, on_delete=models.CASCADE)
    ...

When I am selecting with ORM filtering by partition column

SearchResult.objects.filter(search_id=3502).only('id').first()

Django selects from master table:

(0.005) SELECT "app_searchresult"."id" FROM "app_searchresult" WHERE "app_searchresultapp_searchresult"."search_id" = 3502 ORDER BY "app_searchresult"."id" ASC LIMIT 1; args=(3502,)

How I can select something with ORM from tables caled like "app_searchresult_3501_3600"? I believe that I don't need to know a table name for this.

maxtepkeev commented 7 years ago

With table partitioning you just select from the master table as you do without partitioning, database takes care about which partition to query for you. That's the whole point of partitioning. You don't need to know which partition to query for a specific record.

coffebar commented 7 years ago

So when query like this: SearchResult.objects.first() Db is looking only in "app_searchresult" table? And when
SearchResult.objects.filter(search_id__in=[1, 3502]).first()) it's looking in join "app_searchresult_3501_3600" and "app_searchresult_1_100" ? or in "app_searchresult" table again?

And when SearchResult.objects.filter(search_id__in=[3501, 3502]).first()) only in "app_searchresult_3501_3600" ?

maxtepkeev commented 7 years ago

No, it's more complicated than that and it depends on how Django constructs underlying query. This is a big topic, if you want to know low-level details please have a look at the PostgreSQL docs for partitioning. Also you're using the foreign key for partitioning, that is a bad idea with PostgreSQL, see https://github.com/maxtepkeev/architect/issues/27#issuecomment-201199472 for details.

coffebar commented 7 years ago

thanks


if interesting, "explain" for

SearchResult.objects.filter(search_id__in=[1, 3502]).first())

Limit  (cost=0.61..14.05 rows=1 width=4)
   ->  Merge Append  (cost=0.61..3523.75 rows=262 width=4)
         Sort Key: app_searchresult.id
         ->  Index Scan using app_searchresult_pkey on app_searchresult  (cost=0.29..3417.38 rows=256 width=4)
               Filter: (search_id = ANY ('{21,3502,0}'::integer[]))
         ->  Index Scan using app_searchresult_1_100_pkey on app_searchresult_1_100  (cost=0.14..50.77 rows=3 width=4)
               Filter: (search_id = ANY ('{21,3502,0}'::integer[]))
         ->  Index Scan using app_searchresult_3501_3600_pkey on app_searchresult_3501_3600  (cost=0.14..50.77 rows=3 width=4)
               Filter: (search_id = ANY ('{21,3502,0}'::integer[]))
(9 rows)

scan tables: app_searchresult, app_searchresult_1_100 and app_searchresult_3501_3600

Query without "Where":

EXPLAIN SELECT id FROM "app_searchresult" ORDER BY id LIMIT 1;

Limit  (cost=0.77..0.83 rows=1 width=4)
   ->  Merge Append  (cost=0.77..5047.45 rows=78812 width=4)
         Sort Key: app_searchresult.id
         ->  Index Only Scan using app_searchresult_pkey on app_searchresult  (cost=0.29..3123.41 rows=78392 width=4)
         ->  Index Only Scan using app_searchresult_2501_2600_pkey on app_searchresult_2501_2600  (cost=0.14..50.25 rows=140 width=4)
         ->  Index Only Scan using app_searchresult_1_100_pkey on app_searchresult_1_100  (cost=0.14..50.25 rows=140 width=4)
         ->  Index Only Scan using app_searchresult_3501_3600_pkey on app_searchresult_3501_3600  (cost=0.14..50.25 rows=140 width=4)
(7 rows)

Scan all created tables in this partitioning