This PR is essentially a follow-up to Danny's PR #221. It reinforces the reasoning behind the changes there. Most of these changes are just from black formatting, the relevant changes are lines 480:482 and 587:591 (I'll highlight these in a comment).
Basically, in #221, the "ignore and continue if combining a Public attribute" logic was added after the class_attribute for said attribute was already joined into the query (here) so queries were still joining those in. The changes in lines 480:482 addresses this, and is responsible for the sample effects I'll describe below.
This came up in the query for accessible Classifications on a particular Obj, generated here.
Currently, that query produces something like this:
SELECT classifications.id,
classifications.created_at,
classifications.modified,
classifications.classification,
classifications.taxonomy_id,
classifications.probability,
classifications.author_id,
classifications.author_name,
classifications.obj_id
FROM classifications
JOIN
(
SELECT classifications_1.id AS id
FROM classifications AS classifications_1
JOIN group_classifications AS group_classifications_1
ON classifications_1.id = group_classifications_1.classification_id
JOIN groups
ON groups.id = group_classifications_1.group_id
JOIN group_users AS group_users_1
ON groups.id = group_users_1.group_id
JOIN users
ON users.id = group_users_1.user_id
WHERE users.id = %(id_1)s) AS anon_1
ON anon_1.id = classifications.id
JOIN
(
SELECT classifications_2.id AS id
FROM classifications AS classifications_2
JOIN taxonomies
ON taxonomies.id = classifications_2.taxonomy_id
JOIN
(
SELECT taxonomies.id AS id
FROM taxonomies
JOIN group_taxonomy AS group_taxonomy_1
ON taxonomies.id = group_taxonomy_1.taxonomie_id
JOIN groups
ON groups.id = group_taxonomy_1.group_id
JOIN group_users AS group_users_2
ON groups.id = group_users_2.group_id
JOIN users
ON users.id = group_users_2.user_id
WHERE users.id = %(id_2)s) AS anon_3
ON anon_3.id = taxonomies.id
JOIN objs
ON objs.id = classifications_2.obj_id) AS anon_2
ON anon_2.id = classifications.id
WHERE classifications.obj_id = %(obj_id_1)s
Note the JOIN objs ON objs.id = classifications_2.obj_id) AS anon_2. When this is EXPLAIN'ed , you'll note a sequential scan on the entire Objs table like this:
With the changes in this PR, you now get a query like this:
SELECT classifications.id,
classifications.created_at,
classifications.modified,
classifications.classification,
classifications.taxonomy_id,
classifications.probability,
classifications.author_id,
classifications.author_name,
classifications.obj_id
FROM classifications
JOIN
(
SELECT classifications_1.id AS id
FROM classifications AS classifications_1
JOIN group_classifications AS group_classifications_1
ON classifications_1.id = group_classifications_1.classification_id
JOIN groups
ON groups.id = group_classifications_1.group_id
JOIN group_users AS group_users_1
ON groups.id = group_users_1.group_id
JOIN users
ON users.id = group_users_1.user_id
WHERE users.id = %(id_1)s) AS anon_1
ON anon_1.id = classifications.id
JOIN
(
SELECT classifications_2.id AS id
FROM classifications AS classifications_2
JOIN taxonomies
ON taxonomies.id = classifications_2.taxonomy_id
JOIN
(
SELECT taxonomies.id AS id
FROM taxonomies
JOIN group_taxonomy AS group_taxonomy_1
ON taxonomies.id = group_taxonomy_1.taxonomie_id
JOIN groups
ON groups.id = group_taxonomy_1.group_id
JOIN group_users AS group_users_2
ON groups.id = group_users_2.group_id
JOIN users
ON users.id = group_users_2.user_id
WHERE users.id = %(id_2)s) AS anon_3
ON anon_3.id = taxonomies.id) AS anon_2
ON anon_2.id = classifications.id
WHERE classifications.obj_id = %(obj_id_1)s
On the same sample query on production that I posted the EXPLAIN screenshot above, this new query gives:
Note the total time of ~4ms instead of 1100ms
The changes in lines 587:591 is more of a preemptive optimization that I don't think has popped up yet. It applies the same sort of "ignore Public attributes" to ComposedAccessControl logic.
This PR is essentially a follow-up to Danny's PR #221. It reinforces the reasoning behind the changes there. Most of these changes are just from black formatting, the relevant changes are lines 480:482 and 587:591 (I'll highlight these in a comment).
Basically, in #221, the "ignore and continue if combining a
Public
attribute" logic was added after theclass_attribute
for said attribute was already joined into the query (here) so queries were still joining those in. The changes in lines 480:482 addresses this, and is responsible for the sample effects I'll describe below.This came up in the query for accessible Classifications on a particular Obj, generated here. Currently, that query produces something like this:
Note the![Screenshot 2021-04-14 155538](https://user-images.githubusercontent.com/17696889/114772343-aca41d80-9d3b-11eb-9d9f-c3182730ada3.png)
JOIN objs ON objs.id = classifications_2.obj_id) AS anon_2
. When this isEXPLAIN
'ed , you'll note a sequential scan on the entireObjs
table like this:With the changes in this PR, you now get a query like this:
On the same sample query on production that I posted the
Note the total time of ~4ms instead of 1100ms
EXPLAIN
screenshot above, this new query gives:The changes in lines 587:591 is more of a preemptive optimization that I don't think has popped up yet. It applies the same sort of "ignore Public attributes" to
ComposedAccessControl
logic.