Open wagnerand opened 3 years ago
There are a couple compounding issues causing this:
distinct()
, Django wraps the full query in a subquery and counts on that. We can try to at least force it to avoid needlessly selecting the columns when it does that, locally it does seem to improve things a tiny bit.Closing because mozilla/addons-server#15401 should help, please re-open if that's no the case once it hits prod.
Unfortunately, the queries are still slow, I tried a few times and response times were between 15 and 20 seconds.
The rather dramatic perf hit can be seen on Grafana: https://earthangel-b40313e5.influxcloud.net/d/sGEtfYcZk/amo-reviewer-tools-performance-and-monitoring?viewPanel=8&orgId=1&from=now-90d&to=now
The problematic query in all its glory:
SELECT DISTINCT `addons`.`created`,
`addons`.`modified`,
`addons`.`id`,
`addons`.`guid`,
`addons`.`slug`,
`addons`.`name`,
`addons`.`defaultlocale`,
`addons`.`addontype_id`,
`addons`.`status`,
`addons`.`icontype`,
`addons`.`icon_hash`,
`addons`.`homepage`,
`addons`.`supportemail`,
`addons`.`supporturl`,
`addons`.`description`,
`addons`.`summary`,
`addons`.`developercomments`,
`addons`.`eula`,
`addons`.`privacypolicy`,
`addons`.`averagerating`,
`addons`.`bayesianrating`,
`addons`.`totalreviews`,
`addons`.`textreviewscount`,
`addons`.`weeklydownloads`,
`addons`.`hotness`,
`addons`.`average_daily_users`,
`addons`.`last_updated`,
`addons`.`inactive`,
`addons`.`target_locale`,
`addons`.`contributions`,
`addons`.`current_version`,
`addons`.`experimental`,
`addons`.`reputation`,
`addons`.`requires_payment`,
`versions`.`created`,
`versions`.`modified`,
`versions`.`id`,
`versions`.`addon_id`,
`versions`.`license_id`,
`versions`.`releasenotes`,
`versions`.`approvalnotes`,
`versions`.`version`,
`versions`.`nomination`,
`versions`.`reviewed`,
`versions`.`deleted`,
`versions`.`source`,
`versions`.`channel`,
`versions`.`git_hash`,
`versions`.`needs_human_review`,
`editors_autoapprovalsummary`.`created`,
`editors_autoapprovalsummary`.`modified`,
`editors_autoapprovalsummary`.`version_id`,
`editors_autoapprovalsummary`.`is_locked`,
`editors_autoapprovalsummary`.`has_auto_approval_disabled`,
`editors_autoapprovalsummary`.`is_promoted_prereview`,
`editors_autoapprovalsummary`.`should_be_delayed`,
`editors_autoapprovalsummary`.`is_blocked`,
`editors_autoapprovalsummary`.`verdict`,
`editors_autoapprovalsummary`.`weight`,
`editors_autoapprovalsummary`.`weight_info`,
`editors_autoapprovalsummary`.`confirmed`,
`addons_addonreviewerflags`.`created`,
`addons_addonreviewerflags`.`modified`,
`addons_addonreviewerflags`.`addon_id`,
`addons_addonreviewerflags`.`needs_admin_code_review`,
`addons_addonreviewerflags`.`needs_admin_content_review`,
`addons_addonreviewerflags`.`needs_admin_theme_review`,
`addons_addonreviewerflags`.`auto_approval_disabled`,
`addons_addonreviewerflags`.`auto_approval_disabled_until_next_approval`,
`addons_addonreviewerflags`.`auto_approval_delayed_until`,
`addons_addonreviewerflags`.`notified_about_auto_approval_delay`,
`addons_addonreviewerflags`.`notified_about_expiring_delayed_rejections`,
`addons_addonapprovalscounter`.`created`,
`addons_addonapprovalscounter`.`modified`,
`addons_addonapprovalscounter`.`addon_id`,
`addons_addonapprovalscounter`.`counter`,
`addons_addonapprovalscounter`.`last_human_review`,
`addons_addonapprovalscounter`.`last_content_review`
FROM `addons`
LEFT OUTER JOIN `versions`
ON (`addons`.`current_version` = `versions`.`id`)
LEFT OUTER JOIN `versions_versionreviewerflags`
ON (`versions`.`id` = `versions_versionreviewerflags`.`version_id`)
INNER JOIN `versions` T4
ON (`addons`.`id` = T4.`addon_id`)
INNER JOIN `files`
ON (T4.`id` = `files`.`version_id`)
LEFT OUTER JOIN `versions_versionreviewerflags` T6
ON (T4.`id` = T6.`version_id`)
LEFT OUTER JOIN `editors_autoapprovalsummary`
ON (`versions`.`id` = `editors_autoapprovalsummary`.`version_id`)
LEFT OUTER JOIN `addons_addonreviewerflags`
ON (`addons`.`id` = `addons_addonreviewerflags`.`addon_id`)
LEFT OUTER JOIN `addons_addonapprovalscounter`
ON (`addons`.`id` = `addons_addonapprovalscounter`.`addon_id`)
WHERE (('en-us'='en-us') AND NOT (`addons`.`status` = 11) AND `versions_versionreviewerflags`.`pending_rejection` IS NULL AND `addons`.`status` IN (4, 3, 0) AND `files`.`status` IN (4, 1) AND ((T4.`channel` = 1 AND T6.`needs_human_review_by_mad` = 1) OR `versions_versionreviewerflags`.`needs_human_review_by_mad` = 1))
ORDER BY `addons`.`created` ASC
It doesn't feel right that we need such a complicated query for something that needs to be done routinely, tbh. I can understand the complexity if it's for some one-time query, but for routine stuff, maybe it's time to re-think the data model...
The query is not actually that complex. It can be simplified into:
SELECT DISTINCT `addons`.`created`,
`addons`.`modified`,
`addons`.`id`
FROM `addons`
LEFT OUTER JOIN `versions`
ON (`addons`.`current_version` = `versions`.`id`)
LEFT OUTER JOIN `versions_versionreviewerflags`
ON (`versions`.`id` = `versions_versionreviewerflags`.`version_id`)
INNER JOIN `versions` T4
ON (`addons`.`id` = T4.`addon_id`)
INNER JOIN `files`
ON (T4.`id` = `files`.`version_id`)
LEFT OUTER JOIN `versions_versionreviewerflags` T6
ON (T4.`id` = T6.`version_id`)
WHERE (`versions_versionreviewerflags`.`pending_rejection` IS NULL AND `addons`.`status` IN (4, 3, 0) AND `files`.`status` IN (4, 1) AND ((T4.`channel` = 1 AND T6.`needs_human_review_by_mad` = 1) OR `versions_versionreviewerflags`.`needs_human_review_by_mad` = 1))
ORDER BY `addons`.`created` ASC LIMIT 10;
And the perf problem remains. The biggest issue with it is that we are trying to deal with unlisted and listed versions in a single query and they have completely different requirements.
We discussed this with Andreas today and I need to test that out with a production database snapshot but my idea would be:
https://github.com/mozilla/addons/issues/7936 will hide the queue counts and tweak the layout. Keeping this one opened to separate the Flagged by scanners / Flagged for human review queues into separate listed/unlisted queues.
Actually I've filed https://github.com/mozilla/addons/issues/7937 for the separation of the queues, and we'll keep this one opened until all the perf issues are gone.
As expected: now that we've removed the counts most of the queues are fast, the dashboard and flagged for human review queues are still slow.
mozilla/addons-server#15767 should address dashboard performance by removing scanner queues counts.
Closing as we have done as much as we could in the scope of this issue. Further performance improvements would need more invasive measures, like a mysql upgrade (which is already planned) or a complete architectural, visual and/or workflow redesign of how add-ons are reviewed.
This has regressed a lot, page load takes 35-45s now for the "Flagged for human review" queue.
Note that the more results there are, the slower that query is going to be, so technically it might not be a regression in terms of code changes. It would be interesting to test https://github.com/mozilla/addons/issues/1835 with and without the DISTINCT
to see if removing it helps: now that we have done a bunch of work to try to ensure there is only one file per version at all times, maybe we can get away with removing it. (https://github.com/mozilla/addons/issues/8181 would guarantee it)
Sadly, I spoke too soon: the DISTINCT
here is on addons
table so we can't remove it. It's here because there are multiple versions that could cause the add-on to be in the queue, so it needs to stay.
Reviewer Tools Performance for the past 10 months:
You can see on this graph the initial loss in performance caused by mozilla/addons-server#15224 and mozilla/addons-server#15264 towards mid-August (which made the queue more correct) and the slight improvement mozilla/addons-server#15401, mozilla/addons#7936 and mozilla/addons#7937 made towards the end of September. It's been stable around 30-40s since.
https://github.com/mozilla/addons/issues/8814 should hopefully make the scanners and mad queues slightly faster. There is a little more that could be gained by avoiding computing the count of the number of results and do an infinite-like pagination instead.
Ultimately though, we'll likely soon reach a ceiling of what can be done with the current approach - the "queues" are querying a lot of different tables and MySQL needs to merge the results together, and that's an expensive operation, especially with ordering added into the mix. More de-normalization with dedicated tables for the queues would avoid this entirely.
Old Jira Ticket: https://mozilla-hub.atlassian.net/browse/ADDSRV-46
The filters for the mad queue we added in mozilla/addons-server#15224 and mozilla/addons-server#15264 are extremely slow. Since they are also used to display the count in queue header links, every queue page is affected, as well as submitting a listed review (which redirects to a queue page). Queue pages take 20s or longer to load.
┆Issue is synchronized with this Jira Task