Closed agjohnson closed 1 year ago
There seems to be something scraping the RTD project build list for this project, which is resulting in PagerDuty being upset.
I ended up blocking the request at CF (Cloudflare), as I didn't want to muck around with database operations clearing this many builds after hours. Next up, we should run some cleanup queries, and finally get back to automated cleanup of some kind.
# In a Django shell; Project is readthedocs.projects.models.Project
import datetime
dt = datetime.datetime.now() - datetime.timedelta(days=30)
proj = Project.objects.get(slug="time-test")
proj.builds.filter(date__lt=dt).count()
501886
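A single DELETE over ~500k rows would hold locks for quite a while, so the cleanup could be batched by primary key. A rough sketch of that approach, using the `date` field and `builds` relation from the snippet above; `cleanup_old_builds`, `chunked`, and their parameters are hypothetical helpers, not existing RTD code:

```python
import datetime


def chunked(ids, size):
    """Yield successive slices of ids, so each DELETE stays a short transaction."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def cleanup_old_builds(project, days=30, batch_size=5000):
    """Delete a project's builds older than `days`, in pk batches.

    Sketch only: assumes the Build model's `date` field and the
    `project.builds` reverse relation seen in this issue.
    """
    cutoff = datetime.datetime.now() - datetime.timedelta(days=days)
    # Snapshot the pks up front, then delete in small chunks.
    pks = list(project.builds.filter(date__lt=cutoff).values_list("pk", flat=True))
    for batch in chunked(pks, batch_size):
        project.builds.filter(pk__in=batch).delete()
    return len(pks)
```

Snapshotting the pks first keeps each DELETE bounded regardless of how the table changes between batches.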
I took a look at the execution time for the time-test and docs projects when counting the builds.
In [3]: Project.objects.get(slug='docs').builds.count()
SELECT "projects_project"."id",
"projects_project"."pub_date",
"projects_project"."modified_date",
"projects_project"."name",
"projects_project"."slug",
"projects_project"."description",
"projects_project"."repo",
"projects_project"."repo_type",
"projects_project"."project_url",
"projects_project"."canonical_url",
"projects_project"."single_version",
"projects_project"."default_version",
"projects_project"."default_branch",
"projects_project"."requirements_file",
"projects_project"."documentation_type",
"projects_project"."urlconf",
"projects_project"."external_builds_enabled",
"projects_project"."external_builds_privacy_level",
"projects_project"."cdn_enabled",
"projects_project"."analytics_code",
"projects_project"."analytics_disabled",
"projects_project"."container_image",
"projects_project"."container_mem_limit",
"projects_project"."container_time_limit",
"projects_project"."build_queue",
"projects_project"."max_concurrent_builds",
"projects_project"."allow_promos", "projects_proje
Execution time: 0.000889s [Database: default]
SELECT COUNT(*) AS "__count"
FROM "builds_build"
WHERE "builds_build"."project_id" = 74581
Execution time: 0.002788s [Database: default]
Out[3]: 9047
In [4]: Project.objects.get(slug='time-test').builds.count()
SELECT "projects_project"."id",
[... same column list as the query above ...]
"projects_project"."allow_promos", "projects_proje
Execution time: 0.000889s [Database: default]
SELECT COUNT(*) AS "__count"
FROM "builds_build"
WHERE "builds_build"."project_id" = 487639
Execution time: 0.040931s [Database: default]
Out[4]: 519390
Querying time-test is roughly 15x more expensive. It's still pretty fast, though, but I'd guess that won't hold when the database is under a bit more load.
In [5]: 0.040931 / 0.002788
Out[5]: 14.681133428981349
Yeah, it's not clear if this is all just the number of objects queried, or if there is also some baseline database performance issues causing this query to be slow.
Here's a New Relic transaction that shows the slow query -- ~7s spent querying builds_build twice: https://onenr.io/0ZQWAyGy9RW
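If the cost really does scale with the number of rows counted, one common mitigation (just a sketch of the general technique, not something RTD does today) is to cache the expensive count for a few minutes instead of running COUNT(*) on every request; `CachedCount` below is a hypothetical helper:

```python
import time


class CachedCount:
    """Tiny TTL cache around an expensive count() callable (sketch)."""

    def __init__(self, fetch, ttl=300):
        self.fetch = fetch          # e.g. lambda: project.builds.count()
        self.ttl = ttl              # seconds before the value is refreshed
        self._value = None
        self._at = 0.0

    def get(self, now=None):
        """Return the cached count, refreshing it when older than ttl."""
        now = time.monotonic() if now is None else now
        if self._value is None or now - self._at > self.ttl:
            self._value = self.fetch()
            self._at = now
        return self._value
```

In practice this would live behind the existing cache framework rather than an in-process object, but the trade-off is the same: a slightly stale count in exchange for skipping the hot COUNT(*) query.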
Does it make sense to enable django-debug-toolbar on web-extra? That way we can immediately check/verify these things. We can access it via the VPN in the same way we access the Django Admin: https://admin.ops.readthedocs.org:10000/projects/test-builds/builds/
I went ahead and created this PR https://github.com/readthedocs/readthedocs.org/pull/9641. It's going to be useful for other purposes as well.
I've cleaned up builds more than 30 days old. We're down from 570k to 17k for now. It might be worth discussing a project flag or setting to remove old builds, but for now we can just return to this when it's a problem.
Might be worth deleting builds after a week or so, using an automated task or something. We have "a few" builds
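For the automated path, the selection policy could live in a small pure function that a periodic task calls. A sketch of one possible retention rule; `select_prunable`, `prune_cutoff`, and `keep_latest` are hypothetical names, not an existing RTD policy:

```python
import datetime


def prune_cutoff(now, max_age_days=7):
    """Builds finished before this datetime are candidates for deletion."""
    return now - datetime.timedelta(days=max_age_days)


def select_prunable(build_dates, now, max_age_days=7, keep_latest=1):
    """Return indices of builds old enough to delete.

    Always keeps the newest `keep_latest` builds so a project never
    loses its entire history (hypothetical policy, for illustration).
    """
    cutoff = prune_cutoff(now, max_age_days)
    order = sorted(range(len(build_dates)),
                   key=lambda i: build_dates[i], reverse=True)
    keep = set(order[:keep_latest])
    return [i for i, d in enumerate(build_dates) if d < cutoff and i not in keep]
```

Keeping the selection logic separate from the deletion makes it easy to unit-test the policy before wiring it into a scheduled task.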