Open grossir opened 2 weeks ago
We haven't used MV's before, so I have questions:
I think I asked before, but what do we get from this that a query doesn't get us? I.e., why bother materializing it?
How do we run the periodic refresh?
To do this kind of queries we have 3 options
SELECT * FROM view_name
and it will re-execute the query.In the case of this query, I don't think a new index would help. I think the slowness comes from joining big tables. Some query optimization may help reduce it more. The raw version takes 10 minutes, prefiltering the docket table by courts that have scrapers takes it down to 3 minutes
Why bother materializing it?
How to refresh it
I expect to check this table a couple times a week. So we could refresh this every 2 or 3 days. We could use a plain cron job, the command is simple REFRESH MATERIALIZED VIEW <name>
Or, we could implement the refresh logic as an async call when accessing the admin page, and send a django admin message (those banners when editing an object) to the user advising to refresh the page in a few minutes
Hm, in the past we had a script that would send an email with this kind of information, and it ran on a cron job, so I guess this is basically the same thing, but instead of storing the results of the script in an email, this stores the results in the materialized view. Seems fine, I suppose. :)
Concept for https://github.com/freelawproject/courtlistener/issues/3950
For this to work, the materialized view must be created directly on the DB by running the "query" attribute
For my local database, it looks like this
But, if if connected to the live DB, it would have these rows: