catalyst / moodle-tool_objectfs

Object file storage system for Moodle
https://moodle.org/plugins/tool_objectfs

'Missing from filedir and external storage files' test script not practically usable #616

Open timhunt opened 2 months ago

timhunt commented 2 months ago

On our system (in AWS) admin/tool/objectfs/missing_files.php is not practically usable. It just sits there for a bit then gives a gateway timeout.

This is a pity, because we would like to have a check like this.

I was wondering if this should be re-developed as either a CLI script or an ad-hoc task?

However, before diving in and writing some code, I thought I would create this issue to seek guidance on what sort of alternative script would be the right way to go. (Or is there some trick to getting the current script to run to completion in practice?) Thanks.

twhateley commented 2 months ago

S3 Object Inventory could potentially be useful if the underlying issue is with rate limiting, or other capacity issues with lots of API requests.

Amazon S3 Inventory provides comma-separated values (CSV), Apache optimized row columnar (ORC) or Apache Parquet output files that list your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or objects with a shared prefix (that is, objects that have names that begin with a common string). If you set up a weekly inventory, a report is generated every Sunday (UTC time zone) after the initial report. For information about Amazon S3 Inventory pricing, see Amazon S3 pricing.
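
As a rough illustration of how an inventory report could stand in for per-object API calls, here is a minimal Python sketch. It assumes the CSV report has no header row and that the object key is the second column (the real column order is given by the inventory's manifest file), and it assumes each key ends in the file's content hash. The function names are hypothetical and not part of tool_objectfs.

```python
import csv
import io


def keys_from_inventory(csv_text, key_column=1):
    """Return the set of object keys listed in one inventory CSV report.

    Assumption: no header row, and the key is in column `key_column`.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return {row[key_column] for row in reader if row}


def missing_hashes(expected_hashes, inventory_keys):
    """Return expected content hashes that no inventory key ends with.

    Assumption: keys look like '<prefix>/<contenthash>', so the last
    path segment is the hash.
    """
    present = {key.rsplit("/", 1)[-1] for key in inventory_keys}
    return sorted(set(expected_hashes) - present)


if __name__ == "__main__":
    report = '"mybucket","aa/bb/aabb1"\n"mybucket","cc/dd/ccdd2"\n'
    print(missing_hashes(["aabb1", "eeff3"], keys_from_inventory(report)))
```

The comparison runs entirely on local data, so it would not be affected by request rate limits, at the cost of the report being up to a day or a week old.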

brendanheywood commented 2 months ago

@timhunt all of the calculation is already done in tasks behind the scenes and this report is mostly just pulling out of those tables. I suspect we can reverse the left join and get some better performance here:

https://github.com/catalyst/moodle-tool_objectfs/blob/MOODLE_310_STABLE/classes/local/table/files_table.php#L44-L47
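
One possible reading of "reverse the left join" is sketched below in Python with an in-memory SQLite database. The two tables are simplified stand-ins, not the plugin's real schema, and the use of location -1 to mean "missing" is an assumption made for the demo. The sketch only shows that, when the filter is on the objects table, starting from objects and joining out to files returns the same file rows as starting from files; whether it is actually faster depends on the real database and its indexes.

```python
import sqlite3


def build_demo_db():
    """Create a toy database with hypothetical 'files' and 'objects' tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE files (id INTEGER PRIMARY KEY, contenthash TEXT, filename TEXT);
        CREATE TABLE objects (contenthash TEXT PRIMARY KEY, location INTEGER);
        INSERT INTO files (contenthash, filename) VALUES
            ('aaa', 'a.pdf'), ('bbb', 'b.pdf'), ('bbb', 'b-copy.pdf'), ('ccc', 'c.pdf');
        INSERT INTO objects VALUES ('aaa', 0), ('bbb', -1), ('ddd', -1);
    """)
    return conn


# Drives the query from every file row, then filters on the joined object.
FILES_FIRST = """
    SELECT f.filename FROM files f
    LEFT JOIN objects o ON o.contenthash = f.contenthash
    WHERE o.location = ?
    ORDER BY f.filename
"""

# Drives the query from the (filtered) objects, then looks up their files.
OBJECTS_FIRST = """
    SELECT f.filename FROM objects o
    JOIN files f ON f.contenthash = o.contenthash
    WHERE o.location = ?
    ORDER BY f.filename
"""


def filenames(conn, sql, location):
    """Run one of the queries above and return the matching filenames."""
    return [row[0] for row in conn.execute(sql, (location,))]


if __name__ == "__main__":
    conn = build_demo_db()
    print(filenames(conn, FILES_FIRST, -1))
    print(filenames(conn, OBJECTS_FIRST, -1))
```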

timhunt commented 2 months ago

Ah! OK. Perhaps my first step should be to EXPLAIN the query on our DB, and/or profile it, to see if that reveals why the script is timing out. I will let you know if I find anything relevant.

Thanks for the speedy response.
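
For readers unfamiliar with the EXPLAIN step mentioned above, here is a small Python sketch using SQLite's `EXPLAIN QUERY PLAN`. It is only an illustration: the table and index names are made up, and PostgreSQL and MySQL have their own `EXPLAIN` syntax and output. The point is that the plan shows whether a lookup is served by an index or by a full table scan.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable plan steps SQLite reports for `sql`.

    The last column of each EXPLAIN QUERY PLAN row holds the description.
    """
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, used only to have something to explain.
    conn.execute("CREATE TABLE demo_objects (contenthash TEXT, location INTEGER)")
    lookup = "SELECT * FROM demo_objects WHERE location = -1"

    print(query_plan(conn, lookup))   # full scan: no index on location yet
    conn.execute("CREATE INDEX demo_location_idx ON demo_objects (location)")
    print(query_plan(conn, lookup))   # now expected to mention the index
```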