scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0
23 stars 25 forks

Deleted: Convert all query processes to use `LIMIT` and `OFFSET` #130

Closed andrewtavis closed 3 months ago

andrewtavis commented 5 months ago

Description

Related to the work happening in #124, we decided in the last dev sync to adopt a new method of breaking down queries that are too large to return results because of timeout restrictions. The first version of this will be implemented in #124, and the other queries should then be changed to run on the new method, where every query has a `LIMIT` and `OFFSET` set within it that can be changed programmatically. The method for this will be:

Note: In the sync I mentioned that we'd also switch all of the _1, _2, etc. queries over to work like this. This may not be possible: if memory serves, those queries were also split because Wikidata has a character limit on what you can pass to it (which is why all the queries are written with very short abbreviations). We can test this and see if we can convert these queries into a single common one as well 😊

Contribution

Happy to work on this with people as far as planning the scope of the work and helping with implementation! 🚀

andrewtavis commented 3 months ago

Note that we should not be explicitly using LIMIT and OFFSET in the queries, but rather programmatically adding these if they're needed :)
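
For anyone reading along, here is a minimal sketch of what "programmatically adding" the clauses could look like. It is not the code from the repo: `with_paging`, `fetch_all`, the `run_query` callable and the default page size are all hypothetical names and values chosen for illustration, and it assumes the stored query contains no `LIMIT` or `OFFSET` of its own.

```python
from typing import Callable, Iterator, List


def with_paging(query: str, limit: int, offset: int) -> str:
    """Append LIMIT and OFFSET to a SPARQL query that has neither (hypothetical helper)."""
    return f"{query.rstrip()}\nLIMIT {limit}\nOFFSET {offset}"


def fetch_all(
    query: str,
    run_query: Callable[[str], List[dict]],
    page_size: int = 1000,  # assumed value, tune to stay under the timeout
) -> Iterator[dict]:
    """Yield rows page by page, stopping once a page comes back short.

    `run_query` is an assumed callable that sends the query string to the
    endpoint and returns the result rows as a list.
    """
    offset = 0
    while True:
        rows = run_query(with_paging(query, page_size, offset))
        yield from rows
        if len(rows) < page_size:
            break
        offset += page_size
```

One caveat: SPARQL only guarantees that successive `OFFSET` pages are consistent when the query has an `ORDER BY`, so a query paged this way would likely need one to avoid skipped or repeated rows.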

andrewtavis commented 3 months ago

Deleting this as the changes in 7bfa5bc and other commits around this time have made this issue unnecessary 😊