Closed by mindrones 5 years ago
Hi @mindrones. With these filters (public, content, since), you are sending this request to the Gist API:
GET: https://api.github.com/gists/public?per_page=100&since=2018-11-01T00:00:01Z
Each call returns up to 100 items, but the library iterates over the full result set (30 pages for this request right now). Sometimes it returns a 502 (Bad Gateway) on page 25, sometimes on page 21, or 19... It seems to be a limitation of the Gist API to prevent crawling or misuse.
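To make the paging behaviour concrete, here is a rough Python sketch of walking the pages by hand, retrying a 502 a few times and keeping whatever was fetched before a persistent failure. It is not how GistClient works internally; `fetch_page`, `collect_gists`, and the retry/backoff parameters are hypothetical names and values chosen for illustration, and backing off is only an assumption about what might help with intermittent 502s.

```python
import json
import time
import urllib.error
import urllib.request

API_URL = "https://api.github.com/gists/public"


def fetch_page(page, since, per_page=100, token=None):
    """Fetch one page of public gists; returns (status_code, items)."""
    url = f"{API_URL}?per_page={per_page}&since={since}&page={page}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    if token:
        request.add_header("Authorization", f"token {token}")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as error:
        # Report the failing status (e.g. 502 or 403) instead of raising.
        return error.code, []


def collect_gists(get_page, max_pages=30, retries=3, delay=1.0):
    """Walk pages 1..max_pages, retrying 502s, and keep partial results.

    Returns (gists, stopped_at, status); stopped_at is None when every
    requested page was fetched, otherwise the page where the walk ended.
    """
    gists = []
    for page in range(1, max_pages + 1):
        for attempt in range(retries):
            status, items = get_page(page)
            if status != 502:
                break
            if attempt < retries - 1:
                # Exponential backoff before the next attempt (assumed policy).
                time.sleep(delay * 2 ** attempt)
        if status != 200 or not items:
            # Persistent error or empty page: stop and return what we have.
            return gists, page, status
        gists.extend(items)
    return gists, None, 200
```

With a token in a (hypothetical) `TOKEN` variable, it could be called as `collect_gists(lambda p: fetch_page(p, "2018-11-01T00:00:01Z", token=TOKEN))`. Passing the page fetcher in as a function keeps the retry logic testable without touching the network.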
Hi, eh, I suspected that :/ Any suggestions on how to retrieve those gists some other way? Searching by hand in the UI (https://gist.github.com/search?q=svelte) returns 232 gists, which is not many, so it seems to be possible. Maybe they use a private API for search? Thanks!
I don't think GistClient is a good solution for handling a large volume of gists. It was developed to make it easier to manage well-delimited lists (a user's own gists, for example). Keep in mind that the previous filter produces ≈30 requests (1 per page) plus one more request for each gist on each page (because of the 'rawContent' flag), so about 30*100 = 3,000 additional requests. That could easily exceed the API limits, and you would probably receive a 403 ("abuse detection"). Sadly, we can't avoid it.
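The arithmetic above can be written out as a small Python sketch. The function name `estimate_requests` and its parameters are made up for illustration; it only models the counting described in this comment (one request per page, plus one per gist when raw content is fetched), not GistClient's actual code.

```python
def estimate_requests(pages, per_page, raw_content):
    """Estimate total API requests for a paged listing.

    One request per listing page, plus one request per gist when the
    raw content of every gist is fetched as well.
    """
    listing_requests = pages
    content_requests = pages * per_page if raw_content else 0
    return listing_requests + content_requests


# The case discussed here: 30 pages of 100 gists with raw content.
print(estimate_requests(30, 100, True))   # 3030
# Without raw content, only the listing pages are requested.
print(estimate_requests(30, 100, False))  # 30
```

Dropping the raw-content fetch, or fetching content only for gists whose description or file names already look relevant, cuts the total by roughly two orders of magnitude in this example.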
The UI builds its response through a private method. Maybe you could consume this endpoint (https://gist.github.com/search?q=svelte) by writing a crawler in your backend, but it doesn't seem like a clean solution.
Eh, then scraping https://gist.github.com/search?q=svelte may indeed be a one-time solution, if we end up not needing to search for gists regularly.
I'll close this one, thanks for taking the time to reply! :)
Related to #1, in order to retrieve public gists containing the word "svelte", I'm using (this time with a TOKEN):
but I get this error:
Seems to fail at the 25th page. Am I doing something wrong? Thanks!