mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
11.4k stars, 930 forks

[FR] New exec.event based on HTTP error code #1979

Open God-damnit-all opened 2 years ago

God-damnit-all commented 2 years ago

Ex: httperror:404 (or just httperror for any HTTP error). This would be an event that only exec.event could use (it wouldn't make sense for metadata.event).

One application of this would be advanced logging: appending problematic data URLs to their own logfiles, along with the post URLs they belong to. It would also make it possible to have gallery-dl retry certain downloads separately later. On Twitter, for instance, a post that gives a 403 for its media might not give that error again later, due to the idiosyncrasies of its CDNs.

But what really makes me want a feature like this is for kemonoparty's flagging system. Broken links are unfortunately common due to some growing pains the service experienced with its importer. When the extractor hits a 404, I'd run a curl command that would send the POST request necessary to flag the entry.

This is technically doable by parsing the log at the end of my script, but that would send the POST requests all at once rather than as the errors come up. (I could also have a separate script monitoring the logfile, but that's a level of contrivance I don't want to deal with.)
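For reference, the log-parsing workaround mentioned above could look roughly like the sketch below. The log line shape in the comment and the `failed_urls` helper are assumptions made for illustration, not something confirmed in this thread; adjust the pattern to whatever your logfile actually contains.

```python
import re

# Assumed log line shape (an assumption, adjust to your actual logfile):
#   [downloader.http][warning] '404 Not Found' for 'https://example.org/a.png'
HTTP_ERROR_RE = re.compile(r"'(?P<status>\d{3}) [^']*' for '(?P<url>[^']+)'")


def failed_urls(lines, statuses=(404,)):
    """Yield (status, url) pairs for log lines reporting one of `statuses`."""
    for line in lines:
        match = HTTP_ERROR_RE.search(line)
        if match and int(match.group("status")) in statuses:
            yield int(match.group("status")), match.group("url")
```

Each yielded URL could then be passed to whatever command does the flagging, but only after gallery-dl has finished, which is exactly the limitation described above.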

mikf commented 2 years ago

More event hooks would definitely be useful.

Maybe one (or several) for logging messages, and one for each HTTP request with enough metadata to determine if it succeeded or not.
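To make the idea concrete, here is a minimal sketch of how a per-request hook could be dispatched by event name. Everything in it is hypothetical: `RequestEvent`, `event_names`, `dispatch`, and the `httperror` / `httperror:<code>` naming taken from the request above are illustrations only, not existing gallery-dl code or configuration.

```python
from dataclasses import dataclass


@dataclass
class RequestEvent:
    """Hypothetical metadata handed to a per-request hook."""
    url: str
    status: int

    @property
    def ok(self):
        # Treat any 4xx/5xx status as a failed request.
        return self.status < 400


def event_names(event):
    """Return the (hypothetical) event names a request would trigger."""
    if event.ok:
        return []
    # Generic name first, then the status-specific one.
    return ["httperror", f"httperror:{event.status}"]


def dispatch(event, hooks):
    """Call every hook registered under a matching event name."""
    results = []
    for name in event_names(event):
        for hook in hooks.get(name, ()):
            results.append(hook(event))
    return results
```

With this shape, a hook registered under `httperror` would fire for any failed request, while one under `httperror:404` would fire only for that status.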

zejjnt commented 3 months ago

I would like to bump this. I scrape some Instagram content, and when I get a 400, 401 or 403 I would like to stop gallery-dl to avoid triggering additional anti-bot checks.
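Until such an event exists, one possible stopgap is a wrapper that watches gallery-dl's log output and terminates the process on the first matching status. The sketch below assumes that log messages go to stderr and that HTTP errors are logged in a form like `'403 Forbidden' for '<url>'`; both are assumptions, and `should_stop` and `run_until_blocked` are hypothetical helper names.

```python
import re
import subprocess

# Assumed shape of an HTTP error in the log: '403 Forbidden' for '<url>'
STATUS_RE = re.compile(r"'(\d{3}) [^']*' for '")
STOP_STATUSES = frozenset({400, 401, 403})


def should_stop(line, statuses=STOP_STATUSES):
    """Return True if `line` reports one of the given HTTP statuses."""
    match = STATUS_RE.search(line)
    return bool(match) and int(match.group(1)) in statuses


def run_until_blocked(cmd, statuses=STOP_STATUSES):
    """Run `cmd` and terminate it on the first matching log line.

    Returns the triggering line, or None if the command finished normally.
    """
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    try:
        for line in proc.stderr:
            if should_stop(line, statuses):
                proc.terminate()
                return line.rstrip("\n")
        return None
    finally:
        # Reap the child whether it exited on its own or was terminated.
        proc.wait()
```

It would be called as `run_until_blocked(["gallery-dl", url])`. Note that this reacts only after the failed request has been logged, so it limits follow-up requests rather than preventing the first one.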