mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Exhentai] Detecting expunged galleries #3345

Open a84r7a3rga76fg opened 1 year ago

a84r7a3rga76fg commented 1 year ago

If someone knows how to do this, please share. I have a text file, source.txt, with links, each on its own line. Is it possible to use gallery-dl to go through each link in source.txt, check which galleries have been expunged, and write the expunged links to a separate text file, expunged.txt, while also removing each checked link from source.txt or moving it to another text file, checked.txt?

mikf commented 1 year ago

Not entirely possible, and gallery-dl isn't the best tool for this (writing your own script would be better), but you can, for example, split source.txt into expunged and non-expunged links:

gallery-dl -i source.txt -o metadata=1 -o source=hitomi --chapter-filter "(f := open('expunged.txt' if expunged else 'checked.txt', 'a')).write(f'https://exhentai.org/g/{gid}/{token}/\n') and f.close()"
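
For readers unfamiliar with the one-liner, here is a rough Python equivalent of what the filter expression does for each gallery. It is only a sketch: the function name route_gallery and its path parameters are made up for illustration, and the metadata keys (expunged, gid, token) are the ones the command above references. As far as I can tell, the trailing `and f.close()` makes the whole expression evaluate to None, so the filter is falsy and the gallery itself is skipped rather than downloaded.

```python
def route_gallery(meta: dict,
                  expunged_path: str = "expunged.txt",
                  checked_path: str = "checked.txt") -> bool:
    """Append the gallery URL to one of two files, then report 'skip'."""
    # Same choice as: 'expunged.txt' if expunged else 'checked.txt'
    target = expunged_path if meta["expunged"] else checked_path
    url = f"https://exhentai.org/g/{meta['gid']}/{meta['token']}/\n"
    with open(target, "a") as f:
        written = f.write(url)  # number of characters written (truthy)
    # The one-liner ends with `... and f.close()`; close() returns None,
    # so the overall result is falsy regardless of which file was used.
    return bool(written and None)
```

Calling route_gallery({"expunged": True, "gid": 1, "token": "abc"}) would append one line to expunged.txt and return False.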

a84r7a3rga76fg commented 1 year ago

I think it's working: expunged galleries are being written out (there are also unavailable galleries; how do I skip those?). But there's no checked.txt, and I'm getting this error for every available, non-expunged gallery: [exhentai][error] FilterError: Evaluating filter expression failed (OSError: [Errno 22] Invalid argument: 'C:\x01\logs\\checked.txt')

afterdelight commented 1 year ago

Have you created checked.txt manually inside the folder?

a84r7a3rga76fg commented 1 year ago

I did

mikf commented 1 year ago

> C:\x01\logs\\checked.txt

That's a weird path, having a \x01 in it and all. Just use forward slashes so you don't accidentally trigger an unwanted backslash escape: C:/logs/checked.txt
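
A small illustration of how such a character can sneak in, assuming the path contained a backslash followed by a digit. The folder name "1" below is purely hypothetical; the original path was not posted in full. The filter expression is Python, so normal string-escape rules apply inside its quotes.

```python
# Hypothetical Windows path; the folder name "1" is an assumption.
escaped = "C:\1\\logs\\checked.txt"  # "\1" is an octal escape, i.e. chr(1)
raw = r"C:\1\logs\checked.txt"       # raw string keeps backslashes literally
forward = "C:/1/logs/checked.txt"    # forward slashes need no escaping

print(repr(escaped))  # 'C:\x01\\logs\\checked.txt'
print(repr(raw))      # 'C:\\1\\logs\\checked.txt'
print(repr(forward))  # 'C:/1/logs/checked.txt'
```

Windows rejects control characters such as \x01 in file names, which would be consistent with the "Invalid argument" error above.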

a84r7a3rga76fg commented 1 year ago

That worked. Doesn't this do what I sorta wanted since checked links are added to checked.txt and expunged links are added to expunged.txt?

I got banned after ~1010 links, and I'm not sure what the delay between HTTP requests should be. I've heard it's anywhere from 30 seconds to 2 minutes, but any delay longer than 10 seconds will take forever with all of the links I want to check.

Exhentai's wiki page says you can do 25 entries per API request and do 4-5 API requests every 5 seconds. Can gallery-dl do API requests? I assume in this context one entry means one gallery.

afterdelight commented 1 year ago

There is no other way besides using a delay. Enable verbose output when running your command to see the HTTP request info. How many titles do you want to download?

mikf commented 1 year ago

> Exhentai's wiki page says you can do 25 entries per API request and do 4-5 API requests every 5 seconds. Can gallery-dl do API requests? I assume in this context one entry means one gallery.

That's why I said writing your own script is better. gallery-dl does use the API, but it only requests info for 1 exh gallery at a time and that can't be changed.

The delay between requests is 5 seconds by default and depends on your sleep-request settings.
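
For anyone going the own-script route, here is a minimal sketch of the batching idea, using the 25-entries-per-request figure quoted above. Everything specific in it is an assumption to verify against the wiki: the endpoint URL, the "gdata" request shape, the "gmetadata", "expunged" and "error" response fields, and whether galleries that exist only on exhentai are reachable this way. All function names are invented for this example.

```python
import json
import re
import time
import urllib.request

API_URL = "https://api.e-hentai.org/api.php"  # assumed endpoint, check the wiki
GALLERY_RE = re.compile(r"/g/(\d+)/([0-9a-f]+)")

def parse_links(lines):
    """Extract [gid, token] pairs from gallery URLs, ignoring other lines."""
    return [[int(m.group(1)), m.group(2)]
            for m in map(GALLERY_RE.search, lines) if m]

def batches(items, size=25):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def build_payload(batch):
    # Assumed request format for a metadata lookup.
    return {"method": "gdata", "gidlist": batch, "namespace": 1}

def fetch(batch):
    """POST one batch and return the list of metadata entries."""
    body = json.dumps(build_payload(batch)).encode()
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("gmetadata", [])

def split_by_expunged(entries):
    """Sort entries into (expunged, checked) URL lists; skip error entries."""
    expunged, checked = [], []
    for e in entries:
        if "error" in e:  # assumed marker for unavailable galleries
            continue
        url = f"https://exhentai.org/g/{e['gid']}/{e['token']}/"
        (expunged if e.get("expunged") else checked).append(url)
    return expunged, checked

def run(source="source.txt", delay=1.5):
    """Process source.txt in batches, pausing `delay` seconds between calls."""
    with open(source) as f:
        pairs = parse_links(f)
    for batch in batches(pairs):
        expunged, checked = split_by_expunged(fetch(batch))
        with open("expunged.txt", "a") as out:
            out.writelines(u + "\n" for u in expunged)
        with open("checked.txt", "a") as out:
            out.writelines(u + "\n" for u in checked)
        time.sleep(delay)
```

Calling run() would append to expunged.txt and checked.txt in the current directory. A 1.5-second pause works out to roughly three requests per 5 seconds, which stays under the quoted limit of 4-5.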

a84r7a3rga76fg commented 1 year ago

Is there any chance gallery-dl will receive an update to use the API that way? I'd like to request that feature. If not, you can close this.

a84r7a3rga76fg commented 1 year ago

@mikf Is it possible to split source.txt by category, e.g. Image Set versus all other categories?

Changing expunged to image set, 'image set' and "image set" did nothing
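
A possible explanation, sketched in plain Python: the filter is a Python expression evaluated against each gallery's metadata, so `expunged` works because it is a metadata field, while a bare `image set` is not valid syntax and a quoted 'image set' is just a non-empty string that is always true. The helper pick_file and the field name eh_category are assumptions for illustration; running gallery-dl with -K on a gallery URL should list the real field names.

```python
def pick_file(expression: str, metadata: dict) -> str:
    """Evaluate a filter-style expression with metadata keys as variables."""
    code = compile(expression, "<filter>", "eval")
    return eval(code, {}, metadata)

# Hypothetical metadata; "eh_category" is an assumed field name.
meta = {"expunged": False, "eh_category": "Manga"}

# Works: `expunged` is looked up in the metadata.
print(pick_file("'expunged.txt' if expunged else 'checked.txt'", meta))

# A quoted string is always truthy, so every gallery lands in the first file.
print(pick_file("'imageset.txt' if 'image set' else 'other.txt'", meta))

# Comparing a metadata field to a value is what actually splits by category.
print(pick_file(
    "'imageset.txt' if eh_category == 'Image Set' else 'other.txt'", meta))

# A bare `image set` does not even compile.
try:
    pick_file("'imageset.txt' if image set else 'other.txt'", meta)
except SyntaxError as exc:
    print("SyntaxError:", exc.msg)
```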