mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

Hitomi.la extractor improvement #1015

Open cpz3501 opened 3 years ago

cpz3501 commented 3 years ago

I am aware of the message "[hitomi][info] This extractor only spawns other extractors and does not provide any metadata on its own.", but it would be nice if a few things could be improved. The root of the problem is that some galleries have two or more artists listed on Hitomi.la, which causes the following undesirable behavior.

- A new folder is created for each multi-artist gallery. If you use something like "directory": ["Hitomi", "{artist}", "{date} {title}"], you get more than 20 folders in some cases, even though it's only a single download.

- No artist folder is created if no artist is listed, so everything from those galleries ends up mixed together in one folder.

- If a single gallery has too many artists, the path conflicts with the Windows 260-character path limit: [hitomi][error] Unable to download data: OSError: [WinError 123] Die Syntax für den Dateinamen, Verzeichnisnamen oder die Datenträgerbezeichnung ist falsch: "\\\\?\\E:\\G\\x\\x\\['artist1', 'artist2', 'artist3', 'artist4', 'artist5', 'artist6', 'artist7', 'artist8', 'artist9']" (in English: "The filename, directory name, or volume label syntax is incorrect")

Possible solution: use the artist name from the URL. If the downloaded URL is https://hitomi.la/artist/abc123-all.html, then abc123 would be used as the artist for every gallery downloaded from it.
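To illustrate the idea, here is a minimal sketch of pulling the artist name out of such a URL. The function name artist_from_url and the regular expression are hypothetical and not part of gallery-dl; the pattern assumes only the "-all.html" URL shape shown above.

```python
import re
from urllib.parse import unquote

# Hypothetical helper, not part of gallery-dl.
# Assumes artist listing URLs look like the example above:
#   https://hitomi.la/artist/<name>-all.html
_ARTIST_URL = re.compile(
    r"^https?://hitomi\.la/artist/(?P<name>[^/]+?)-all\.html$"
)


def artist_from_url(url):
    """Return the artist name from an artist listing URL, or None."""
    match = _ARTIST_URL.match(url)
    if match is None:
        return None
    # Decode percent-escapes in case the name contains spaces etc.
    return unquote(match.group("name"))
```

The returned name could then be used as one fixed folder name for every gallery found through that URL, whatever artists each individual gallery lists.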

ghost commented 3 years ago

{artist[100]} will put a cap on the title.

A solution to your first problem is to replace multiple artists with "Various". Another solution is to remove {artist} and extract metadata from the gallery but that's not possible with gallery-dl at the moment.
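As a rough sketch of the "Various" idea, the following standalone function collapses a list of artists into a single folder name. It is not gallery-dl code: the name artist_folder, the "Unknown" fallback for galleries without an artist, and the 100-character cap are all assumptions made for illustration.

```python
def artist_folder(artists, fallback="Unknown", various="Various", max_len=100):
    """Collapse a gallery's artist list into one folder name.

    Hypothetical helper, not part of gallery-dl.
    """
    # Drop missing or blank entries and surrounding whitespace.
    names = [a.strip() for a in (artists or []) if a and a.strip()]
    if not names:
        name = fallback      # no artist listed (assumed fallback name)
    elif len(names) == 1:
        name = names[0]      # single artist: use the name as-is
    else:
        name = various       # two or more artists: one shared folder
    # Cap the length so a long name cannot inflate the path.
    return name[:max_len]
```

With this, a gallery listing artist1 through artist9 would land in a "Various" folder instead of one named after the whole list, and a gallery with no artist would land in "Unknown".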

Just curious, but why are you using Hitomi instead of exhentai? Hitomi is a scraper site that gets everything from exhentai. If you're going for galleries that are no longer available on exhentai because of DMCAs, then I guess it makes sense to use Hitomi.

cpz3501 commented 3 years ago

I don't think that's still up to date. If you run the same search queries on both sites, the number of results on Hitomi is usually much higher, especially for lesser-known artists.

ghost commented 3 years ago

Nope, everything you see on Hitomi was originally uploaded to exhentai before any other place. A higher number of results on Hitomi doesn't mean anything, because they never remove parented, replaced, forbidden, or expunged galleries, and they keep DMCA'd galleries. They also re-upload fan translations in Russian, Chinese, Spanish, Portuguese, Korean, etc., which are languages that you've most likely added to your account's exclusion list on exhentai.

cpz3501 commented 3 years ago

Well, you can check it out for yourself if you have a few little-known artists that you like. Everything you find on exhentai will also be found on Hitomi, but not the other way around.

ghost commented 3 years ago

I don't know what to look for, so you'll have to name a few unknown artists on hitomi.la which can't be found on exhentai.