mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[lolibooru] extractor treats 'uploader' as an 'author' ('artist' must be used instead) #1720

Closed: jgrubs1 closed this 2 years ago

jgrubs1 commented 2 years ago

When downloading galleries and using the "author" keyword to sort downloaded images into folders, gallery-dl uses the uploader's name instead of the actual author's name.

The point is, on this site, the image's author is referenced as an "artist"; gallery-dl, on the other hand, extracts the name of the uploader ("author"). Could you please fix this behavior and/or add "artist" to the list of available keywords for this extractor?

This is what I am currently using:

    "lolibooru":
    {
        "directory": ["{author}"]
    },

If the keyword 'artist' is added to the extractor, I would then use:

    "lolibooru":
    {
        "directory": ["{artist}"]
    },

Example: https://lolibooru.moe/post/show/348537/4girls-animal_ears-apron-back_bow-black_neckwear-b

The image was uploaded by the user "Lolibot". However, the author ("artist") of this image is "kuro shiro (kuro96siro46)", as referenced under the "Tags" section. The author can be identified by the "https://lolibooru.moe/artist/show?name=kuro_shiro_%28kuro96siro46%29" link that precedes the "https://lolibooru.moe/post?tags=kuro_shiro_%28kuro96siro46%29" link. Note that none of the other tags have an "/artist/" link.

I am not sure how gallery-dl parses tags, but I discovered at least two relevant mentions of the "author" in the HTML source.

One: <script type="text/javascript">Post.register_resp({"posts":[{"id":348537,"tags":"kemono_friends kuro_shiro_(kuro96siro46) lucky_beast_(kemono_friends)","created_at":1627239785,"creator_id":994,**"author":"Lolibot"**,"change":0, ...cut... "japari_symbol":"general","kemono_friends":"copyright",**"kuro_shiro_(kuro96siro46)":"artist"**,

Two:

<h5>Tags</h5>
<ul id="tag-sidebar">
<li class="tag-link tag-type-artist" data-name="kuro_shiro_(kuro96siro46)" data-type="artist"><a href="/artist/show?name=kuro_shiro_%28kuro96siro46%29">?</a> <a href="/post?tags=kuro_shiro_%28kuro96siro46%29">kuro shiro (kuro96siro46)</a>

Also, some posts have authors that are not identified, in which case this tag would be missing. For example: https://lolibooru.moe/post/show/345009/1girl-animal_ears-artist_request-blouse-border-bro
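
To illustrate the first mention, here is a minimal Python sketch of how the artist could be picked out of the tag-name-to-tag-type pairs quoted above. It assumes those pairs have already been parsed into a plain dict; the function name `artist_tags` is hypothetical, and this is not how gallery-dl itself does it.

```python
def artist_tags(tag_types):
    """Return the names of all tags whose type is 'artist'."""
    return [name for name, kind in tag_types.items() if kind == "artist"]


# Tag/type pairs copied from the excerpt above; the rest of the
# response is cut there, so only these three entries are used.
tag_types = {
    "japari_symbol": "general",
    "kemono_friends": "copyright",
    "kuro_shiro_(kuro96siro46)": "artist",
}

print(artist_tags(tag_types))

# A post without an identified artist simply yields an empty list.
# (Hypothetical mapping, used only to show the missing-artist case.)
print(artist_tags({"kemono_friends": "copyright"}))
```

An empty result is the case that needs a fallback folder name.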

mikf commented 2 years ago

gallery-dl only returns the API responses for any booru. What these fields are called is up to them.

To get lists of categorized tags (tags_artist, tags_character, etc.), you need to enable the tags option. You can use conditional directory formats to handle posts without artist tags:

"lolibooru":
{
    "tags": true,
    "directory": {
        "locals().get('tags_artist')": ["{tags_artist}"],
        ""                           : ["no artist"]
    }
}
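
The selection logic behind such a conditional directory can be sketched roughly as follows. This is only an illustration under assumptions, not gallery-dl's actual implementation: the helper `pick_directory` is hypothetical, each non-empty key is treated as a Python expression evaluated against the post's metadata, the first truthy one wins, and the empty key `""` serves as the fallback. The `tags_artist` value is assumed to be a string.

```python
def pick_directory(rules, metadata):
    """Return the directory segments of the first rule whose condition holds."""
    for condition, segments in rules.items():
        if not condition:
            continue  # "" is the fallback and is handled after the loop
        try:
            # Evaluate with the metadata as local names, so that
            # locals().get('tags_artist') looks up the metadata dict.
            matched = eval(condition, {}, metadata)
        except Exception:
            matched = False
        if matched:
            return [s.format(**metadata) for s in segments]
    return [s.format(**metadata) for s in rules.get("", [])]


rules = {
    "locals().get('tags_artist')": ["{tags_artist}"],
    "": ["no artist"],
}

# Post with an artist tag (value assumed to be a string).
print(pick_directory(rules, {"tags_artist": "kuro_shiro_(kuro96siro46)"}))

# Post without any artist tag falls through to the "" rule.
print(pick_directory(rules, {}))
```

Using `locals().get('tags_artist')` rather than a bare `tags_artist` keeps the condition from raising an error when the field is absent.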