mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

Plurk - More granular control #1111

Open musjj opened 3 years ago

musjj commented 3 years ago

The Plurk extractor lacks the granular control that other Twitter-like extractors provide. Some options that I think would be useful:

If anyone thinks something is missing here, please say so in the comments.

nisehime commented 3 years ago

I'd also note that plurk URLs are processed by DirectlinkExtractor, so by default all of them end up in the directlink folder and are not grouped by user. This also makes it harder to filter other non-plurk links, such as Twitter ones, even with whitelist/blacklist.

UPD: Actually nvm, I understood why it is like that. Still, more granular control is indeed needed.

Hrxn commented 3 years ago

Your update means that it is working as expected for you now? It should, based on what you are describing here. Try setting the category-transfer option inside the plurk extractor options to true.

nisehime commented 3 years ago

Your update means that it is working as expected for you now?

No, and I'm not the author of this issue. The filters described by the author would be really helpful. Also, I've reconsidered my update: I think handling native plurk links with the plurk extractor is not such a bad idea.

musjj commented 3 years ago

Yes, the bigger issue with this is that the Plurk extractor currently does not provide any filename or directory keywords. So if you want to customize your filename or extract metadata, it's not possible right now.

nisehime commented 3 years ago

@mikf By the way, I tried to implement a temporary solution to the directory issue, and this is what I found. This is the config:

{
    "extractor": {
        "plurk": {
            "comments": true,
            "directory": ["plurk_test"],
            "whitelist": ["directlink"],
            "parent-metadata": true,
            "filename": "{owner_id} {plurk_id}",
            "category-transfer": true
        }
    }
}

As was said, plurk's images are transferred to the directlink extractor for some reason, so by default the result will be:

F:\gallery-dl>gallery-dl --chapter-range -2 https://www.plurk.com/BOW99
# .\gallery-dl\directlink\images.plurk.com__1LjmLKh7htkja9vB6z7EB3.png
# .\gallery-dl\directlink\images.plurk.com__286ILNC4UKy8rZPxm5Pi7O.png

Unless I misunderstand how category-transfer works, the expected result with the above config should be:

F:\gallery-dl>gallery-dl --chapter-range -2 https://www.plurk.com/BOW99
# .\gallery-dl\plurk_test\{owner_id} {plurk_id}.png
# .\gallery-dl\plurk_test\{owner_id} {plurk_id}.png

But instead it looks like this:

F:\gallery-dl>gallery-dl --chapter-range -2 https://www.plurk.com/BOW99
# .\gallery-dl\plurk\images.plurk.com__1LjmLKh7htkja9vB6z7EB3.png
# .\gallery-dl\plurk\images.plurk.com__286ILNC4UKy8rZPxm5Pi7O.png

So the directory has changed, but to the plurk extractor's default directory name, not the one from the config. Meanwhile, the filename hasn't changed at all and is still the directlink extractor's default.

Is it intended that category-transfer transfers the extractor's default options to its child? And why, in this case, was only the default directory name passed, but not the filename?
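
To make the question concrete, here is a toy model that reproduces the paths I'm seeing. It is only a guess, not gallery-dl's actual code: the ToyExtractor class, the build_path helper, and the default filename template are all made up for illustration. The assumption is that the child reads its options once, under its own category, and that the category name is swapped to the parent's only afterwards.

```python
# Toy model only: NOT gallery-dl's real implementation.
# All names and default templates here are hypothetical.

CONFIG = {
    "plurk": {
        "directory": ["plurk_test"],
        "filename": "{owner_id} {plurk_id}",
    },
}

# Assumed defaults, chosen to match the output shown above.
DEFAULTS = {
    "directory": ["{category}"],
    "filename": "{domain}_{path}_{filename}.{extension}",
}


class ToyExtractor:
    def __init__(self, category):
        self.category = category
        # Options are looked up once, using the category known right now.
        opts = CONFIG.get(category, {})
        self.directory = opts.get("directory", DEFAULTS["directory"])
        self.filename = opts.get("filename", DEFAULTS["filename"])


def build_path(extr, metadata):
    # The category placeholder is resolved late, at path-building time.
    kw = dict(metadata, category=extr.category)
    parts = [segment.format_map(kw) for segment in extr.directory]
    parts.append(extr.filename.format_map(kw))
    return "/".join(parts)


child = ToyExtractor("directlink")  # options resolved as "directlink"
child.category = "plurk"            # hypothetical transfer, applied afterwards

meta = {
    "domain": "images.plurk.com",
    "path": "",
    "filename": "1LjmLKh7htkja9vB6z7EB3",
    "extension": "png",
}
path = build_path(child, meta)
print(path)
```

In this model the directory follows the transferred category name because its placeholder is filled in late, while the filename stays at the child's default because the "plurk" options were never consulted. If that is roughly what happens, it would explain the result above, but I haven't checked the source.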

Also, setting parent-directory to true instead of category-transfer doesn't seem to work. It just downloads everything to directlink's default folder again.

Another question: why is the plurk extractor considered a manga chapter extractor?

nisehime commented 3 years ago

@mikf It still doesn't seem to work properly: the image-filter set for the plurk extractor is not passed to the directlink extractor.