Questions, Feedback and Suggestions #3

mikf commented 5 years ago

Continuation of the old issue as a central place for any sort of question or suggestion not deserving their own separate issue. There is also https://gitter.im/gallery-dl/main if that seems more appropriate.

Links to older issues: #11, #74

rachmadaniHaryono commented 5 years ago

simple snippet to turn gallery-dl into api

from types import SimpleNamespace
from unittest.mock import patch, Mock
import os

import click
from flask.cli import FlaskGroup
from flask import (
    Flask,
    jsonify,
    request,
)

from gallery_dl import main, option
from gallery_dl.job import DataJob

def get_json():
    data = None
    parser = option.build_parser()
    args = parser.parse_args()
    args.urls = request.args.getlist('url')
    if not args.urls:
        return jsonify({'error': 'No url(s)'})
    args.list_data = True

    class CustomClass:
        data = []

        def run(self):
            dj = DataJob(*self.data_job_args, **self.data_job_kwargs)
            dj.run()
            self.data.append({
                'args': self.data_job_args,
                "kwargs": self.data_job_kwargs,
                'data': dj.data
            })

        def DataJob(self, *args, **kwargs):
            self.data_job_args = args
            self.data_job_kwargs = kwargs
            retval = SimpleNamespace()
            retval.run = self.run
            return retval

    c1 = CustomClass()
    with patch('gallery_dl.option.build_parser') as m_bp, \
            patch('gallery_dl.job.DataJob', side_effect=c1.DataJob) as m_jt:
        #  m_option.return_value.parser_args.return_value = args
        m_bp.return_value.parse_args.return_value = args
        m_jt.__name__ = 'DataJob'
        main()
        data = c1.data
    return jsonify({'data': data, 'urls': args.urls})

def create_app(script_info=None):
    """create app."""
    app = Flask(__name__)
    app.add_url_rule(
        '/api/json', 'gallery_dl_json', get_json)
    return app

@click.group(cls=FlaskGroup, create_app=create_app)
def cli():
    """This is a script for application."""
    pass

if __name__ == '__main__':
    cli()

e: this could be simple when using direct DataJob to handle the urls, but i haven't check if there is anything have to be done before initialize DataJob instance

mikf commented 5 years ago

this could be simple when using direct DataJob to handle the urls, but i haven't check if there is anything have to be done before initialize DataJob instance.

You don't need to do anything before initializing any of the Job classes:

>>> from gallery_dl import job
>>> j = job.DataJob("https://imgur.com/0gybAXR")
>>> j.run()
[ ... ]

You can initialize anything logging related if you want logging output, or call config.load() and config.set(...) if you want to load a config file and set some custom options, but none of that is necessary.

DonaldTsang commented 5 years ago

@rachmadaniHaryono what does that code do?

rachmadaniHaryono commented 5 years ago

simpler api (based on above suggestion)

#!/usr/bin/env python
from types import SimpleNamespace
from unittest.mock import patch, Mock
import os

import click
from flask.cli import FlaskGroup
from flask import (
    Flask,
    jsonify,
    request,
)

from gallery_dl import main, option
from gallery_dl.job import DataJob
from gallery_dl.exception import NoExtractorError

def get_json():
    data = []
    parser = option.build_parser()
    args = parser.parse_args()
    args.urls = request.args.getlist('url')
    if not args.urls:
        return jsonify({'error': 'No url(s)'})
    args.list_data = True
    for url in args.urls:
        url_res = None
        error = None
        try:
            job = DataJob(url)
            job.run()
            url_res = job.data
        except NoExtractorError as err:
            error = err
        data_item = [url, url_res, {'error': str(error) if error else None}]
        data.append(data_item)
    return jsonify({'data': data, 'urls': args.urls})

def create_app(script_info=None):
    """create app."""
    app = Flask(__name__)
    app.add_url_rule(
        '/api/json', 'gallery_dl_json', get_json)
    return app

@click.group(cls=FlaskGroup, create_app=create_app)
def cli():
    """This is a script for application."""
    pass

if __name__ == '__main__':
    cli()

rachmadaniHaryono commented 5 years ago

gallery_dl_gug gug for hydrus (port 5013)

DonaldTsang commented 5 years ago

@rachmadaniHaryono instructions on using this GUG and combing it with Hydrus? Any pre-configurstions besides pip3 install gallery-dl ?

rachmadaniHaryono commented 5 years ago

put that on script (e.g. script.py)
import gug into hydrus
pip3 install flask gallery-dl (add --user if needed)
run python3 script.py --port 5013

DonaldTsang commented 5 years ago

@rachmadaniHaryono add that to the Wiki in https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts if you can, sounded like a really good solution. Also, why port 5013, is that port specifically used for something?

rachmadaniHaryono commented 5 years ago

Also, why port 5013, is that port specifically used for something

not a really technical reason. i just use it because the default port is used for my other program.

add that to the Wiki in CuddleBear92/Hydrus-Presets-and-Scripts if you can

i will consider it, because i'm not sure where to put that

another plan is fork (or create pr) for server command but i'm not sure if @mikf want pr for this

DonaldTsang commented 5 years ago

@rachmadaniHaryono https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/wiki Also I would like @mikf to have a look at this, since this is pretty useful. BTW, what is the speed overhead of using this over having a separate txt file like the one in https://github.com/Bionus/imgbrd-grabber/issues/1492 ?

rachmadaniHaryono commented 5 years ago

BTW, what is the speed overhead of using this over having a separate txt file like the one in Bionus/imgbrd-grabber#1492 ?

this depend on hydrus vs imgbrd-grabber download speed. from my test gallery-dl give direct link, so hydrus don't have to process the link anymore.

mikf commented 5 years ago

another plan is fork (or create pr) for server command but i'm not sure if @mikf want pr for this

I've already had something similar to this in mind (implementing a (local) server infrastructure to (remotely) send commands / queries: gallery-dl --server), so I would be quite in favor of adding functionality like this. But I'm not so happy about adding flask as a dependency, even if optional. I just generally dislike adding dependencies if they aren't absolutely necessary. I was thinking of using stuff from the http.server module in Python's standard library if possible. Also: the script you posted here should be simplified quite a bit further. For example there is no need to build an command line option parser. I'll see if I can get something to work on my own.

A few questions from me concerning Hydrus

The whole thing is written in Python, even version 3 since the last update. Isn't there a better way of coupling it with another Python module than a HTTP server? As in "is it possible to add a native "hook" to make it call another Python function"?
Is there any documentation for the request and response data formats Hydrus sends to and expects from GUG's? I've found this, but that doesn't really explain how Hydrus interacts with other things.

rachmadaniHaryono commented 5 years ago

But I'm not so happy about adding flask as a dependency, even if optional. I just generally dislike adding dependencies if they aren't absolutely necessary. I was thinking of using stuff from the http.server module in Python's standard library if possible.

this still depend on how big will this be. will it just be an api or there will be html interface for this. although an existing framework will make it easier and the plugin for the framework will let other developer create more feature they want.

of course there is more better framework than flask as example, e.g. sanic, django but i actually doubt if using the standard will be better than those.

Also: the script you posted here should be simplified quite a bit further. For example there is no need to build an command line option parser.

that is modified version from flask cli example. flask can do that simpler but it require to set up variable environment which add another command

The whole thing is written in Python, even version 3 since the last update. Isn't there a better way of coupling it with another Python module than a HTTP server? As in "is it possible to add a native "hook" to make it call another Python function"?

hydrus dev is planned to make api for this on the next milestone. there is also other hydrus user which make unofficial api but he didn't make one for download yet. so either wait for it or use existing hydrus parser

Is there any documentation for the request and response data formats Hydrus sends to and expects from GUG's? I've found this, but that doesn't really explain how Hydrus interacts with other things.

hydrus expect either html and json and try to extract data based on the parser the user made/import. i make this one for html but it maybe changed on future version https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/guide/create_parser_furaffinity.md .

if someone want to make one, they can try made api similar to 4chan api,copy the structure and use modified parser from existing 4chan api.

my best recommendation is to try hydrus parser directly and see what option is there. ask hydrus discord channel if anything is unclear

wankio commented 5 years ago

can gallery-dl support weibo ? i found this https://github.com/nondanee/weiboPicDownloader but it take too long to scan and dont have ability to skip downloaded files

mikf commented 5 years ago

@rachmadaniHaryono I opened a new branch for API server related stuff. The first commit there implements the same functionality as your script, but without external dependencies. Go take a look at it if you want.

And when I said your script "should be simplified ... further" I didn't mean it should use less lines of code, but less resources in term of CPU and memory. Python might not be the right language to use when caring about things like that, but there is still no need to call functions that effectively do nothing - command-line argument parsing for example.

rachmadaniHaryono commented 5 years ago

will it be only api or will there will be html interface @mikf?

e: i will comment the code on the commit

mikf commented 5 years ago

I don't think there should be an HTML interface directly inside of gallery-dl. I would prefer it to have a separate front-end (HTML or whatever) communicating with the API back-end that's baked into gallery-dl itself. It is a more general approach and would allow for any programing language and framework to more easily interact with gallery-dl, not just Python.

rachmadaniHaryono commented 5 years ago

gallery_dl_gug

based on https://github.com/mikf/gallery-dl/commit/8662e72bdd80f0158c5d73cccc1d1777f5fbaf33
album.title is now parsed as album tag
source url and download url are minimum 2 character (fix host:port/api/json/1 error)
description is not None or none

still on port 5013

wankio commented 5 years ago

About twitter extractor, we have limited request depend on how many tweets user had right ? if user have over 2k+ media, 99% it can't download full media

mikf commented 5 years ago

@wankio The Twitter extractor gets the same tweets you would get by visiting a timeline in your browser and scrolling down until no more tweets get dynamically loaded. I don't know how many tweets you can access like that, but Twitter's public API has a similar restriction::

https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html

This method can only return up to 3,200 of a user's most recent Tweets. Native retweets of other statuses by the user is included in this total, regardless of whether include_rts is set to false when requesting this resource.

You could try ripme. It uses the public API instead of a "hidden", browser-only API like gallery-dl. Maybe you can get more results with that.

wankio commented 5 years ago

but if i remember, ripme rip all tweet/retweet not just user tweet

schleifen commented 5 years ago

For some reason the login with OAuth and App Garden tokens or the -u/-p commands doesn't work with flickr which makes images that require a login to view them not downloadable. But otherwise amazing tool, thank you so much!

wankio commented 5 years ago

today when i'm checking e-hentai/exhentai, it just stucked forever. maybe my ISP is the problem because i can't access e-hentai but exhentai still ok. So i think Oauth should help, using cookies instead id+password to bypass

ghost commented 5 years ago

is there a way to download files directly in a specified folder instread of subfolders? for exemple for the picture to be downloaded in F:\Downloaded\ i tried using gallery-dl -d "F:\Downloaded\" https://imgur.com/a/xcEl2WW but instead they get downloaded to F:\Downloaded\imgur\xcEl2WW - Inklings is there an argument i could add to the command to fix that?

mikf commented 5 years ago

@Mattlau04 Short answer: set extractor.directory to an empty string: -o directory=""

Long answer: The path for downloaded files is build from three components:

a static base-directory (that's what you set with -d/--dest)
directory: a list of format strings; one for each path segment
filename: another format string

You can configure all three of them to fit your needs in your config file, but specifying a format string on the command-line can be rather cumbersome and has therefore no extra command-line argument.
You can however use -o/--option to set any option value and removing the dynamic directory part should do what you want.

ghost commented 5 years ago

thanks a lot for the help!

ghost commented 5 years ago

Huh sorry to ask so much stuff in so little time, but in a batchfile, i have this command : gallery-dl -o directory="" -o filename="{id}_{tags}" -d "%~dp0\gallery-dl\images\hypnohub" https://hypnohub.net/post?tags=splatoon and it download the first 4 files fine but then it give me OSError: [Errno 22] Invalid argument here is the verbose:

Output

[gallery-dl][debug] Version 1.8.2-dev
[gallery-dl][debug] Python 3.6.7 - Windows-10-10.0.17134-SP0
[gallery-dl][debug] requests 2.20.1 - urllib3 1.24.1
[gallery-dl][debug] Starting DownloadJob for 'https://hypnohub.net/post?tags=splatoon'
[hypnohub][debug] Using HypnohubTagExtractor for 'https://hypnohub.net/post?tags=splatoon'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): hypnohub.net:443
[urllib3.connectionpool][debug] https://hypnohub.net:443 "GET /post.json?tags=splatoon&limit=50&page=1 HTTP/1.1" 200 None
# F:\Auto upload full splatoon doujin colection\\gallery-...l_eyes splatoon symbol_in_eyes taka-michi topless towel wet
# F:\Auto upload full splatoon doujin colection\\gallery-... nintendo splatoon tech_control tentacles tongue tongue_out
# F:\Auto upload full splatoon doujin colection\\gallery-...ndo splatoon tech_control tentacles tongue tongue_out visor
# F:\Auto upload full splatoon doujin colection\\gallery-... nintendo splatoon tech_control tentacles tongue tongue_out
# F:\Auto upload full splatoon doujin colection\\gallery-... nintendo splatoon tech_control tentacles tongue tongue_out
[urllib3.connectionpool][debug] https://hypnohub.net:443 "GET //data/image/b30b984c7e231cd2ad5d55aaa533cad6.jpg HTTP/1.1" 200 137174
  F:\Auto upload full splatoon doujin colection\\gallery-...ch_control tentacles thighhighs tongue tongue_out underwear
[hypnohub][error] Unable to download data:  OSError: [Errno 22] Invalid argument: '\\\\?\\F:\\Auto upload full splatoon doujin colection\\gallery-dl\\images\\hypnohub\\77610_ahegao blush bottomless breasts breasts_outside callie_(splatoon) civibes cum cum_in_pussy dazed earrings elf_ears empty_eyes female_only femsub gloves hypnotic_accessory large_breasts lying mole nintendo open_clothes open_mouth panties pussy shirt_lift splatoon splatoon_2 spread_legs sunglasses sweat tank_top tech_control tentacles thighhighs tongue tongue_out underwear.part'
[hypnohub][debug]
Traceback (most recent call last):
  File "c:\users\mattl\appdata\local\programs\python\python36\lib\site-packages\gallery_dl\job.py", line 55, in run
    self.dispatch(msg)
  File "c:\users\mattl\appdata\local\programs\python\python36\lib\site-packages\gallery_dl\job.py", line 99, in dispatch
    self.handle_url(url, kwds)
  File "c:\users\mattl\appdata\local\programs\python\python36\lib\site-packages\gallery_dl\job.py", line 210, in handle_url
    if not self.download(url):
  File "c:\users\mattl\appdata\local\programs\python\python36\lib\site-packages\gallery_dl\job.py", line 279, in download
    return downloader.download(url, self.pathfmt)
  File "c:\users\mattl\appdata\local\programs\python\python36\lib\site-packages\gallery_dl\downloader\common.py", line 43, in download
    return self.download_impl(url, pathfmt)
  File "c:\users\mattl\appdata\local\programs\python\python36\lib\site-packages\gallery_dl\downloader\common.py", line 106, in download_impl
    with pathfmt.open(mode) as file:
  File "c:\users\mattl\appdata\local\programs\python\python36\lib\site-packages\gallery_dl\util.py", line 509, in open
    return open(self.temppath, mode)
OSError: [Errno 22] Invalid argument: '\\\\?\\F:\\Auto upload full splatoon doujin colection\\gallery-dl\\images\\hypnohub\\77610_ahegao blush bottomless breasts breasts_outside callie_(splatoon) civibes cum cum_in_pussy dazed earrings elf_ears empty_eyes female_only femsub gloves hypnotic_accessory large_breasts lying mole nintendo open_clothes open_mouth panties pussy shirt_lift splatoon splatoon_2 spread_legs sunglasses sweat tank_top tech_control tentacles thighhighs tongue tongue_out underwear.part'

mikf commented 5 years ago

There are too many tags and the filename got too long (> 255 bytes).

You can shorten the tags string to for example 200 characters with {tags[:200]}, or you use {tags:L200/too many tags/} to replace the content of {tags} with too many tags if it exceeds 200 characters.

You should also consider using a config file. It's a lot more readable than packing everything into command-line arguments.

ghost commented 5 years ago

is there no way to remove the 255 bytes limit?

mikf commented 5 years ago

No, there isn't. This is an inherent limitation of most filesystems (see Comparison of file systems (*)).

Instead of saving an image's tags in its filename, you could store it in a separate file with --write-tags.

(*) NTFS has a limit of 255 UTF-16 code units, not bytes, but that doesn't make much of a difference here.

KaMyKaSii commented 5 years ago

@mikf after almost two years using gallery-dl, I finally decided to use the archive function. I added the parameter in my configuration file, but only the new media downloaded are written to the file, the previously downloaded media are checked one by one before the command is finished. Is it possible to force previously downloaded media to be written? And do you recommend me to archive globally or archive per extractor? Thank you!

Edit1: I believe that set "skip" to "false" is a solution but I would like one that does not need to download the media files again. Edit2: "abort" is another solution but that would not force writing on the archive file, just workahound the problem

mikf commented 5 years ago

Is it possible to force previously downloaded media to be written?

See #261

And do you recommend me to archive globally or archive per extractor?

It will work either way, but maybe several smaller SQLite3 database files are better/faster than 1 massive one ... not sure. I'm not using an archive file myself, but I'd probably have a general archive file as well as individual ones for my most used sites.

indrakaw commented 4 years ago

Comment out JSON file. A simple comment, like sublime config does (//).

joon90 commented 4 years ago

Is there any way for the extractor to support search urls from Twitter? The normal download doesn't reach far enough into old tweets, but on the website you can search between specific dates.

ra9034 commented 4 years ago

How can I download tweet texts in separate txt files? I have enabled the "content" property, using the following config (don't think its correct, but I have no idea...):

{
    "extractor":
    {
        "twitter":
        {
            "content": true,
            "postprocessors": [{
            "name": "metadata",
            "mode": "custom",
            "extension": "txt",
            "format": "{content}\n"
            }]
        }
    }
}

As a result, txt files are created, but they contain only the word "None".

Also, is it possible to save the text of those tweets in which there are no media?

mikf commented 4 years ago

Comment out JSON file. A simple comment, like sublime config does (//).

Renaming the keys you want to be ignored isn't an option? (e.g. "option" -> "_option")

Is there any way for the extractor to support search urls from Twitter?

Sure, when I get to it. The Twitter extractors will have to be adjusted to the new layout etc. at some point and I might as well add support for searches then as well.

How can I download tweet texts in separate txt files? I have enabled the "content" property, using the following config (don't think its correct, but I have no idea...)

Hmm, your config file looks OK and does exactly what it's supposed to on my end:

gallery-dl -c your_config.json https://twitter.com/supernaturepics/status/604341487988576256

produces a text file with Big Wedeene River, Canada in it, like it should.

Is there a content field in the output of gallery-dl -j <tweet-url>? And what version are you running?

ra9034 commented 4 years ago

Okay, i got it :) I used old 1.8.7 version (windows executable, downloaded from mainpage in July). Now I replaced it with new 1.10.1 exe, and everything goes fine. Thanks!

And returning to the my previous message, is it possible to save the text of those tweets in which there are no media?

inthebrilliantblue commented 4 years ago

Is it possible to add to the ehentai / exhentai extractor the ability to submit an archive download to the Hentai@Home downloader? That way people can avoid spending image limits and GP?

DonaldTsang commented 4 years ago

@inthebrilliantblue care to elaborate?

inthebrilliantblue commented 4 years ago

@inthebrilliantblue care to elaborate?

On ehentai / exhentai, you can host a cache server called H@H (Hentai at Home). This gives you the ability to, when downloading an album archive through their website, to have your H@H server do it. This would be able to avoid having to load each image through gallery-dl and instead allow an ehentai user to just submit download requests through H@H.

There is an Archive Download link on each image album. Clicking it pulls up a popup that has some options for downloading. On the bottom is the H@H links for multiple quality versions. To the right is the Original Upload selection.

So my question is, using an ehentai login and search term, would it be possible to trigger the Original archive download link so that H@H downloads the album instead of gallery-dl? hhoriginallink

TheGlassEyedVillian commented 4 years ago

Is there any way to add extractor option for Reddit to save filenames as per the name of their posts?

I've tried --list-keywords option but there were no arguments for Keywords for filename in the output How should I configure the extractor so that it downloads the posts with the filename being the name of the post

DaWrecka commented 4 years ago

So I have two issues, but they both pretty much revolve around file and/or directory names.

under extensions.flickr in my config.json, I have these options set: "directory": [ "{user[username]}", "{album[title]}" ], "filename": "{user[username]}-{album[title]}_{id}.{extension}", which works fine if I'm downloading someone's albums. But sometimes I want to download images which aren't in an album - and in that case, the album title is just None in both the directory name and the file name. Is it possible to omit the {album[title]} specifier (and the trailing underscore in filename) completely if and when its value is None?

Second, until recently I've been using another, more-awkward downloader for getting images from Twitter. Despite the awkwardness I have a LOT of images downloaded from that, such that if I can get gallery-dl to download using the same filename pattern that would actually be easier than renaming all the files I already have to match gallery-dl.

The pattern used by the other downloader is roughly equal to "{author[name]}-{tweet_id}-{date:%Y%m%d_%H%M%S}-{'vid' if extension=='mp4' else 'img'}{num}.{extension}". (not-so-coincidentally, this is the pattern I've been testing as extractor.twitter.filename in my config.json) Which is to say, if the downloaded media is an MP4, then it will have the text vid in front of the index, whereas if it's an image, it'll read img. For example: NekoNicoKig-1185043490742460416-20191018_050239-vid1.mp4 NekoNicoKig-1177421053926223873-20190927_041349-img1.jpg are each names of files I have downloaded already. Now, the filename pattern I'm using above is, I'm given to understand, valid Python syntax when used in something called an f-string, (I'm using Python 3.8.2, for the record) but apparently the filename isn't an f-string in gallery-dl. That, or I'm doing something wrong. Is there anything I can do from this end? Am I doing something wrong?

mikf commented 4 years ago

@DaWrecka There are two possible ways to go about your first problem:

You could set another pair of filename/directory format strings for the image subcategory:

"directory": ["{user[username]}", "{album[title]}"],
"filename": "{user[username]}-{album[title]}_{id}.{extension}",
"image": {
    "directory": ["{user[username]}"],
    "filename": "{user[username]}-{id}.{extension}",
},

or you specify the album[title] field as optional, for example {album[title]:?/-/} (more about "special" formatting options here)

Having a different filename for videos might be a bit more involved. Either go the youtube-dl route (https://github.com/mikf/gallery-dl/issues/533), or chain a couple of replace operations to transform mp4 into vid and everything else into img: {extension:Rmp4/vid/Rjpg/img/Rpng/img/Rgif/img/}

f-strings would be really nice here, I agree, but dynamic user-specified f-strings aren't possible as far as I know, so they aren't really an option here.

dyewts commented 3 years ago

Is there a way to recursively download external content that's linked in Patreon posts? Many link to drive.google.com/drive/folders/URL, drive.google.com/file/d/URL, Imgur etc., especially since the new policies were added. I tried it with "r:patreon.com/URL" and it does follow URLs, but not the right ones. Apologies if this was already answered elsewhere.

null-zero commented 3 years ago

For patreon have a method of extracting and sorting posts by the tags and into tag folders. Currently the extractor doesn't actually save the tags

stranger-danger-zamu commented 3 years ago

So I've noticed that a lot of the bugs and suggestions for features are essentially asking for more control when manipulating the metadata for directory or filename generation.

Have you looked at using Jinja2 templating and their custom filters? A lot of the current custom formatting could be implemented as simple functions that operate on input strings or lists of strings. This can be further extended by allowing users to submit filters to be included in gallery-dl (also possible is specifying a run time import of a user-defined *.py file with their own filters).

The biggest benefit would be able to push a dictionary into the Jinja2 template and the customer filters would be able to operate on the entire object or just on attributes as well as calling any built in python function within the template.

rautamiekka commented 3 years ago

is there no way to remove the 255 bytes limit?

To expand on what @mikf said: both Linux and Window$ (IIRC; no clue about Mac) will fail to write too long names by a massively unhelpful and confusing error about corruption or not existing despite neither being the case.

And since he mentioned keywords in filenames: it's an absolutely bad idea to put keywords into filenames cuz unless the website has write-protected them you're risking the keywords being changed, which potentially changes the resulting filename, like in the case of Derpibooru where each upload gets an incremental integer, and if you opt to download with the keywords included and the keywords change => dupe file.

^ Same for anything that can change, like, for ex, if a piece is about a sunset and thus named "beautiful sunset in X.png", then gets renamed "Beautiful sunset in X.png" => dupe file on non-Window$, and depending on the website will become an invalid filename on their end. Thus, an [U]UID, combined with correct folder structure, is the best.

tutuw002 commented 3 years ago

I don't know if this has been mentioned before, but a download progress bar would be great to keep track of how much you have downloaded, especially for big files.

The above is an example from PixivUtil.

Hrxn commented 3 years ago

Not sure about that, to be honest. This might give you an ETA for a single file that is currently downloaded, but if you use gallery-dl like it's probably used by most, i.e. in downloading big galleries/collections or entire user profiles/accounts, this won't help you at all, because gallery-dl simply downloads everything returned by the site, like as it is returned by a site's API for example, and therefore a somewhat accurate prediction of the time it takes to finish the entire process is not really possible to do. So, I'm not entirely convinced about the usefulness here..

kattjevfel commented 3 years ago

On the contrary, I'd like a mode where it shows even less. I'd like it to list files downloaded, and completely skip ones already completed. I should probably make this a feature request though.

mikf / gallery-dl

Questions, Feedback and Suggestions #3 #146