mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[v2.0] Overview - Planned Features #5006

Open mikf opened 8 months ago

mikf commented 8 months ago

A rough, incomplete overview of features and changes I want to implement in v2.0: (will be updated as time goes on)

Let me know if I should be more specific on a topic.

Possible things that might also happen:

ghbook commented 8 months ago

Very much interested in support of this feature in v2.0. Config file is getting huge due to lack of inheritance between two different categories. Not sure if this is possible with new changes. https://github.com/mikf/gallery-dl/discussions/4632
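
To make the request concrete, here is a minimal sketch of what inheritance between categories could look like. The `inherit` key, the `resolve` and `deep_merge` helpers, and the example values are all hypothetical; none of them is an existing gallery-dl option or API.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where values from `override` win over `base`."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def resolve(config, category, _seen=None):
    """Resolve one category, following a hypothetical "inherit" key."""
    seen = set() if _seen is None else _seen
    if category in seen:
        raise ValueError(f"inheritance cycle at {category!r}")
    seen.add(category)

    section = dict(config.get(category, {}))
    parent = section.pop("inherit", None)  # hypothetical key, not a real option
    if parent is None:
        return copy.deepcopy(section)
    return deep_merge(resolve(config, parent, seen), section)


# Illustrative config: "site-b" reuses everything from "site-a"
# and overrides a single value.
config = {
    "site-a": {
        "filename": "{id}.{extension}",
        "directory": ["{category}", "{user}"],
        "sleep-request": 1.0,
    },
    "site-b": {
        "inherit": "site-a",
        "sleep-request": 2.5,
    },
}

resolved = resolve(config, "site-b")
```

With something like this, the second category only lists what differs from the first, which is what would keep the file from growing.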

thatfuckingbird commented 8 months ago

Some thoughts from my POV, building an application relying heavily on gallery-dl:

This will hurt, but is understandable. Overall I don't worry too much about it because the metadata files already change sometimes (mostly due to fixes and expansions) and even the optimal configuration changes sometimes, like when new better options are introduced. So sometimes having to update configs is part of the deal already. What I hope is that the ability to specify some options on the command line and some in config files remains (and also using multiple config files like now).

I hope this will be done in a backward-compatible manner (or at least be easy to migrate). I quite like the current simple format of the archive; personally, I wouldn't complicate it unless it's actually needed for some feature. The post vs. file distinction also needs care: even on sites where these are not the same, posts can be updated later with more images, so recording just posts in the archive would skip some files.
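
A small sketch of the post vs. file concern, using plain Python sets as a stand-in for the archive. The key formats and function names are made up for illustration and do not reflect how gallery-dl actually stores archive entries.

```python
def run_post_level(posts, archive):
    """Record one archive entry per post; return the files fetched."""
    fetched = []
    for post_id, files in posts.items():
        if post_id in archive:
            continue  # whole post is skipped, even if it gained files
        fetched.extend(files)
        archive.add(post_id)
    return fetched


def run_file_level(posts, archive):
    """Record one archive entry per file; return the files fetched."""
    fetched = []
    for post_id, files in posts.items():
        for index, name in enumerate(files):
            key = f"{post_id}_{index}"  # hypothetical key format
            if key in archive:
                continue
            fetched.append(name)
            archive.add(key)
    return fetched


# First run: the post has one image. Later it is updated with a second one.
first_run = {"p1": ["a.jpg"]}
after_update = {"p1": ["a.jpg", "b.jpg"]}

post_archive, file_archive = set(), set()
run_post_level(first_run, post_archive)
run_file_level(first_run, file_archive)

missed = run_post_level(after_update, post_archive)
caught = run_file_level(after_update, file_archive)
```

Here `missed` ends up empty, while `caught` contains only the newly added `b.jpg`, which is the behavior a post-only archive would lose.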

The dream would be to have some kind of support for continuing gallery downloads where they left off (even if, let's say, the download is aborted due to the system crashing). Unfortunately, this could only be achieved with site-specific support in each extractor. It might be something to leave as an optional thing for each extractor, but I'm not even sure it's worth the work and complexity on this level.

  • a better method of mapping URLs to extractors

    • this currently involves a linear search through regular expressions; somehow good enough, but anything but efficient

I think linear search is just fine. What I would like though is to have some way to expand the regex in the configuration so when a new site like fxtwitter, vxtwitter etc. pops up it can be handled on the configuration level instead of updating the extractor code.
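
Roughly what I have in mind, as a sketch only: keep the linear search, but build each pattern from a domain list that the config can extend. The extractor table, the `build_patterns` and `find_extractor` helpers, and the config layout are all hypothetical and are not gallery-dl's real patterns or options.

```python
import re

# Hypothetical extractor table: (name, built-in domains, path pattern).
BASE_EXTRACTORS = [
    ("twitter", ["twitter.com", "x.com"], r"/[^/]+/status/(\d+)"),
    ("imgur", ["imgur.com"], r"/gallery/(\w+)"),
]


def build_patterns(extractors, extra_domains):
    """Compile one regex per extractor, adding user-configured domains."""
    compiled = []
    for name, domains, path in extractors:
        all_domains = list(domains) + list(extra_domains.get(name, []))
        alternation = "|".join(re.escape(d) for d in all_domains)
        pattern = re.compile(
            r"(?:https?://)?(?:www\.)?(?:" + alternation + ")" + path
        )
        compiled.append((name, pattern))
    return compiled


def find_extractor(url, patterns):
    """Linear search: return (name, captured id) for the first match."""
    for name, pattern in patterns:
        match = pattern.match(url)
        if match:
            return name, match.group(1)
    return None


# Hypothetical config fragment mapping an extractor to extra domains.
user_config = {"twitter": ["fxtwitter.com", "vxtwitter.com"]}

default_patterns = build_patterns(BASE_EXTRACTORS, {})
extended_patterns = build_patterns(BASE_EXTRACTORS, user_config)

url = "https://fxtwitter.com/user/status/12345"
without_config = find_extractor(url, default_patterns)
with_config = find_extractor(url, extended_patterns)
```

Without the extra domains the URL is not recognized; with them it resolves to the same extractor as the original site, with no change to the extractor code.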

+1 wish: please keep the log format the same or at least as nicely parseable as it is now.

Edit: forgot to say but I do like the general direction things are going, looking forward to hearing more esp. about the extractor rework.