Open mikf opened 8 months ago
Very much interested in support of this feature in v2.0. Config file is getting huge due to lack of inheritance between two different categories. Not sure if this is possible with new changes. https://github.com/mikf/gallery-dl/discussions/4632
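To make the inheritance idea concrete, here is a minimal Python sketch of how one category could reuse another category's options. The `inherit` key, the `resolve` helper, and the category and option names are all hypothetical illustrations, not existing gallery-dl behaviour.

```python
# Hypothetical sketch: an "inherit" key lets one category reuse another's options.
# Neither the key name nor this resolution logic exists in gallery-dl today.

def resolve(config, category, _seen=None):
    """Return the options for `category`, merged over its inherited parent."""
    seen = _seen or set()
    if category in seen:
        raise ValueError(f"inheritance cycle at {category!r}")
    seen = seen | {category}

    options = dict(config.get(category, {}))
    parent = options.pop("inherit", None)
    if parent is None:
        return options
    merged = resolve(config, parent, seen)
    merged.update(options)  # child options override inherited ones
    return merged


# Made-up categories and options, for illustration only.
config = {
    "base-site": {"videos": True, "sleep": 2, "directory": ["{category}", "{user}"]},
    "mirror-site": {"inherit": "base-site", "videos": False},
}

print(resolve(config, "mirror-site"))
```

With something like this, the second category only needs to list the options that differ from the first.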
Some thoughts from my POV, building an application relying heavily on gallery-dl:
- config changes ([v2.0] Configuration File Changes #2203)
- better, more consistent names for metadata & options ([Enhancement] Unify the many keyword naming schemes #1646)
This will hurt, but is understandable. Overall I don't worry too much about it because the metadata files already change sometimes (mostly due to fixes and expansions) and even the optimal configuration changes sometimes, like when new better options are introduced. So sometimes having to update configs is part of the deal already. What I hope is that the ability to specify some options on the command line and some in config files remains (and also using multiple config files like now).
--download-archive rework
- use archive for posts/etc and not just files (DeviantartExtractor - check archive sooner #317)
- use highly advanced SQL features like tables
- native continuation/cursor and update support
I hope this will be done in a backward-compatible manner (or at least be easy to migrate). I quite like the current simple format of the archive; personally, I wouldn't complicate it unless it's actually needed for some feature. You also have to be careful with the post vs. file distinction: even on sites where these are not the same, posts can be updated later with more images, so recording just posts in the archive would skip some files.
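A rough sketch of the post vs. file concern, using Python's `sqlite3`. The `files` table, its columns, and the helper names are hypothetical and are not gallery-dl's actual archive format. The point is only that the skip decision stays per file even when entries are grouped by post.

```python
import sqlite3

# Hypothetical schema: one row per downloaded file, tagged with its post.
# Illustration only, not gallery-dl's actual archive layout.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE files (post_id TEXT, file_id TEXT, PRIMARY KEY (post_id, file_id))"
)

def new_files(con, post_id, file_ids):
    """Return the files of a post that are not in the archive yet."""
    rows = con.execute("SELECT file_id FROM files WHERE post_id = ?", (post_id,))
    known = {row[0] for row in rows}
    return [f for f in file_ids if f not in known]

def record(con, post_id, file_ids):
    """Mark the given files of a post as downloaded."""
    con.executemany(
        "INSERT OR IGNORE INTO files VALUES (?, ?)",
        [(post_id, f) for f in file_ids],
    )
    con.commit()

record(con, "post1", ["a.jpg", "b.jpg"])
# The post is later edited to include a third image.
print(new_files(con, "post1", ["a.jpg", "b.jpg", "c.jpg"]))
```

If only `post1` itself were recorded, the third image would be skipped along with the rest of the post.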
The dream would be some kind of support for continuing gallery downloads where they left off (even if, let's say, the download is aborted because the system crashed). Unfortunately, this could only be achieved with site-specific support in each extractor. It might be something to leave as an optional feature for each extractor, but I'm not even sure it's worth the work and complexity at this level.
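As a sketch of what crash-safe continuation could look like, the Python below persists a pagination cursor after each page and resumes from it on the next run. Everything here is an assumption for illustration: the JSON state file, the function names, and `fetch_page`, which stands in for whatever site-specific paging an extractor would need.

```python
import json
import os
import tempfile

def load_cursor(path, url):
    """Return the saved cursor for `url`, or None if there is none."""
    try:
        with open(path, encoding="utf-8") as fp:
            return json.load(fp).get(url)
    except FileNotFoundError:
        return None

def save_cursor(path, url, cursor):
    """Persist the cursor atomically so a crash cannot leave a half-written file."""
    try:
        with open(path, encoding="utf-8") as fp:
            state = json.load(fp)
    except FileNotFoundError:
        state = {}
    state[url] = cursor
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w", encoding="utf-8") as fp:
        json.dump(state, fp)
    os.replace(tmp, path)  # atomic rename over the old state file

def fetch_page(cursor):
    """Stand-in for a site API (hypothetical): returns (items, next_cursor)."""
    pages = {None: (["a", "b"], "p2"), "p2": (["c"], "p3"), "p3": (["d"], None)}
    return pages[cursor]

def download(path, url, limit=None):
    """Process pages from the saved cursor onward; `limit` simulates an abort."""
    cursor = load_cursor(path, url)
    done = []
    while True:
        items, next_cursor = fetch_page(cursor)
        done.extend(items)
        if next_cursor is None:
            break
        save_cursor(path, url, next_cursor)
        cursor = next_cursor
        if limit is not None and len(done) >= limit:
            break
    return done
```

A crash in the middle of a page would still repeat that page on the next run, so an archive of finished files would be needed alongside the cursor.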
a better method of mapping URLs to extractors
- this currently involves a linear search through regular expressions: somehow good enough, but anything but efficient
I think linear search is just fine. What I would like though is to have some way to expand the regex in the configuration so when a new site like fxtwitter, vxtwitter etc. pops up it can be handled on the configuration level instead of updating the extractor code.
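Roughly what I have in mind, as a Python sketch: the extractor keeps its path pattern, and extra domains come from the user's configuration. The `EXTRACTORS` table, the `user_config` layout, and both helper functions are hypothetical and do not reflect how gallery-dl defines its patterns.

```python
import re

# Hypothetical extractor table: (name, built-in domains, path pattern).
EXTRACTORS = [
    ("twitter", ["twitter.com", "x.com"], r"/(?P<user>[^/]+)/status/(?P<id>\d+)"),
    ("example", ["example.org"], r"/gallery/(?P<id>\d+)"),
]

def build_patterns(extractors, extra_domains):
    """Compile one regex per extractor, adding user-configured domains."""
    compiled = []
    for name, domains, path in extractors:
        hosts = domains + extra_domains.get(name, [])
        host_re = "|".join(re.escape(h) for h in hosts)
        pattern = rf"(?:https?://)?(?:www\.)?(?:{host_re}){path}"
        compiled.append((name, re.compile(pattern)))
    return compiled

def find_extractor(patterns, url):
    """Linear search: return the first extractor whose pattern matches."""
    for name, pattern in patterns:
        match = pattern.match(url)
        if match:
            return name, match.groupdict()
    return None

# Extra domains supplied by the user instead of a code change (assumed layout).
user_config = {"twitter": ["fxtwitter.com", "vxtwitter.com"]}
patterns = build_patterns(EXTRACTORS, user_config)
print(find_extractor(patterns, "https://vxtwitter.com/someuser/status/12345"))
```

The search stays linear; only the set of accepted hosts becomes configurable.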
+1 wish: please keep the log format the same or at least as nicely parseable as it is now.
Edit: forgot to say but I do like the general direction things are going, looking forward to hearing more esp. about the extractor rework.
A rough, incomplete overview of features and changes I want to implement in v2.0: (will be updated as time goes on)
Let me know if I should be more specific on a topic.
--download-archive rework
- use highly advanced SQL features like tables
- native continuation/cursor and update support
- skip: abort, but automated
--filter rework
- combine --range, --filter, etc into one unified option

Possible things that might also happen: