Open reyaz006 opened 2 years ago
The archive option is exactly that. After enabling it, you need to run through all your already-downloaded galleries once so that they get written to the archive.
I would also recommend using -A, --abort / "skip": "abort:5" when updating your collection if the source returns its newest files first, which is most likely the case here. This saves a bit of time by not having to go through all files again, most of which you have already downloaded.
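To illustrate why this saves time, here is a rough Python sketch of the idea behind stopping after N consecutive skips. It is not gallery-dl's implementation; the function `files_to_download`, the file IDs, and the archive set are all hypothetical and only model the behaviour described above.

```python
def files_to_download(remote_ids, archive, abort_after=5):
    """Walk remote files newest-first and stop after `abort_after` consecutive known files."""
    new_files = []
    consecutive_skips = 0
    for file_id in remote_ids:
        if file_id in archive:
            # Already downloaded: count it and give up once enough were seen in a row.
            consecutive_skips += 1
            if consecutive_skips >= abort_after:
                break
        else:
            # A new file resets the counter, so scattered old files do not end the run.
            consecutive_skips = 0
            new_files.append(file_id)
    return new_files


# Hypothetical data: the source lists post100 ... post1, newest first,
# and post1 ... post97 are already recorded in the archive.
remote = [f"post{n}" for n in range(100, 0, -1)]
archive = {f"post{n}" for n in range(1, 98)}

print(files_to_download(remote, archive))
```

With this made-up data only the three new posts are collected, and the loop stops five entries later instead of walking all 100.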
I'm going to see how archive works then.
But this brings another question. What if some (existing and downloaded) file is updated on server and I want to (a) get the updated file and/or (b) still keep the older file for review/archiving?
Depends on the website.
I assume the website provides file modification dates and sizes on URL requests. Or is it only implemented for specific portals?
What's the website?
The website is kemono party.
I know nothing about it, so I'll leave this to everyone else.
I'm going to see how archive works then. But this brings another question. What if some (existing and downloaded) file is updated on server and I want to (a) get the updated file and/or (b) still keep the older file for review/archiving?
I'm having this issue too. For now, I've put {_now} in my config. It adds a timestamp after kemono.party's file number, which makes it a lot easier to see which post {id} has been updated: simply check whether you have two files with the same number in the folder. Also, use a separate archive for kemono.party and change its archive format to use the hash: "archive-format": "{service}_{user}_{id}_{hash}".
"directory": [
    "{service} kemono.party {user[:100]}",
    "{id} [{date!s:.10}]"
],
"filename": "{num:>03} {_now!s:.16} {filename[:25]}.{extension}"
But this brings another question. What if some (existing and downloaded) file is updated on server and I want to (a) get the updated file and/or (b) still keep the older file for review/archiving?
Jumping in here to state that the filename field for Kemonoparty changed at some point, which I only recently became aware of and referenced in #2603.
My archive-format is "{service}_{user}_{id}_{num}", but I can see that {hash} might prove more useful.
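A small Python sketch of why {hash} could be more useful than {num} in an archive key. It uses plain `str.format`, not gallery-dl's own formatter, and the metadata values (service, user, id, hashes) are invented for illustration.

```python
NUM_FORMAT = "{service}_{user}_{id}_{num}"
HASH_FORMAT = "{service}_{user}_{id}_{hash}"


def archive_key(fmt, metadata):
    """Build an archive key from a format string and a metadata dict."""
    return fmt.format(**metadata)


# Hypothetical metadata for one file, before and after it was replaced on the server.
old = {"service": "patreon", "user": "12345", "id": "678", "num": 1, "hash": "aaa111"}
new = dict(old, hash="bbb222")  # same post and position, different content

# With {num}, both versions map to the same key, so the update would be skipped.
print(archive_key(NUM_FORMAT, old) == archive_key(NUM_FORMAT, new))

# With {hash}, the updated file gets a new key and would be downloaded again.
print(archive_key(HASH_FORMAT, old) == archive_key(HASH_FORMAT, new))
```

This only holds if the site actually changes the hash when a file is updated, which is an assumption here.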
What if some (existing and downloaded) file is updated on server and I want to (a) get the updated file and/or (b) still keep the older file for review/archiving?
gallery-dl is not really made with downloading a newer version of an already downloaded file in mind.
For kemono, you might be able to achieve something to that effect by using the timestamp from {edited} and/or the value from {hash} in filenames/archive-keys.
There is also a compare post processor to potentially enumerate different versions of the same file, but this method is really inefficient.
{date!s:.10}
That's a rather creative way of formatting a datetime value. Turns out this is actually more than twice as fast as the "proper" way ({date:%Y-%m-%d}). Faster still would be {date!s:[:10]}
Example:
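As a rough illustration (not the commenter's own example), the comparison can be tried in plain Python. This sketch uses `str.format` rather than gallery-dl's formatter, so `{date!s:[:10]}` is approximated with an ordinary string slice; the sample date and the helper `time_variants` are assumptions, and timings will vary by machine and may not match gallery-dl's.

```python
import timeit
from datetime import datetime

date = datetime(2022, 5, 17, 12, 34, 56)

# Three ways to get "YYYY-MM-DD" out of a datetime.
truncated = "{date!s:.10}".format(date=date)     # str() conversion, then precision 10
formatted = "{date:%Y-%m-%d}".format(date=date)  # strftime-style format spec
sliced = str(date)[:10]                          # plain-Python stand-in for {date!s:[:10]}

print(truncated, formatted, sliced)


def time_variants(number=100_000):
    """Return the seconds each variant takes for `number` calls."""
    return {
        "truncate": timeit.timeit(lambda: "{date!s:.10}".format(date=date), number=number),
        "strftime": timeit.timeit(lambda: "{date:%Y-%m-%d}".format(date=date), number=number),
        "slice": timeit.timeit(lambda: str(date)[:10], number=number),
    }


for name, seconds in time_variants().items():
    print(f"{name}: {seconds:.3f}s")
```

All three variants produce the same string for this date; only the measured times differ.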
I was thinking of some simple ignore list in a text file that could be put into the output folder, containing relative names of the files to not be downloaded on next runs. So I could manually set up a separate ignore list for each gallery.
Is it possible? If not, any chance to implement this?
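To make the request concrete, here is a standalone Python sketch of the idea. It is not an existing gallery-dl option; the file name `ignore.txt`, the `#` comment convention, and both helper functions are hypothetical.

```python
import tempfile
from pathlib import Path


def load_ignore_list(folder, list_name="ignore.txt"):
    """Read relative file names to skip from a text file inside the output folder."""
    path = Path(folder) / list_name
    if not path.is_file():
        return set()
    entries = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):  # skip blank lines and comments
            entries.add(line)
    return entries


def filter_ignored(relative_names, ignored):
    """Keep only the names that are not on the ignore list."""
    return [name for name in relative_names if name not in ignored]


# Demo with a temporary folder standing in for one gallery's output directory.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "ignore.txt").write_text(
        "# files I deleted on purpose\n002 page.png\n\n", encoding="utf-8"
    )
    ignored = load_ignore_list(folder)
    remaining = filter_ignored(["001 cover.png", "002 page.png"], ignored)

print(remaining)
```

Each gallery folder would carry its own list, so entries stay relative to that folder.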