Jules-WinnfieldX / CyberDropDownloader

Bulk Gallery Downloader for Cyberdrop.me and Other Sites
GNU General Public License v3.0

[BUG] CDL seems to mark everything downloaded before even downloading. #625

Closed: cewong2 closed this issue 9 months ago

cewong2 commented 9 months ago

I added far more links than I had disk space for. Since there is no pause button (which I assume is unfeasible in a Python script, as this isn't a GUI), I sent a break command because I was running out of space and needed to clean up. Before rerunning the command, I also deleted the links that had already completed to make the scraping faster (I thought that setting 'Ignore_Cache: false' would speed that up, but it performs the scrape every time).

At that point, CDL had processed maybe 1/5 of my download URL list. After sending the break, deleting the known completed links, and freeing up space, I ran CDL again and it skipped all the remaining downloads. I tried various combinations of 'ignore_history' and 'skip_download_mark_completed' in the config: if either is set to true, it just skips all the remaining links even though it never processed the actual download. So at least with 'skip_download_mark_completed', it is marking everything as downloaded even though it didn't actually download the file. It even ignored the ".part" file in the directory and still skipped it, so I can't resume my URL list. If I set both to false, it downloads everything again, skipping only existing files (which is the correct behavior from what I can tell).

TL;DR: Regardless of whether CDL has actually processed the download for a link, it seems to mark the file as downloaded if the link has been scraped before.

This was with host gofile.io

Jules-WinnfieldX commented 9 months ago

'skip_download_mark_completed' does exactly what its name says: it skips the download and marks the file as downloaded in the DB.

'ignore_history' will download everything you give the program regardless of whether it's been completed in the past or not.
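To make the two options concrete, here is a minimal sketch of the decision they describe. This is not CDL's actual code: `History`, `plan_download`, and the returned strings are hypothetical names made up for illustration, and the assumption that 'skip_download_mark_completed' is checked before 'ignore_history' is mine.

```python
from dataclasses import dataclass, field


@dataclass
class History:
    """Hypothetical stand-in for the download-history DB."""
    completed: set = field(default_factory=set)


def plan_download(url, history, *, ignore_history=False,
                  skip_download_mark_completed=False):
    """Decide what to do with one file URL (illustrative only)."""
    if skip_download_mark_completed:
        # Assumed precedence: nothing is fetched, but the URL is
        # recorded as completed in the history.
        history.completed.add(url)
        return "mark_only"
    if not ignore_history and url in history.completed:
        # The history says this was already completed, so skip it.
        return "skip"
    # Either the URL is new or the history is being ignored.
    return "download"


history = History()
print(plan_download("file-a", history))
print(plan_download("file-a", history, skip_download_mark_completed=True))
print(plan_download("file-a", history))
print(plan_download("file-a", history, ignore_history=True))
```

Under this reading, a single run with 'skip_download_mark_completed' set to true is enough to record the remaining links as completed, so later runs skip them unless 'ignore_history' is set to true.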

The cache (you mention ignore_cache) is only used for coomer/kemono and will speed up the program on subsequent runs of those links when it remains false.

I'd recommend you look through the wiki.