Open reyaz006 opened 6 years ago
Also noticed now that CPU usage gets really high when such problems start happening. It goes to 25%, which means it's using 2 of the 8 logical cores on my CPU (4 physical cores with hyper-threading).
Have you tried increasing the retry delay? (I've set mine to 60s.)
I am also now experiencing the following error:
2018-03-24 07:53:48,481 ERROR [ 14] - Error when downloading: http://pic04.nijie.info/nijie_picture/diff/main/79346_20161014001840_0.gif to ***\188550_p2_EDGEで落書きシリーズ_79346_20161014001840_0.gif.!nijie ==> The operation timed out
I still can't understand what it means. If I enter that url in my browser, it loads fine. Even if I reload it many times.
Also, everything is very slow. Logging in takes something like 30 or 60 seconds, and pulling all the bookmarked users takes a long time as well. Pulling all bookmarked images is as quick as it was before.
Well, it started working now, after 42 minutes, and I didn't do anything to cause the change, at least not that I'm aware of. I was just retrying while fiddling with the settings, re-attempting after each setting change. When it started working I hadn't changed any setting at all; I was just randomly re-attempting.
Well.. 32 images have been downloaded but it's stopped again..
Recently I've found out that there is an issue with how NijieDownloader operates when "Skip if already downloaded in DB" is enabled: it completely skips a post if it was already marked as downloaded. Meanwhile, it appears some artists upload new images into an existing post. E.g. some post contained 2 images, and a few weeks later it contains 12 images. If I already got those 2 images before and "Skip if already downloaded in DB" is enabled, the remaining 10 images will never be downloaded. I think this deserves a separate issue, but I decided to describe another issue here which seems more problematic.
So I decided to disable this option and started to re-process the whole batch list. Of course I've set it to only overwrite files if the file size is different, and to back up old files.
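To illustrate the skip problem described above, here is a minimal Python sketch of a per-image check instead of a per-post check. This is not NijieDownloader's actual code; the function name `images_to_download` and the file names are made up for illustration, and it assumes the database can list which image URLs of a post were already saved.

```python
def images_to_download(post_image_urls, downloaded_urls):
    """Return the images of a post that are not yet recorded as downloaded.

    Hypothetical helper: skipping is decided per image, so a post that
    gained new images later is not skipped as a whole.
    """
    seen = set(downloaded_urls)
    return [url for url in post_image_urls if url not in seen]


# Hypothetical post that grew from 2 images to 5 after the first download.
post = [f"img_{i}.jpg" for i in range(5)]
already_in_db = ["img_0.jpg", "img_1.jpg"]

print(images_to_download(post, already_in_db))
# ['img_2.jpg', 'img_3.jpg', 'img_4.jpg']
```

With a check like this, a post only counts as fully downloaded when the returned list is empty.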
More problems emerged:
Concurrent jobs. I always used 8, but it seems that won't work if I want to re-process the whole list. At some point all new requests start failing with the HTTP response "429 Too Many Requests". It would probably be great if NijieDownloader could detect such errors and wait instead of skipping those downloads. Anyway, I decreased concurrent jobs to 2 and it helped, even though it drastically increased the time required to finish the whole job.
At some point, I noticed that some member-jobs appear to generate too many errors and get stuck trying to download some image. I discovered the error
2018-03-24 07:53:48,481 ERROR [ 14] - Error when downloading: http://pic04.nijie.info/nijie_picture/diff/main/79346_20161014001840_0.gif to ***\188550_p2_EDGEで落書きシリーズ_79346_20161014001840_0.gif.!nijie ==> The operation timed out
I still can't understand what it means. If I enter that url in my browser, it loads fine. Even if I reload it many times.
I also noticed that at some points it takes too much time for NijieDownloader to move from processing one image to the next. After stopping and starting several times, it seems to just freeze at some images without retrying the download. I can only see something like "Saving to: ..." and it doesn't move past that. I can't reproduce it after restarting though. Maybe it starts happening only after some time or after certain events.
The Retry delay option may be bugged. I had it set to 5, but I could see from the log that it takes 60-65 seconds before the next attempt for an image is recorded. Could it be that 60 seconds is also how long it waits before marking the attempt as failed with the "The operation timed out" error?
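The arithmetic behind that guess can be sketched in a few lines of Python. This is only a model, not NijieDownloader's code: it assumes every attempt runs into a fixed timeout and that the retry delay starts counting after the timeout. The 60-second timeout is the guess from above, not a confirmed setting.

```python
def attempt_start_times(attempts, timeout_s, retry_delay_s):
    """Start time (in seconds) of each attempt when every attempt times out.

    Assumed model: attempt runs for timeout_s, fails, then the downloader
    waits retry_delay_s before starting the next attempt.
    """
    times = []
    t = 0.0
    for _ in range(attempts):
        times.append(t)
        t += timeout_s + retry_delay_s
    return times


starts = attempt_start_times(3, timeout_s=60, retry_delay_s=5)
print(starts)  # [0.0, 65.0, 130.0]
```

Under these assumptions the attempts are 65 seconds apart, which would match the 60-65 seconds seen in the log even if the Retry delay of 5 is applied correctly.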
My guess is that Nijie either sends a 429 error or sends nothing at all as a measure against frequent requests (even though I can't reproduce this in a browser). If this is true, we may need an option to set a delay between file requests, and such cases should be properly detected rather than skipped.
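A rough Python sketch of what such an option could look like, combining a minimum delay between requests with waiting on 429 instead of skipping. Everything here is hypothetical: `Throttle`, `fetch_with_backoff` and the `fetch` callable (assumed to return a status code, an optional wait time in seconds, and the body) are illustration names, and whether Nijie actually tells clients how long to wait is unknown.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_with_backoff(fetch, url, throttle, max_attempts=5,
                       base_delay_s=5.0, sleep=time.sleep):
    """Retry on 429 instead of skipping the download.

    fetch(url) is an assumed callable returning
    (status, retry_after_seconds_or_None, body).
    """
    for attempt in range(max_attempts):
        throttle.wait()
        status, retry_after, body = fetch(url)
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Use the server's hint if there is one, else back off exponentially.
            delay = retry_after if retry_after is not None else base_delay_s * 2 ** attempt
            sleep(delay)
    return 429, None


# Demo with a fake server: two 429 responses, then success.
responses = iter([(429, 2, None), (429, None, None), (200, None, b"data")])
slept = []
result = fetch_with_backoff(lambda url: next(responses),
                            "http://example.invalid/a.gif",
                            Throttle(0), sleep=slept.append)
print(result, slept)  # (200, b'data') [2, 10.0]
```

The sleep and clock functions are passed in only so the demo runs instantly; a real downloader would use the defaults and share one `Throttle` across all concurrent jobs.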