Open alidan opened 3 years ago
OK, the problem with downloaders jamming up has happened again. I cut the session, rebuilt it from scratch, and tried to load test it by adding every single thread on 4chan I had a passing interest in. Initially everything added without issue, but once the session weight crept up to around 1.5-2 million, hydrus needed constant restarts. I removed my old downloader and tried again in 453, but the issue persists even without it.
Version 454 still has downloaders jamming; however, instead of "pending" they now seem to jam at "initializing".
My current session weight is 4.4 million, and I have loaded it with downloaders as hard as I can without creating a need to remove everything they downloaded. A good chunk of this is load for load's sake, to test; another chunk is there because I don't like getting rid of watchers, since having them grouped by theme is too useful for parsing.
There are 1594 watchers in total; 972 are "active" (threads that may still be going or just haven't 404'd yet) and 599 are dead threads.
I think a good chunk of these were pending when I updated, seeing as I woke up, updated, and went back to sleep, so everything may have just failed due to it being hammered for requests. It may have gotten stuck elsewhere, not sure. I'm going to restart-cycle this until nothing is pending or downloading and see if it still gets stuck.
OK, fully rechecked. That was far less painful than it was with the prior version, where it could stop almost immediately. I have had a few downloader hangs, but now that everything is clear, this will be the fairest test.
5 hours since, none of the watchers have stopped checking; however, there are some things stuck on "initializing". Overall this is far better than it was before 454. I'm going to stress test it a bit in a few days, when I can load up a lot of threads, and see how it handles that.
OK, as of about 40 minutes ago, things were still moving; however, these two hangs have now caused watchers to start hanging on pending. Restarting and seeing when this next happens.
OK, after adding a few watchers, I think I may have just been incredibly lucky with how smooth things were. It still feels marginally easier to recover pending watchers, and they may not fall into a pending loop as easily; I'm unsure.
OK, now, after a few days of not getting a watcher hang or a download getting stuck, I added around 200-250 watchers as a stress test, and nothing hung. I'm not 100% sure what happened, because I was getting hangs before, but now it seems to be going fine.
Hey, I apologise for not responding to this thread earlier. I have used your report while working on parts of this problem for several weeks now. I hope tomorrow's release will have another improvement.
This 'getting stuck on "initialising..."' while also having a position of 'running' is an odd one. It seems that these jobs are being scheduled and are getting through bandwidth and login checks without any delays, but once they try to actually make a connection and start sending bytes, nothing happens. Could be a thread deadlock, could be some OS level network problem causing huge delays, could be something else.
Although your latest posts say things are better, I will keep working in and around this code. Some of these downloader scheduling problems are due to a legacy bottleneck. I do have a plan to completely remove this, and allow hundreds of downloaders to operate in parallel, but it will be a slightly larger rewrite of the core import job.
OK, like I said in one of my responses, I have absolutely no idea why. When I first moved to 454 I was still getting issues, but after I think 2 sticking points (about 12 restarts in total) I haven't had a single issue in about 19 days, and this is after dumping enough into downloaders to get ~50 GB of files. I have since stopped stressing it, but it seems the problem no longer exists. My only real thought was that, if this issue did not go away, a fallback of 'if x takes y long, assume dead and free up/add additional slots' might help, but it seems it's not needed anymore.
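For anyone who hits this again, the 'if x takes y long, assume dead and free up/add additional slots' fallback could look roughly like the sketch below. This is not how hydrus actually schedules downloaders; `SlotPool`, `stall_timeout`, and the method names are all made up for illustration. Each job holds a slot and reports progress, and any job that has been silent for longer than the timeout is presumed dead so its slot can be reused.

```python
import threading
import time


class SlotPool:
    """Hypothetical pool of download slots; NOT hydrus's real scheduler."""

    def __init__(self, max_slots, stall_timeout):
        self.max_slots = max_slots
        # Seconds without progress before a job is presumed dead (assumed parameter).
        self.stall_timeout = stall_timeout
        self._lock = threading.Lock()
        self._last_progress = {}  # job_id -> time of last reported progress

    def try_acquire(self, job_id, now=None):
        """Take a slot for job_id; returns False if all slots are held by live jobs."""
        now = time.monotonic() if now is None else now
        with self._lock:
            self._reap_locked(now)  # free slots held by stalled jobs first
            if len(self._last_progress) >= self.max_slots:
                return False
            self._last_progress[job_id] = now
            return True

    def report_progress(self, job_id, now=None):
        """Called by a job whenever it makes progress (e.g. bytes received)."""
        now = time.monotonic() if now is None else now
        with self._lock:
            if job_id in self._last_progress:
                self._last_progress[job_id] = now

    def release(self, job_id):
        """Called by a job that finished normally."""
        with self._lock:
            self._last_progress.pop(job_id, None)

    def reap_stalled(self, now=None):
        """Drop jobs with no progress for longer than stall_timeout; return their ids."""
        now = time.monotonic() if now is None else now
        with self._lock:
            return self._reap_locked(now)

    def _reap_locked(self, now):
        # Caller must hold self._lock.
        stalled = [job for job, seen in self._last_progress.items()
                   if now - seen > self.stall_timeout]
        for job in stalled:
            del self._last_progress[job]
        return stalled


# Usage with explicit timestamps so the behaviour is deterministic:
pool = SlotPool(max_slots=2, stall_timeout=60.0)
pool.try_acquire("watcher-a", now=0.0)       # True
pool.try_acquire("watcher-b", now=0.0)       # True
pool.try_acquire("watcher-c", now=10.0)      # False, both slots busy
pool.report_progress("watcher-b", now=50.0)  # watcher-b is still alive
pool.try_acquire("watcher-c", now=70.0)      # True, watcher-a was silent for 70s and got reaped
```

Note that this only frees the slot. It does not stop the stuck job itself, so if the underlying thread really is deadlocked it would still be sitting there, and a job that was only slow could resume after its slot has been handed to something else.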
Hydrus version
started on 447 with issues cropping up on 451 and 452
Operating system
Windows other (specify in comments)
Install method
Installer
Install and OS comments
Windows 7 64-bit, 64 GB of RAM
Bug description and reproduction
Log output