Open ivan opened 8 years ago
I don't even know if this is possible to implement nicely (i.e. without breaking any responses that are currently being downloaded) with the wpull hooks that exist now.
What about the wait_time hook? I believe that happens at the end, after the file has been added to the warc. Here is a basic sketch. (Forgive me for not knowing the proper terminology; I will use CG to refer to the event loop's equivalent of a thread.)
In the wait_time hook, a CG can check whether the pause file exists and, if so, acquire an appropriate lock, set concurrency to 1, and spin in a while loop that calls an appropriate sleep function for as long as the file exists. Other CGs would also check whether the file is there, but skip the locked section and return immediately. When the pause file is deleted, the CG in the locked section sets the concurrency to the value in the concurrency file and then releases the lock before returning the delay time.
This is better than
kill -STOP pid
because
1) it allows grab-site to keep receiving control messages, once we implement those
2) it doesn't require looking up the pid or using pgrep
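The pause-file check described above could be sketched roughly as follows. This is only an illustration of the locking logic, not wpull or grab-site code: the function name wait_while_paused, the set_concurrency callback, and the two file paths are all hypothetical, and threading.Lock stands in for whatever locking mechanism actually fits wpull's event loop. A blocking sleep like this would also stall an event loop, so a real hook would probably need a non-blocking equivalent.

```python
import os
import threading
import time

# Hypothetical module-level lock shared by every caller of the hook.
_pause_lock = threading.Lock()


def wait_while_paused(pause_path, concurrency_path, set_concurrency,
                      delay, poll_interval=1.0, sleep=time.sleep):
    """Block while pause_path exists, then restore concurrency.

    set_concurrency is an assumed callback that changes the number of
    concurrent fetchers; it is not a real wpull API. Returns the delay
    the hook would have returned anyway.
    """
    if not os.path.exists(pause_path):
        return delay  # not paused: nothing to do

    # Only one caller handles the pause; the others skip the locked section.
    if not _pause_lock.acquire(blocking=False):
        return delay

    try:
        set_concurrency(1)
        # Spin until someone deletes the pause file.
        while os.path.exists(pause_path):
            sleep(poll_interval)
        # Restore whatever value is stored in the concurrency file.
        with open(concurrency_path) as f:
            set_concurrency(int(f.read().strip()))
    finally:
        _pause_lock.release()
    return delay
```

Under these assumptions, pausing would be a matter of creating the pause file and resuming a matter of deleting it; no pid lookup is involved.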