ArchiveTeam / ArchiveBot

ArchiveBot, an IRC bot for archiving websites
http://www.archiveteam.org/index.php?title=ArchiveBot
MIT License

Implement a !pause/!resume command #162

Open wickedplayer494 opened 9 years ago

wickedplayer494 commented 9 years ago

Currently, to effectively pause a job, a person has to send both a !delay command and a !con command, setting the job to use, say, 1 worker with a 120s delay.

This could be made much more efficient by implementing a !pause command, complementing the !yahoo command at the other end of the intensity scale. If a site goes down during a grab and a user issues a !pause in IRC for that job, the job could automatically be set to use 1 worker with a 120-180s delay (or some other range, higher or lower).

When a site comes back up, a !resume command would also be handy. It would restore the settings that were in use before the !pause was issued, and is intended for cases where it's known for sure that ArchiveBot grabs aren't the reason the site is going down.
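
A minimal sketch of the save-and-restore behaviour described above, in Python. The `Job` class, its field names, and the default 120-180s pause range (expressed here in milliseconds) are assumptions made for illustration; none of this is ArchiveBot's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Job:
    """Hypothetical stand-in for a job's tunable settings."""
    concurrency: int
    delay_ms: Tuple[int, int]  # (min, max) delay between requests
    _saved: Optional[Tuple[int, Tuple[int, int]]] = None

    def pause(self, delay_ms: Tuple[int, int] = (120_000, 180_000)) -> None:
        # Remember the pre-pause settings only once, so that a repeated
        # !pause does not overwrite them with the throttled values.
        if self._saved is None:
            self._saved = (self.concurrency, self.delay_ms)
        self.concurrency = 1
        self.delay_ms = delay_ms

    def resume(self) -> None:
        # Restore whatever was in effect before the first !pause.
        if self._saved is not None:
            self.concurrency, self.delay_ms = self._saved
            self._saved = None
```

Calling `pause()` twice and then `resume()` once returns the job to its original concurrency and delay range.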

hannahwhy commented 9 years ago

I'd like to reserve !resume as the counterpart to !suspend.

It would be nice if !pause didn't use the delay mechanism, but instead halted that job and kept that job listening for updates, so that e.g. an !unpause would take effect immediately. This also means that the concurrency and delay specs are never touched.

!delay and !con often do go hand-in-hand. Furthermore, the delay range is often not used. We could add !pause and combine !delay and !con into a single command:

!pause IDENT
!unpause IDENT
!speed IDENT 3          # concurrency 3, leave delay untouched
!speed IDENT 3 250      # concurrency 3, delay 250 ms
!speed IDENT 3 250 375  # concurrency 3, delay 250-375 ms

!delay and !con will remain around for as long as it takes for people to switch over.
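
To make the proposed !speed syntax concrete, here is a rough Python sketch of how such a message might be parsed. The `SpeedChange` type, the `parse_speed` function, and the validation rules (concurrency of at least 1, non-negative delays) are assumptions for illustration only and do not reflect how the bot actually parses commands.

```python
from typing import NamedTuple, Optional


class SpeedChange(NamedTuple):
    """Hypothetical result of parsing a !speed message."""
    ident: str
    concurrency: int
    delay_min_ms: Optional[int]  # None means "leave delay untouched"
    delay_max_ms: Optional[int]


def parse_speed(message: str) -> SpeedChange:
    parts = message.split()
    if not 3 <= len(parts) <= 5 or parts[0] != "!speed":
        raise ValueError("usage: !speed IDENT CONCURRENCY [DELAY_MIN [DELAY_MAX]]")
    ident = parts[1]
    numbers = [int(p) for p in parts[2:]]  # raises ValueError on non-integers
    concurrency = numbers[0]
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if len(numbers) == 1:
        return SpeedChange(ident, concurrency, None, None)
    delay_min = numbers[1]
    # A single delay value is treated as a fixed delay (min == max).
    delay_max = numbers[2] if len(numbers) == 3 else delay_min
    if delay_min < 0 or delay_max < delay_min:
        raise ValueError("delay range must satisfy 0 <= min <= max")
    return SpeedChange(ident, concurrency, delay_min, delay_max)
```

With this shape, `!speed IDENT 3` leaves both delay fields as `None`, which a caller could interpret as "do not change the delay".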

Thoughts?

wickedplayer494 commented 9 years ago

A complete halt would be much better for sure. Also, good thinking on merging !delay and !con. I'd have no qualms, especially if !pause meant stop everything, but wait for any !unpause commands.

chfoo commented 9 years ago

Wpull supports setting the engine concurrency to 0. I don't remember why it wasn't allowed in the bot in the first place.

hannahwhy commented 9 years ago

https://github.com/ArchiveTeam/ArchiveBot/blob/master/pipeline/wpull_hooks.py#L163-L164 handles concurrency changes. With engine concurrency zero, would that hook ever fire again?

Concurrency zero is definitely one way to handle this, but I don't think ArchiveBot's hooks would be able to recover from that situation. AFAIK, we would have to move the settings management code to hook into the event loop in another way. (I'm not opposed to that.)

chfoo commented 9 years ago

I guess the Trollius event loop could be passed to the settings thread, which would then schedule the set_concurrent call on the Trollius event loop.
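
A rough sketch of that idea, written against the standard-library asyncio (whose API Trollius backports) and not against the real pipeline. `FakeEngine` and `settings_listener` are hypothetical stand-ins; only the `set_concurrent` name comes from the discussion, and the real engine's signature may differ.

```python
import asyncio
import threading


class FakeEngine:
    """Hypothetical stand-in for the wpull engine."""

    def __init__(self):
        self.concurrent = 1

    def set_concurrent(self, value):
        self.concurrent = value


def settings_listener(loop, engine, updates, done):
    # Runs in a separate thread. It never touches the engine directly;
    # it asks the event loop to make each call on the loop's own thread.
    for value in updates:
        loop.call_soon_threadsafe(engine.set_concurrent, value)
    loop.call_soon_threadsafe(done.set)


async def main(updates):
    loop = asyncio.get_running_loop()
    engine = FakeEngine()
    done = asyncio.Event()
    thread = threading.Thread(
        target=settings_listener, args=(loop, engine, updates, done)
    )
    thread.start()
    await done.wait()  # callbacks run in the order they were scheduled
    thread.join()
    return engine.concurrent


def run_demo(updates):
    return asyncio.run(main(updates))
```

Because the update is scheduled on the loop instead of from inside a per-item hook, a change from concurrency 0 back to a positive value would still be delivered even when no items are being processed.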

Or don't do that and just make it busy wait in the hook.
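
The busy-wait alternative might look roughly like the following Python sketch. `JobSettings`, `wait_while_paused`, and the polling interval are hypothetical names and values chosen for illustration; the real hook would poll whatever settings object the pipeline already maintains.

```python
import threading
import time


class JobSettings:
    """Hypothetical per-job settings holding only a pause flag."""

    def __init__(self):
        self._paused = threading.Event()

    def pause(self):
        self._paused.set()

    def unpause(self):
        self._paused.clear()

    def is_paused(self):
        return self._paused.is_set()


def wait_while_paused(settings, poll_interval=0.05, sleep=time.sleep):
    # Called from inside the hook: block here until the job is unpaused.
    # Concurrency and delay settings are never modified.
    polls = 0
    while settings.is_paused():
        sleep(poll_interval)
        polls += 1
    return polls
```

This keeps the existing hook structure, at the cost of blocking whichever thread runs the hook for as long as the job stays paused.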