MusicConnectionMachine / UnstructuredData

In this project we will be scanning unstructured online resources such as the common crawl data set
GNU General Public License v3.0

Send kill signal to workers #180

Closed sacdallago closed 7 years ago

sacdallago commented 7 years ago

When a batch has finished processing, the master should send a kill signal to the workers, OR we append && shutdown 0 to the worker command so that the machine shuts down once the script that batch-processes the WAR files has completed.

Please also check: does shutdown 0 actually undeploy the machine? You can check this in a very naïve way: call shutdown 0 on any worker, then look at what the Azure portal says for that machine. It should give you the option to start the machine, and there should be no warning of the type

felixschorer commented 7 years ago

Our app already terminates when all work has been done. All processes will exit gracefully.

We'll test auto shutdown tomorrow.

sacdallago commented 7 years ago

Yeah, sure, the app terminates, but not the VM on which it is running :) That is money.

lukasstreit commented 7 years ago

This does not work; I tried it out. We can instead use a script that runs something like "az vm undeploy ..." for each VM once the processing is finished. This would have to be done manually though.

We might also be able to write a script that regularly checks queue size and runs the command once the queue is empty.

felixschorer commented 7 years ago

> We might also be able to write a script that regularly checks queue size and runs the command once the queue is empty.

Workers don't remove items from the queue immediately though. They just set an item to invisible for 30 minutes and only remove it once they've finished processing it. So the question is: will the queue be shown as empty when there are still items which are set to invisible?
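
To make the question concrete, here is a toy in-memory model of a queue with a visibility timeout. It is only a sketch of the behaviour described above, not the API of any real queue service; the class and method names (VisibilityQueue, visible_count, total_count) are made up for illustration.

```python
import time


class VisibilityQueue:
    """Toy queue with a visibility timeout (hypothetical, not a real SDK)."""

    def __init__(self, visibility_timeout=30 * 60, clock=time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock
        self._items = {}  # item id -> [payload, invisible_until]
        self._next_id = 0

    def put(self, payload):
        # New items are visible immediately.
        self._items[self._next_id] = [payload, float("-inf")]
        self._next_id += 1

    def receive(self):
        """Hide the first visible item and return (id, payload), or None."""
        now = self._clock()
        for item_id, entry in self._items.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout
                return item_id, entry[0]
        return None

    def delete(self, item_id):
        """Called by a worker only after it has finished the item."""
        self._items.pop(item_id, None)

    def visible_count(self):
        now = self._clock()
        return sum(1 for _, until in self._items.values() if until <= now)

    def total_count(self):
        # Counts hidden (in-progress) items as well.
        return len(self._items)
```

If the reported queue size behaves like visible_count, an "empty" queue can still have work in flight; if it behaves like total_count, a size of zero means every item has been processed and deleted.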

lukasstreit commented 7 years ago

Good point. In my opinion we should stick with manual shutdown for now. progress-checker.js can be used to check how far along our processing is.

felixschorer commented 7 years ago

We should still add && shutdown 0 though. That way we only pay for the hardware and not the usage as well? It'll also indicate that a worker has finished.

pfent commented 7 years ago

uhm, might sound crazy, but: Couldn't a VM undeploy itself with az vm undeploy once it is finished? Might be worth just trying that on a separate VM for 💩 and 😆 …

lukasstreit commented 7 years ago

I had that thought as well, but to me the problem is that the az commands require you to manually follow a login link and enter a code to log in. I don't have any ideas yet about how to circumvent that :( Otherwise it would probably be possible.
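
One possible way around the interactive login, sketched under assumptions: the Azure CLI can log in non-interactively with a service principal (az login --service-principal), and the command that releases a VM's compute resources is az vm deallocate rather than the placeholder "az vm undeploy" used in this thread. The sketch assumes such a service principal already exists and is allowed to deallocate the VM; the function names and all argument values are hypothetical.

```python
import subprocess


def self_deallocate_commands(app_id, secret, tenant, resource_group, vm_name):
    """Build the two az invocations a worker could run once it is done.

    All arguments are placeholders supplied by the caller (assumption:
    a service principal with permission to deallocate the VM exists).
    """
    login = [
        "az", "login", "--service-principal",
        "--username", app_id,
        "--password", secret,
        "--tenant", tenant,
    ]
    deallocate = [
        "az", "vm", "deallocate",
        "--resource-group", resource_group,
        "--name", vm_name,
    ]
    return [login, deallocate]


def run_all(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)
```

Passing the secret on the command line exposes it to other processes on the machine, so this is only worth trying on a throwaway VM first.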

kordianbruck commented 7 years ago

As we changed to the queue, this should be less of a problem?

A worker will work until there's no work left. The case of prematurely idle workers will almost never happen. We just need a cleanup routine that undeploys everything once the queue is empty.

felixschorer commented 7 years ago

> [...] once the queue is empty.

Well, as I wrote earlier, we should test what "empty" means. Does it mean there are no items at all in the queue, or does it mean there are no visible items left? Workers set queue items to invisible for 30 minutes and only really remove them once they've offloaded everything to the storage / DB.

EDIT: We've just tested it. Invisible items still count towards the queue size.
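
Given that result, a queue size of zero means no item is still being processed, so a cleanup routine can key off it. A minimal sketch, assuming the caller supplies a get_queue_size function and a deallocate function (both hypothetical, as are the parameter names); it waits for several consecutive empty polls as a guard, on the assumption that the reported size may be approximate.

```python
import time


def cleanup_when_empty(get_queue_size, vm_names, deallocate,
                       poll_seconds=60, confirmations=3, sleep=time.sleep):
    """Poll the queue and deallocate all VMs once it stays empty.

    get_queue_size: callable returning the current queue size
                    (assumed to include invisible items).
    deallocate:     callable taking one VM name.
    """
    empty_polls = 0
    while empty_polls < confirmations:
        if get_queue_size() == 0:
            empty_polls += 1
        else:
            empty_polls = 0  # work reappeared, start counting again
        if empty_polls < confirmations:
            sleep(poll_seconds)
    for name in vm_names:
        deallocate(name)
    return len(vm_names)
```

The queue lookup and the deallocation are injected so the loop can be tried against fakes before it is pointed at real machines.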

felixschorer commented 7 years ago

Now that we have queue monitoring implemented, we can close this.