documentcloud / cloud-crowd

Parallel Processing for the Rest of Us
https://github.com/documentcloud/cloud-crowd/wiki
MIT License
851 stars, 92 forks

Ability to delete jobs #17

Closed. wnoronha closed this issue 14 years ago

wnoronha commented 14 years ago

It would be good to be able to delete a job (you can already do this with Job.find(n).delete).

Doing this should also delete the dependent work units, which does not currently happen.

wnoronha commented 14 years ago

Just noticed you already have the :dependent option set for the work units.

jashkenas commented 14 years ago

For automatic job deletion, check out the "Cleaning Up" section on this page:

http://wiki.github.com/documentcloud/cloud-crowd/the-job-api

wnoronha commented 14 years ago

The workflow for this is: a user starts a scraper job for the domain perl.org. Soon after the jobs/nodes have started processing the request, he realizes he meant to scrape pearl.org. We need a clean way to cancel the job.

jashkenas commented 14 years ago

It's not safe to just delete the job, because a large number of other computers might be right in the middle of processing it. This is something that needs to be handled by your application. I'd recommend having your process method check the status of the model in the database before doing the work (or periodically, while doing it), and aborting if the status of the model is "cancelled".
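The check-then-abort approach described above could look roughly like the following. This is only a minimal sketch in plain Ruby, not CloudCrowd's API: StatusStore is a hypothetical in-memory stand-in for the application's database, ScrapeAction is a hypothetical class that merely mimics the shape of a process method, and CHECK_EVERY is an arbitrary interval chosen for illustration.

```ruby
# Hypothetical stand-in for the application's database. A real application
# would look up the status of its own model here instead.
class StatusStore
  def initialize
    @statuses = {}
  end

  def set(id, status)
    @statuses[id] = status
  end

  def get(id)
    @statuses[id]
  end
end

# Hypothetical action: it does not inherit from any CloudCrowd class and
# only illustrates where the status checks would go.
class ScrapeAction
  CHECK_EVERY = 2 # pages between status checks (assumed value)

  attr_reader :scraped

  def initialize(store, request_id, pages)
    @store = store
    @request_id = request_id
    @pages = pages
    @scraped = []
  end

  def process
    @pages.each_with_index do |page, index|
      # Check once before starting, then again every CHECK_EVERY pages.
      return :aborted if (index % CHECK_EVERY).zero? && cancelled?
      @scraped << scrape(page)
    end
    :completed
  end

  private

  def cancelled?
    @store.get(@request_id) == 'cancelled'
  end

  def scrape(page)
    # Placeholder for the real work.
    "scraped #{page}"
  end
end

store = StatusStore.new
store.set(1, 'cancelled')
action = ScrapeAction.new(store, 1, %w[index about contact])
action.process # returns :aborted without scraping anything
```

Checking only every few pages keeps the number of database reads low, at the cost of doing a little wasted work after a cancellation. The application would mark the model as "cancelled" rather than deleting the job, so nodes that are mid-flight stop on their own at the next check.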