Wiredcraft / executioner

The executioner will .. execute commands via ansible

Avoid timeouts for long running tasks #2

Open JuhaS opened 11 years ago

JuhaS commented 11 years ago

Some tasks may run a long time, causing a timeout of the HTTP connection. One possible solution is to return a job-id after starting the Ansible task. The UI can then poll for the result of the task with that job-id. I am not sure how the data should be stored on the server while it waits for the AJAX poll (just have a job-id -> result map in the main API class?).

zbal commented 11 years ago

That's what I was referring to with the 7th point in #1 - sending a "task" as a POST and receiving an HTTP 201 with the id (processing) is usually considered best practice, and it lets you fetch the status of that specific task later on.
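For reference, here is a minimal sketch of that flow using plain twisted.web (the resource and dict names are illustrative, not taken from the executioner codebase, and the fake `run_long_task` stands in for the real Ansible call):

```python
import json
import uuid

from twisted.internet import reactor, task
from twisted.web import resource, server

JOBS = {}  # job_id -> {"status": ..., "result": ...}

def run_long_task():
    # Stand-in for the real ansible run; just resolves after a few seconds.
    return task.deferLater(reactor, 5, lambda: "task output")

class JobsResource(resource.Resource):
    isLeaf = True

    def render_POST(self, request):
        job_id = uuid.uuid4().hex
        JOBS[job_id] = {"status": "processing", "result": None}

        def on_done(result):
            JOBS[job_id] = {"status": "done", "result": result}

        run_long_task().addCallback(on_done)

        # 201 Created with the job id; the client polls GET /?id=<job_id>
        request.setResponseCode(201)
        request.setHeader(b"content-type", b"application/json")
        return json.dumps({"id": job_id}).encode()

    def render_GET(self, request):
        job_id = request.args.get(b"id", [b""])[0].decode()
        request.setHeader(b"content-type", b"application/json")
        return json.dumps(JOBS.get(job_id, {"status": "unknown"})).encode()

reactor.listenTCP(8080, server.Site(JobsResource()))
reactor.run()
```

A client would POST to start a job, get back the id with a 201, and keep polling `GET /?id=<job_id>` until the status flips to done.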

You can store the status however you want, either in a simple Redis instance or even in a file. It's up to you. What would you suggest, and why?

JuhaS commented 11 years ago

I think I would just store a Python dictionary (job-id -> result) in one of the handlers, probably RunCommandHandler. I'm still new to the async server world, but this seems logical for the following reasons:

1) Twisted has a single-threaded event loop, so callbacks from deferreds and request handlers (for example RunCommandHandler) can't run concurrently. This means we can safely assume there is no race condition when a handler reads the map on a poll request while a deferred callback updates the map with results.

2) There is only one instance of the server. Because of this, we don't have to share the data between nodes.

3) The server won't be restarted often, so a persistent data store (like the filesystem) is not needed. The Ansible tasks would get interrupted by a server restart anyway, because they run within the same process.

If any of these were not true, I would prefer to go with a proper in-memory cache/DB like Redis or Memcached, which can also share the same state between nodes.
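A minimal sketch of that in-process store, assuming the handler launches ansible as a child process via `twisted.internet.utils.getProcessOutput` (the `start_job` method, its parameters, and the `results` attribute are made-up names for illustration; only RunCommandHandler comes from the actual discussion):

```python
import os

from twisted.internet import utils

class RunCommandHandler(object):   # stands in for the real handler class
    def __init__(self):
        # job_id -> result, read directly by the poll request handler
        self.results = {}

    def start_job(self, job_id, host, command):
        # Run ansible as a child process; getProcessOutput returns a
        # Deferred that fires with the command's output.
        d = utils.getProcessOutput(
            "ansible", [host, "-a", command], env=os.environ, errortoo=True)

        def store(output):
            # Runs on the reactor thread, the same thread that serves poll
            # requests, so this write cannot race with a concurrent read.
            self.results[job_id] = output

        d.addCallback(store)
        return d
```

No locking is needed precisely because of point 1 above: the callback and the poll handler are always dispatched from the same reactor thread.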

It's probably a good idea to have a timestamp inside the map entries so that we can clear old job entries that the browser never comes back to fetch (for example, if the browser is closed after starting a long job). Like this: dict(<job-id>: dict("timestamp": <time>, "result": <result>)).
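As a sketch, the timestamped entries plus a periodic sweep could look like this (the one-hour MAX_AGE and the one-minute sweep interval are arbitrary choices, and the function names are illustrative):

```python
import time

from twisted.internet import task

MAX_AGE = 60 * 60   # drop results the browser never fetched after one hour
results = {}        # job_id -> {"timestamp": ..., "result": ...}

def store_result(job_id, result):
    results[job_id] = {"timestamp": time.time(), "result": result}

def clear_stale_entries():
    cutoff = time.time() - MAX_AGE
    for job_id in [k for k, v in results.items() if v["timestamp"] < cutoff]:
        del results[job_id]

# Sweep once a minute; assumes the Twisted reactor is (or will be) running.
task.LoopingCall(clear_stale_entries).start(60)
```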