Rewrite 5/14: I/O framework

This PR introduces the I/O framework for StorageNodes and StorageGroups, although none of the actual I/O is implemented in this PR. It also rewrites the daemon main loop to integrate the queue, workers, and new I/O framework, although, again, the existing I/O is left as-is for now and does not make use of the new asynchronous task structure.

StorageNode and StorageGroup I/O framework

This PR adds two new optional (i.e. null-allowing) string fields to both StorageGroup and StorageNode:

io_class which provides the name of the I/O class for a node/group. Node and group I/O classes are separate, and the list of valid names is different for each. If this field is left empty, it is treated as if it had the value Default.
io_config which provides a JSON blob of configuration data, if any, to be interpreted by the I/O class. The Default I/O classes ignore this field, but others (like Nearline) will require certain data.
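As a rough illustration of how these two fields are meant to be read, here is a minimal sketch. StorageRecord, effective_io_class and parsed_io_config are hypothetical names made up for this example (they are not part of alpenhorn), and treating an empty io_config as an empty dict is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageRecord:
    """Hypothetical stand-in for a StorageNode or StorageGroup row."""

    name: str
    io_class: Optional[str] = None  # nullable, as described above
    io_config: Optional[str] = None  # nullable JSON text


def effective_io_class(record: StorageRecord) -> str:
    """Return the I/O class name, treating an empty field as "Default"."""
    return record.io_class or "Default"


def parsed_io_config(record: StorageRecord) -> dict:
    """Decode io_config; an empty field yields an empty dict (assumption)."""
    if not record.io_config:
        return {}
    return json.loads(record.io_config)
```

The Default I/O classes would simply never look at the decoded config, while a class like Nearline would be expected to validate whatever keys it needs.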

The I/O classes can be found in the alpenhorn/io directory, although it's pretty barren for the moment:

The base I/O classes, which don't actually do any I/O, are in the file base.py. "base" cannot be used for the io_class property. (I mean, the field can be set to that value, but it won't work, because base.py and the classes BaseNodeIO and BaseGroupIO have different capitalisation.)
The Default I/O class, which is what you get if you leave io_class empty, is implemented in Default.py. But for now, it's just a trivial subclass of the classes in base.py to make I/O class instantiation work properly in the tests. The DefaultNodeIO class is for a "standard" POSIX filesystem. The DefaultGroupIO class is for a simple StorageGroup with only one node in it.

In general, there is no requirement that a group of a given type be populated with nodes of the same type. The DefaultGroupIO class, for instance, doesn't care at all what the io_class of its constituent node is. (There can be exceptions: the NearlineGroupIO class needs a NearlineNodeIO node in it.) Some I/O classes implement only one of Group or Node I/O.
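To show why the capitalisation matters, here is a sketch of a name-based lookup. It assumes the lookup reuses the io_class string verbatim for both the module name and the class-name prefix; that is an inference from the description above, not a copy of the real code. IO_MODULES and find_node_io_class are hypothetical, and the dict stands in for actually importing modules from alpenhorn/io.

```python
from types import SimpleNamespace


class BaseNodeIO:
    """Stand-in for the base node I/O class, which does no I/O."""


class DefaultNodeIO(BaseNodeIO):
    """Stand-in for the Default node I/O class."""


# Hypothetical stand-in for the modules in alpenhorn/io, keyed by file name.
IO_MODULES = {
    "base": SimpleNamespace(BaseNodeIO=BaseNodeIO),
    "Default": SimpleNamespace(DefaultNodeIO=DefaultNodeIO),
}


def find_node_io_class(io_class):
    """Look up "<io_class>NodeIO" in the module named "<io_class>"."""
    name = io_class or "Default"
    module = IO_MODULES.get(name)
    if module is None:
        raise ValueError(f"no I/O module named {name!r}")
    # The class name reuses the module name verbatim, so "base" looks for
    # "baseNodeIO", which does not exist (the class is "BaseNodeIO").
    cls = getattr(module, f"{name}NodeIO", None)
    if cls is None:
        raise ValueError(f"module {name!r} has no class {name}NodeIO")
    return cls
```

Under this assumption, an empty io_class and "Default" both resolve to DefaultNodeIO, while "base" finds the module but not a matching class.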

UpdateableNode and UpdateableGroup

The I/O class for a Storage object is managed by a new pair of classes defined in update.py: UpdateableNode and UpdateableGroup. These are primarily container classes for a Storage object and its I/O class instance. They provide access to both objects for the update code.

These Updateable instances persist through update loops when possible. A node or group going away will cause its Updateable instance to be destroyed. The Storage objects they contain are replaced every loop (because new Storage objects are created as a side effect of querying the database for the list of current nodes). I/O instances are only re-instantiated between loops if the Storage object's id, io_config or io_class changes.
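The persistence rule can be sketched roughly like this. UpdateableNodeSketch, reinit and make_io are hypothetical names chosen for the example, and the real UpdateableNode in update.py may be organised differently; the only behaviour taken from the description is "always replace the Storage object, rebuild the I/O instance only when id, io_class or io_config differ".

```python
from types import SimpleNamespace


class UpdateableNodeSketch:
    """Hypothetical container pairing a Storage object with its I/O instance."""

    def __init__(self, node, io_factory):
        self._io_factory = io_factory
        self.db = None
        self.io = None
        self.reinit(node)

    def reinit(self, node):
        """Swap in this loop's Storage object; rebuild I/O only if needed."""
        old = self.db
        changed = (
            old is None
            or old.id != node.id
            or old.io_class != node.io_class
            or old.io_config != node.io_config
        )
        self.db = node  # always replaced: each query returns fresh objects
        if changed:
            self.io = self._io_factory(node)
        return changed


# Usage: the same node queried twice keeps its I/O instance.
def make_io(node):
    return object()  # placeholder for a real I/O class instance


first = SimpleNamespace(id=1, io_class=None, io_config=None)
again = SimpleNamespace(id=1, io_class=None, io_config=None)
unode = UpdateableNodeSketch(first, make_io)
original_io = unode.io
unode.reinit(again)
print(unode.io is original_io)
```

Keeping the I/O instance alive across loops matters once I/O classes start holding state between updates.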

UpdateableNode subsumes the old top-level functions in update.py for updating nodes (so, e.g. update_node() is now UpdateableNode.update()).

Daemon start-up and main loop rewrite

This PR updates the daemon for the db changes in #144, and ties in the new functionality of the task queue (#145) and the worker pool (#146).

The daemon start-up now has the necessary db.init() call, and also now instantiates the queue and worker pool in preparation for multithreading (although there are no actual tasks yet being used in this update). The initial number of workers to start has been added to the service section of the alpenhorn config.

The main loop has been re-written for the new I/O framework. A single iteration of the main loop looks like this:

The I/O update:

query the database for the list of active nodes on the host
(re-)instantiate all the I/O classes for the active nodes
Loop over nodes:
- check if the node is actually active on the host (update_node_active). If it isn't, forget about this node.
- check the queue to determine whether this node is idle at the start of the loop, or if it still has on-going I/O from a previous iteration. (If it isn't idle, it's not going to be updated this time through the loop.)
- build up a list of active groups on the host (by finding unique values of node.group).
- Call the I/O layer before_update hook. This function may also cancel the node update for this time through the loop.
- If the node was idle at the start of the loop and the before_update hook didn't cancel the update, do all the normal node I/O updates (delete/check/&c.)
Loop over groups:
- As with nodes, call the I/O layer before_update hook. This function may also cancel the group update for this time through the loop.
- If all nodes in the group were idle at the start of the loop and the before_update hook didn't cancel the update, do all the normal group I/O updates
This is the end of the "regular I/O" update.
If nodes or groups were idle at the end of the regular update, run low-priority "idle updates" on them. This is going to be stuff like re-checking files on Nearline to see if the restored/released state saved in the DB is correct.
Finally, run the I/O layer after_update hook on groups and then nodes.
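The ordering of the steps above can be sketched as follows. This is not the real main loop: run_io_update and is_idle are hypothetical, the database query and the update_node_active check are left out, and the hook signature (before_update takes the idle flag and returns False to cancel) is an assumption made for the example.

```python
def run_io_update(nodes, groups, is_idle):
    """One I/O update pass, following the order of the steps listed above.

    nodes, groups: hypothetical Updateable-like objects providing
    before_update(), update(), idle_update() and after_update().
    is_idle: callable reporting whether the queue holds no I/O for an object.
    """
    # Record idleness at the start of the loop; busy nodes are not updated.
    node_idle = {node.name: is_idle(node) for node in nodes}

    for node in nodes:
        # Assumption: the hook returns False to cancel this node's update.
        proceed = node.before_update(node_idle[node.name])
        if node_idle[node.name] and proceed:
            node.update()

    for group in groups:
        # A group is updated only if all its nodes were idle at the start.
        group_idle = all(node_idle[n.name] for n in group.nodes)
        proceed = group.before_update(group_idle)
        if group_idle and proceed:
            group.update()

    # Low-priority work, only for things idle after the regular update.
    for item in list(nodes) + list(groups):
        if is_idle(item):
            item.idle_update()

    # The after_update hooks run on groups first, then on nodes.
    for item in list(groups) + list(nodes):
        item.after_update()
```

Note that the before_update hook is called even for busy nodes; only the regular update itself is skipped for them.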

Housekeeping in the main loop: After all the I/O updates are complete for one iteration of the loop, several housekeeping tasks are performed:

the worker pool scans for cleanly-exited workers (due to peewee.OperationalError) and restarts them
If there are no workers in the pool, some time is spent executing I/O tasks in the main loop (see serial_io()). This allows alpenhorn to fall back to the non-multithreaded update system whenever it happens to have no workers to execute I/O tasks (including when using a non-threadsafe DB, which will force the worker pool to be empty).
The sleep at the bottom of the loop will now wake up early if a global_abort is triggered by a worker thread.
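Two of these housekeeping steps can be sketched with the standard library alone. The names below are hypothetical stand-ins: global_abort is modelled as a threading.Event, and serial_io_sketch only illustrates the idea of time-limited task execution in the main thread. It does not reproduce the real serial_io(), whose signature is not shown here.

```python
import threading
import time
from collections import deque

# Hypothetical stand-in for the daemon's global abort flag.
global_abort = threading.Event()


def sleep_until_next_update(interval):
    """Sleep up to `interval` seconds; return True if woken by an abort."""
    return global_abort.wait(timeout=interval)


def serial_io_sketch(tasks, time_limit):
    """Run queued tasks in the calling thread until the time limit passes.

    tasks: a deque of zero-argument callables (assumption for this sketch).
    Returns the number of tasks executed.
    """
    deadline = time.monotonic() + time_limit
    done = 0
    while tasks and time.monotonic() < deadline:
        task = tasks.popleft()
        task()
        done += 1
    return done
```

Waiting on an event instead of calling time.sleep() is what lets a worker thread cut the sleep short: setting the event makes wait() return immediately.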

This rewrite of the main loop is essentially complete. A few, relatively small, changes will happen in a future PR when the auto_import module is updated. Additionally, most of the I/O update code in update.py will be moved into the I/O classes.

Broken by this PR

This PR comments out all the actual I/O updates in the main loop so that I can run tests on the framework. This will be resolved in the later DefaultIO PRs (#150 et seqq.)

An afterthought

I think this is sufficient to declare this PR closes #60
