Closed ketiltrout closed 1 year ago
I've made a fairly major update here to address a number of comments on this PR:
I've created new container objects (in `update.py`) called `UpdateableNode` and `UpdateableGroup`. They contain both the `StorageNode`/`StorageGroup` and the I/O instance and remove the circular references between the two.
These new objects take infrastructural elements that I had sprinkled around the code base for lack of better places to put them. Some of these had been put in the I/O base classes (in `io/base.py`) as base I/O class methods which were not meant to be re-implemented in subclasses, which was awkward.
The parts now unified in these new classes are:
- [...] (`_get_io_class`), which I had put in `storage.py`
- [...] `io/base.py`
- [...] `io/base.py`
They also provide a better way to do I/O instantiation (including removing the awkward `set_queue` method) and the node-vetting functionality of the group I/O classes.
These classes also now subsume the `update_node`/`update_group` functions (e.g. `update_node()` is now `UpdateableNode.update()`) and their sub-functions, though most of those will be moved into the new classes as they're updated.
The UpdateableNode/Group instances persist through main loop iterations, which also means the I/O classes do too, which I think is a more expected behaviour. The top-level containers have logic to re-initialise the I/O classes if necessary.
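As a rough illustration of that persistence, here is a minimal sketch of a container that keeps its I/O instance across loop iterations and only rebuilds it when the node's `id`, `io_class` or `io_config` changes. Everything here is an assumption made for illustration: `FakeStorageNode`, `UpdateableNodeSketch`, the `IO_CLASSES` registry, the `reinit()` method and its boolean return value are hypothetical names, not the code in `update.py`.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeStorageNode:
    """Hypothetical stand-in for a StorageNode row fetched from the database."""

    id: int
    name: str
    io_class: Optional[str] = None
    io_config: Optional[str] = None


class DefaultNodeIO:
    """Hypothetical I/O class; it just remembers what it was built with."""

    def __init__(self, node, config):
        self.node = node
        self.config = config


# Hypothetical registry mapping io_class names to node I/O classes.
IO_CLASSES = {"Default": DefaultNodeIO}


class UpdateableNodeSketch:
    """Container for a storage object and its I/O instance (illustrative only)."""

    def __init__(self, node):
        self.db = None
        self.io = None
        self.reinit(node)

    def reinit(self, node):
        """Swap in the freshly queried node; return True if the I/O was rebuilt."""
        old = self.db
        rebuild = (
            self.io is None
            or old.id != node.id
            or old.io_class != node.io_class
            or old.io_config != node.io_config
        )
        if rebuild:
            # An empty io_class is treated as "Default".
            io_class = IO_CLASSES[node.io_class or "Default"]
            config = json.loads(node.io_config) if node.io_config else {}
            self.io = io_class(node, config)
        else:
            # Assumption: the surviving I/O instance is pointed at the new row.
            self.io.node = node
        self.db = node
        return rebuild
```

In this sketch, the main loop would call `reinit()` once per iteration with the node object returned by that iteration's database query, so the storage object is always fresh while the I/O instance survives unless its identifying fields change.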
On the whole, I think this is a good improvement to both the legibility and the operation of the code.
This PR introduces the I/O framework for `StorageNode`s and `StorageGroup`s, although none of the actual I/O is implemented in this PR. It also rewrites the daemon main loop to integrate the queue, workers, and new I/O framework, although, again, the existing I/O is left as-is for now and does not make use of the new asynchronous task structure.
StorageNode and StorageGroup I/O framework
This PR adds two new optional (i.e. null-allowing) string fields to both `StorageGroup` and `StorageNode`:
- `io_class`, which provides the name of the I/O class for a node/group. Node and group I/O classes are separate, and the list of valid names is different for each. If this field is left empty, it is treated as if it had the value `Default`.
- `io_config`, which provides a JSON blob of configuration data, if any, to be interpreted by the I/O class. The Default I/O classes ignore this field, but others (like Nearline) will require certain data.
The I/O classes can be found in the `alpenhorn/io` directory, although it's pretty barren for the moment:
- [...] `base.py`. `"base"` cannot be used for the `io_class` property. (I mean, the field can be set to that value, but it won't work, because `base.py` and the classes `BaseNodeIO` and `BaseGroupIO` have different capitalisation.)
- [...] `io_class` empty, is implemented in `Default.py`. But for now, it's just a re-class of the classes in `base.py` to make I/O class instantiation work properly in the tests. The `DefaultNodeIO` class is for a "standard" POSIX filesystem. The `DefaultGroupIO` class is for a simple StorageGroup with only one node in it.
In general, there is no requirement that a group of a given type be populated with nodes of the same type. The `DefaultGroupIO` class, for instance, doesn't care at all what the `io_class` of its constituent node is. (There can be exceptions: the `NearlineGroupIO` class needs a `NearlineNodeIO` node in it.) Some I/O classes implement only one of Group or Node I/O.
UpdateableNode and UpdateableGroup
The I/O class for a Storage object is managed by a new pair of classes defined in `update.py`: `UpdateableNode` and `UpdateableGroup`. These are primarily container classes for a Storage object and its I/O class instance. They provide access to both objects for the update code.
These Updateable instances persist through update loops when possible. A node or group going away will cause the Updateable instance to be destroyed. The Storage objects they contain are replaced every loop (because new Storage objects are created as a side effect of querying the database for the list of current nodes). I/O instances are only re-instantiated between loops if the Storage object's `id`, `io_config` or `io_class` changes.
`UpdateableNode` subsumes the old top-level functions in `update.py` for updating nodes (so, e.g. `update_node()` is now `UpdateableNode.update()`).
Daemon start-up and main loop rewrite
This PR updates the daemon for the db changes in #144, and ties in the new functionality of the task queue (#145) and the worker pool (#146).
The daemon start-up now has the necessary `db.init()` call, and also now instantiates the queue and worker pool in preparation for multithreading (although there are no actual tasks yet being used in this update). The initial number of workers to start has been added to the `service` section of the alpenhorn config.
The main loop has been re-written for the new I/O framework. A single iteration of the main loop looks like this:
The I/O update:
- [...] (`update_node_active`). If it isn't, forget about this node.
- [...] (`node.group`).
- [...] `before_update` hook. This function may also cancel the node update for this time through the loop.
- If the `before_update` hook didn't cancel the update, do all the normal node I/O updates (delete/check/&c.)
- [...] `before_update` hook. This function may also cancel the group update for this time through the loop.
- If the `before_update` hook didn't cancel the update, do all the normal group I/O updates.
- [...] `after_update` hook on groups and then nodes.
Housekeeping in the main loop: After all the I/O updates are complete for one iteration of the loop, several housekeeping tasks are performed each update loop:
- [...] (`peewee.OperationalError`) and restarts them.
- [...] (`serial_io()`). This allows alpenhorn to fall back to the non-multithreaded update system whenever it happens to have no workers to execute I/O tasks (including when using a non-threadsafe DB, which will force the worker pool to be empty).
- [...] `global_abort` is triggered by a worker thread.
This rewrite of the main loop is essentially complete. A few, relatively small, changes will happen in a future PR when the `auto_import` module is updated. Additionally, most of the I/O update code in `update.py` will be moved into the I/O classes.
Broken by this PR
This PR comments out all the actual I/O updates in the main loop so that I can run tests on the framework. This will be resolved in the later DefaultIO PRs (#150 et seqq.)
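To make the hook ordering of one main-loop iteration concrete while the real I/O is stubbed out, here is a minimal sketch. It rests on several assumptions that are not confirmed by this PR: `SketchIO` and `update_loop_once` are hypothetical names, `before_update()` is assumed to signal cancellation by returning `False`, and `after_update()` is assumed to run for every group and node, including those whose update was cancelled.

```python
class SketchIO:
    """Hypothetical I/O object exposing the three hooks named in this PR."""

    def __init__(self, name, events, allow_update=True):
        self.name = name
        self.events = events  # shared list recording (name, step) in call order
        self.allow_update = allow_update

    def before_update(self):
        self.events.append((self.name, "before_update"))
        # Assumption: returning False cancels the update for this iteration.
        return self.allow_update

    def update(self):
        # Stub: the actual I/O updates (delete/check/&c.) would go here.
        self.events.append((self.name, "update"))

    def after_update(self):
        self.events.append((self.name, "after_update"))


def update_loop_once(nodes, groups):
    """Run the I/O-update part of one main-loop iteration (illustrative only)."""
    # Nodes first: the hook may cancel the node update.
    for node in nodes:
        if node.before_update():
            node.update()

    # Then groups, with the same cancellation rule.
    for group in groups:
        if group.before_update():
            group.update()

    # Finally the after_update hooks: groups first, then nodes.
    for group in groups:
        group.after_update()
    for node in nodes:
        node.after_update()


events = []
nodes = [SketchIO("node1", events), SketchIO("node2", events, allow_update=False)]
groups = [SketchIO("group1", events)]
update_loop_once(nodes, groups)
for name, step in events:
    print(f"{name}: {step}")
```

Recording the calls in a shared list keeps the ordering testable without any real storage behind it, which is roughly the situation this PR leaves the framework in.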
An afterthought
I think this is sufficient to declare this PR closes #60