radiocosmology / alpenhorn

Alpenhorn is a service for managing an archive of scientific data.
MIT License

Rewrite 5/14: I/O framework #148

Closed ketiltrout closed 1 year ago

ketiltrout commented 1 year ago

This PR introduces the I/O framework for StorageNodes and StorageGroups, but does not implement any of the actual I/O. It also rewrites the daemon main loop to integrate the queue, workers, and new I/O framework. Again, the existing I/O is left as-is for now and does not make use of the new asynchronous task structure.

StorageNode and StorageGroup I/O framework

This PR adds two new optional (i.e. null-allowing) string fields to both StorageGroup and StorageNode:

The I/O classes can be found in the alpenhorn/io directory, although it's pretty barren for the moment:

In general, there is no requirement that a group of a given type be populated with nodes of the same type. The DefaultGroupIO class, for instance, doesn't care at all what the io_class of its constituent nodes is. (There can be exceptions: the NearlineGroupIO class needs a NearlineNodeIO node in it.) Some I/O classes implement only one of Group or Node I/O.

UpdateableNode and UpdateableGroup

The I/O class for a Storage object is managed by a new pair of classes defined in update.py: UpdateableNode and UpdateableGroup. These are primarily container classes for a Storage object and its I/O class instance. They provide access to both objects for the update code.

These Updateable instances persist through update loops when possible. A node or group going away will cause its Updateable instance to be destroyed. The Storage objects they contain are replaced every loop (because new Storage objects are created as a side effect of querying the database for the list of current nodes). I/O instances are only re-instantiated between loops if the Storage object's id, io_config or io_class changes.
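To illustrate the persistence rule described above, here is a minimal sketch. It is not the code in update.py: FakeNode, FakeNodeIO, UpdateableNodeSketch, reinit() and refresh() are all hypothetical names, and keying the containers by node name is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in for a StorageNode freshly read from the database."""
    id: int
    name: str
    io_class: Optional[str] = None
    io_config: Optional[str] = None


class FakeNodeIO:
    """Hypothetical stand-in for a node I/O instance; it holds no node reference."""

    def __init__(self, io_config):
        self.io_config = io_config


class UpdateableNodeSketch:
    """Container for a Storage object and its I/O instance."""

    def __init__(self, node):
        self.db = node
        self.io = FakeNodeIO(node.io_config)

    def reinit(self, node):
        """Swap in this loop's Storage object; rebuild the I/O only if needed."""
        old = self.db
        changed = (
            old.id != node.id
            or old.io_class != node.io_class
            or old.io_config != node.io_config
        )
        self.db = node  # the Storage object is replaced every loop
        if changed:
            self.io = FakeNodeIO(node.io_config)
        return changed


def refresh(updateables, current_nodes):
    """Reuse containers for nodes still present; drop the ones that went away."""
    fresh = {}
    for node in current_nodes:
        existing = updateables.get(node.name)
        if existing is None:
            fresh[node.name] = UpdateableNodeSketch(node)
        else:
            existing.reinit(node)
            fresh[node.name] = existing
    return fresh
```

Calling refresh() once per loop with the freshly queried nodes keeps the same I/O instance for as long as id, io_class and io_config are unchanged, and any node missing from the query result simply drops out of the returned dict.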

UpdateableNode subsumes the old top-level functions in update.py for updating nodes (so, e.g., update_node() is now UpdateableNode.update()).

Daemon start-up and main loop rewrite

This PR updates the daemon for the db changes in #144, and ties in the new functionality of the task queue (#145) and the worker pool (#146).

The daemon start-up now has the necessary db.init() call, and also now instantiates the queue and worker pool in preparation for multithreading (although this update does not yet use any actual tasks). The initial number of workers to start has been added to the service section of the alpenhorn config.
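As a rough picture of the start-up order described above, the sketch below builds a queue and a small worker pool from a config dict. It uses the standard library's queue.Queue and threading.Thread as stand-ins for the task queue (#145) and worker pool (#146), whose real interfaces are not shown here. The config key num_workers and the function names are assumptions for illustration only.

```python
import queue
import threading


def worker(tasks, results):
    """Run tasks from the queue until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: this worker should stop
            tasks.task_done()
            break
        results.append(task())
        tasks.task_done()


def start_pool(config):
    """Create the queue and start the configured number of workers."""
    # "num_workers" is a hypothetical key in the "service" section
    count = config.get("service", {}).get("num_workers", 0)
    tasks = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(tasks, results), daemon=True)
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    return tasks, threads, results


def stop_pool(tasks, threads):
    """Send one sentinel per worker, then wait for all of them to exit."""
    for _ in threads:
        tasks.put(None)
    for thread in threads:
        thread.join()
```

With zero workers configured, the queue is still created but nothing consumes it, which roughly mirrors the state of this PR: the infrastructure exists, but no tasks are submitted yet.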

The main loop has been re-written for the new I/O framework. A single iteration of the main loop looks like this:

The I/O update:

Housekeeping in the main loop: After all the I/O updates are complete, several housekeeping tasks are performed at the end of each loop iteration:

This rewrite of the main loop is essentially complete. A few, relatively small, changes will happen in a future PR when the auto_import module is updated. Additionally, most of the I/O update code in update.py will be moved into the I/O classes.

Broken by this PR

This PR comments out all the actual I/O updates in the main loop so that I can run tests on the framework. This will be resolved in the later DefaultIO PRs (#150 et seqq.).

An afterthought

I think this is sufficient to declare this PR closes #60

ketiltrout commented 1 year ago

I've made a fairly major update here to address a number of comments on this PR:

I've created new container objects (in update.py) called UpdateableNode and UpdateableGroup. They contain both the StorageNode/StorageGroup and the I/O instance, and remove the circular references between the two.

These new objects take over infrastructural elements that I had sprinkled around the code base for lack of better places to put them. Some of these had been put in the I/O base classes (in io/base.py) as base I/O class methods which were not meant to be re-implemented in subclasses, which was awkward.

The parts now unified in these new classes are:

They also provide a better way to do I/O instantiation (including removing the awkward set_queue method, and node-vetting functionality of the group I/O classes).

These classes also now subsume the update_node/update_group functions (e.g. update_node() is now UpdateableNode.update()) and their sub-functions, though most of those will be moved into the new classes as they're updated.

The UpdateableNode/Group instances persist through main loop iterations, which also means the I/O classes do too, which I think is a more expected behaviour. The top-level containers have logic to re-initialise the I/O classes if necessary.

On the whole, I think this is a good improvement to both the legibility and the operation of the code.