radiocosmology / alpenhorn

Alpenhorn is a service for managing an archive of scientific data.
MIT License

Rewrite 1/10: preliminaries and CHIMEDB integration #144

Closed ketiltrout closed 1 year ago

ketiltrout commented 1 year ago

Overview

This PR starts out by doing some maintenance, and then updates db.py and extensions.py to integrate with the new chimedb.core.alpenhorn module I've introduced over in https://github.com/chime-experiment/chimedb/pull/34.

Housekeeping

The housekeeping work done to prime this re-write is as follows:

Changes to what a database extension provides

The essential change here is that the register_extension function of a database extension no longer returns a function that initialises peewee.Database. Instead, it returns a dict of "capabilities", which includes a "connect" key that provides the peewee.Database-making function.

The capability dict also contains a close function to close the database (probably unnecessary) and a boolean indicating whether the database extension is threadsafe or not.

If no database extension module is provided, or one or more of these keys is missing from the capability dict, then alpenhorn's db.py provides fallback implementations of all of these. The fallback is not threadsafe.
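As a rough illustration of the capability dict and its fallbacks, here is a minimal sketch. Only the "connect" key is named above; the "close" and "threadsafe" key names, the resolve_capabilities helper, and the use of sqlite3 as a stand-in for a peewee.Database are all assumptions made for this example, and the per-key merge is one reading of the fallback behaviour rather than a copy of what db.py does.

```python
import sqlite3

# Fallback capabilities used when no database extension supplies them.
# Key names other than "connect" are hypothetical.
FALLBACK_CAPABILITIES = {
    # Stand-in for a function that creates a peewee.Database
    "connect": lambda: sqlite3.connect(":memory:"),
    # Closes a database object returned by "connect"
    "close": lambda database: database.close(),
    # The fallback is not threadsafe
    "threadsafe": False,
}


def resolve_capabilities(extension_caps=None):
    """Fill in any capability the extension did not provide.

    `extension_caps` is the dict returned by a database extension's
    register_extension(), or None if no database extension was loaded.
    """
    caps = dict(FALLBACK_CAPABILITIES)
    if extension_caps:
        caps.update(extension_caps)
    return caps


def register_extension():
    """What a hypothetical threadsafe database extension might return."""
    return {
        "connect": lambda: sqlite3.connect(":memory:"),
        "threadsafe": True,
    }
```

In this sketch an extension that omits "close" still ends up with a usable close function, and calling resolve_capabilities() with no argument gives the non-threadsafe fallback set.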

(Implementation note: I originally had everything that chimedb.core could possibly provide in this capability dict, but as the alpenhorn code matured, I came to the conclusion that more and more of it was more trouble than it was worth, and I've now pared it down to this minimal set.)

The function extensions.connect_database_extension has largely been replaced by db.init, which sets up the db module based on whatever database extension was found (if any). The extensions module also now complains if more than one database extension is specified in the config (since only one of them could possibly be used).
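The "more than one database extension" check could look something like the sketch below. The function name select_database_extension, the shape of its argument, and the idea of recognising a database extension by the presence of a "connect" key are assumptions for illustration; the real extensions module may detect and report this differently.

```python
def select_database_extension(extensions):
    """Return the capability dict of the single database extension, or None.

    `extensions` maps an extension name to the dict its register_extension()
    returned. Treating "has a 'connect' key" as the marker of a database
    extension is an assumption of this sketch.
    """
    db_exts = {
        name: caps for name, caps in extensions.items() if "connect" in caps
    }

    # Only one database extension could ever be used, so refuse to guess.
    if len(db_exts) > 1:
        raise ValueError(
            "more than one database extension specified: "
            + ", ".join(sorted(db_exts))
        )

    # Either the one database extension's capabilities, or None
    return next(iter(db_exts.values()), None)
```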

Operationally, the changes here mean that alpenhorn start-up now needs to have a

db.init()

call after extensions are loaded but before the first db.connect() call to set up the db module. Additionally, each thread that needs to access the database also needs to call db.connect() before doing that.
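The ordering described here can be sketched with a stand-in for the db module. FakeDB, worker and start_up are hypothetical names invented for this example; only db.init() and db.connect() come from the description above, and the real module's internals are not shown.

```python
import threading


class FakeDB:
    """Stand-in for alpenhorn's db module (not the real implementation)."""

    def __init__(self):
        self._initialised = False
        self._local = threading.local()  # per-thread connection state

    def init(self):
        # In alpenhorn this would set up the module from the loaded extension
        self._initialised = True

    def connect(self):
        if not self._initialised:
            raise RuntimeError("db.init() must be called before db.connect()")
        self._local.connected = True

    def is_connected(self):
        return getattr(self._local, "connected", False)


db = FakeDB()


def worker(results, index):
    db.connect()  # every thread connects for itself before using the database
    results[index] = db.is_connected()


def start_up(n_workers=2):
    # 1. extensions would be loaded here
    db.init()     # 2. set up the db module, once
    db.connect()  # 3. first connection, in the main thread
    results = [False] * n_workers
    threads = [
        threading.Thread(target=worker, args=(results, i))
        for i in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

A worker that skipped its own db.connect() call would report no connection in this sketch, because the connection state is kept per thread.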

Peewee-3 changes

I have ported the fixes made in chimedb.core to both the RetryOperationalError mixin and the EnumField class so that they work in peewee-3. See https://github.com/chime-experiment/chimedb/pull/34 for details. In practice, when using the CHIME database extension, we don't use the alpenhorn-provided RetryOperationalError. We do use the EnumField defined here, but MySQL has a native Enum type, so the fix is unnecessary in that case.

ketiltrout commented 1 year ago

I realised I can't make stacked PRs if they're off in some other fork, so I've remade #143 locally.

ketiltrout commented 1 year ago

Mostly doc updates and type hinting. I've updated the minimum Python version to 3.10 for type hinting.

I also found some more unit tests in tests/test_extensions.py that were supposed to be in this PR but got overlooked the first time round.

Re-blackened for the 2023 version of black.