radiocosmology / alpenhorn

Alpenhorn is a service for managing an archive of scientific data.
MIT License

Rewrite 4/10: Table model updates #147

Closed ketiltrout closed 1 year ago

ketiltrout commented 1 year ago

This PR updates the peewee table models for the data index (StorageGroup, StorageNode, ArchiveAcq, ArchiveFile, ArchiveFileCopy, ArchiveFileCopyRequest). It does not deal with AcqType or FileType, which will be handled in a later PR.

This PR updates the field lists for these tables (adding new fields and removing unused fields). All field updates are backwards compatible with alpenhorn-1. It also adds a few convenience methods for various common database queries related to the tables. The methods are mostly database queries that were previously performed explicitly within the I/O code. Moving these queries into methods here reduces the complexity of the I/O code itself.

StorageGroup and StorageNode will have a subsequent update to add the new I/O framework, but the other table models (ArchiveAcq, ArchiveFile, ArchiveFileCopy, ArchiveFileCopyRequest) have no further changes pending.

StorageGroup

Fairly light changes:

StorageNode

A more extensive update.

StorageNode fields

ArchiveAcq

Only change:

ArchiveFile

Field changes:

Properties and methods:

ArchiveFileCopy

Fields:

Properties and Methods:

ArchiveFileCopyRequest

Field changes:

Other changes

Because it's used in the StorageNode.local property, this is a convenient place to update the hostname-finding logic. The function alpenhorn.util.get_short_hostname is renamed to alpenhorn.util.get_hostname and now supports specifying the hostname explicitly in the alpenhorn YAML config. This gives us the flexibility to put logical values in StorageNode.host (like, say, "scinet") instead of being forced to use the actual name of the host we're running on (like, say, "nia-dm1").
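For illustration, the lookup order described above could be sketched roughly as follows. This is not the actual alpenhorn.util.get_hostname: the function here takes an already-parsed config dict, and the `base` / `hostname` key names are assumptions made for the example only.

```python
import socket


def get_hostname(config=None):
    """Return the configured hostname, falling back to the system's short name.

    `config` is a hypothetical, already-parsed YAML config such as
    {"base": {"hostname": "scinet"}}; the key names are assumptions.
    """
    configured = ((config or {}).get("base") or {}).get("hostname")
    if configured:
        # An explicit value in the config always wins
        return str(configured)
    # Fallback: the first label of whatever name the OS reports
    return socket.gethostname().split(".")[0]
```

With a sketch like this, `get_hostname({"base": {"hostname": "scinet"}})` returns the logical name, while an empty config falls back to the short name of the machine.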

The unit tests for (at least the updated parts of) alpenhorn.acquisition, alpenhorn.archive, and alpenhorn.storage have been restored/updated after being disabled in #144. (They were also renamed to drop _model from the name of the test files, because I like testing alpenhorn/<name>.py in tests/test_<name>.py.)

A large set of DB-data-producing fixtures has been added to conftest.py. These will be used extensively by the I/O framework unit tests.

ketiltrout commented 1 year ago

I made ArchiveFile.archive_count and StorageNode.under_min into properties. I've added verbs to the other parameter-free methods, but I'm willing to revert their names and make them properties, too, if that seems better.
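To make the property-versus-method distinction concrete, here is a minimal sketch of what a property like under_min might look like. It uses a plain stand-in class rather than the real peewee model, and the `avail_gb` / `min_avail_gb` attribute names and the handling of an unknown (None) value are assumptions for the example.

```python
class NodeSketch:
    """Stand-in for a StorageNode row; not the real peewee model."""

    def __init__(self, avail_gb, min_avail_gb):
        self.avail_gb = avail_gb          # last recorded free space (may be None)
        self.min_avail_gb = min_avail_gb  # free space the node should keep

    @property
    def under_min(self):
        """True if the recorded free space is below the configured minimum."""
        if self.avail_gb is None:
            # Assumption: unknown free space is not treated as "under minimum"
            return False
        return self.avail_gb < self.min_avail_gb
```

Callers then write `node.under_min` with no parentheses, which reads like a field access even though a comparison runs each time.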

Also, I'm now realising the contents of this PR would probably have been better introduced when they were used in the I/O rewrite, but it's unlikely to be worth changing now.

ketiltrout commented 1 year ago

I pushed the timestamp update into the schema. Downside is it only works in MySQL. Upside is we don't have to re-implement ArchiveFileCopy.update, I guess.