Rewrite 6/14: The "import-detect" Extension and removal of Type and Info classes

This PR is concerned with implementing the "file import detection" framework for the daemon. I think this is the last of the "structural" PRs. Subsequent PRs in this rewrite will deal primarily with changes to I/O code.

Motivation

The ultimate goal of this PR is to produce the infrastructure needed by the CHIME alpenhorn extensions over in alpenhorn-chime.

This PR does the following:

removes AcqType, FileType and the entire Info class framework (all moved to alpenhorn-chime).
replaces the type/info-based detection system with a simple function-call-plus-callback scheme provided by third-party extensions (like alpenhorn-chime).
moves the old example type extension alpenhorn/generic.py to examples/pattern_importer.py, and updates it to work with this new system.

Removal of the Info framework

I've removed all references to info classes from alpenhorn. They were an integral part of alpenhorn-1, but in alpenhorn-2 they served only two purposes:

they had a detect function which was passed an acq or file name and returned True or False to indicate a file which needed to be imported by alpenhorn
because they were peewee models, when importing a file, alpenhorn would also call a new class method to generate a new record in these tables.

The first of these features has been replaced by a new "import-detect" extension type which provides a simple function which will perform the detection step of the import. See the "The 'import-detect' Extension" section below.

The second of these features has been replaced by an optional post-import hook, which removes the awkwardness of requiring alpenhorn to add rows to tables it knows nothing about. See the "The Post-Import Callback" section below.

Removal of `AcqType` and `FileType`

While CHIME makes heavy use of AcqType and FileType to manage our data, in alpenhorn their use was solely to determine which Info tables were available to perform import detection. With the removal of info classes, they no longer have a use in alpenhorn. I've moved them to alpenhorn-chime where they've been re-implemented (like the ArchiveInst table was).

Removal of these two tables also means the acq_types and file_types extensions are no longer needed, and they have been removed from extensions.py, as well as the register_type_extensions call that was being made in service.py.

The "import-detect" Extension

In place of all the above is a new extension type called "import-detect". Each "import-detect" extension returns (via register_extension) a single callable object, which is the "detect" function used during file import.

(This is exactly what alpenhorn-chime is: an alpenhorn import-detect extension.)

The detect function is passed a pathlib.Path pointing to the file to import and the UpdateableNode containing the candidate data file. The function must determine if the path points to a file that alpenhorn should import. It must return a 2-tuple:

The first element is either None to indicate detection failed (i.e. the path does not point to a data file needing import), or else it's the acquisition name, which must be a portion of the path passed in (with the file name becoming the remaining portion of the path).
The second element is an optional callback function which will be called after the file has been imported (i.e. after the ArchiveAcq, ArchiveFile, and ArchiveFileCopy records have been generated for the file). If no post-import callback is needed, this may be None. (In implementing this new system, I've found functools.partial to be a useful thing to return to alpenhorn as a callback, because it allows the detect function to pass data to the callback.) See the following section for more details.
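To make the 2-tuple contract concrete, here is a minimal sketch of a detect function. It is not taken from alpenhorn-chime: the ".dat" naming convention, the assumption that the path is relative to the node root, and the names `detect`, `record_import` and `imported` are all hypothetical and chosen only for illustration.

```python
import functools
import pathlib

# Stand-in for whatever bookkeeping a real extension would do (hypothetical).
imported = []


def record_import(copy, new_file, new_acq, node, *, suffix):
    """Hypothetical post-import callback; `suffix` is bound by detect()."""
    imported.append((suffix, new_file is not None, new_acq is not None))


def detect(path, node):
    """Return (acq_name, callback) for importable files, else (None, None).

    Assumption: `path` is relative to the node root and importable files
    look like "<acq_name>/<file_name>.dat".
    """
    path = pathlib.PurePath(path)
    if path.is_absolute() or path.suffix != ".dat" or len(path.parts) < 2:
        return None, None  # not a file this extension wants imported

    # The acquisition name is the leading portion of the path; the file
    # name is whatever remains.
    acq_name = path.parent.as_posix()

    # functools.partial lets detect() hand extra data to the callback.
    callback = functools.partial(record_import, suffix=path.suffix)
    return acq_name, callback
```

The callback is only invoked later, after the import has succeeded, so anything the detect step learned about the file has to be bound into it at this point.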

Multiple "import-detect" extensions may be loaded. In that case, the import code tries each in extension order until one of them reports a successful match.
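The "first successful match wins" behaviour can be sketched roughly as follows. This is an illustration of the idea, not the actual import code: `find_match` and the two toy detectors are hypothetical names, and the paths are assumed to be relative.

```python
import pathlib


def find_match(detectors, path, node):
    """Try each detect function in order; return the first successful match."""
    for detect in detectors:
        acq_name, callback = detect(path, node)
        if acq_name is not None:
            return acq_name, callback
    return None, None  # no extension recognised the file


# Two toy detectors, for illustration only.
def detect_raw(path, node):
    """Match "*.raw" files; the acquisition is the whole parent directory."""
    if path.suffix == ".raw" and len(path.parts) > 1:
        return path.parent.as_posix(), None
    return None, None


def detect_top(path, node):
    """Match any nested file; the acquisition is the top-level directory."""
    if len(path.parts) > 1:
        return path.parts[0], None
    return None, None
```

Because the loop stops at the first match, the order in which extensions are loaded matters whenever two of them could claim the same file: here `a/b/x.raw` gets acquisition `a/b` if `detect_raw` runs first, but `a` if `detect_top` does.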

alpenhorn will run without any "import-detect" extensions loaded, but will be unable to import files in that case. (Attempts to import files will result in an error message.)

The Post-Import Callback

When a callback is provided, alpenhorn will pass it the following parameters:

the ArchiveFileCopy of the newly imported file
if this import created it, the ArchiveFile for the file (or None if it already existed)
if this import created it, the ArchiveAcq for the file (or None if it already existed)
the UpdateableNode on which this import took place.

The ArchiveFile and ArchiveAcq of any imported file may be obtained via the ArchiveFileCopy; newly-created instances are passed to the callback so that it knows whether they are new. These two parameters could be replaced by booleans without loss of information, but I think it's more direct to do it like this.

The value returned from the callback is ignored.
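As a rough sketch of a callback that uses the "new or None" parameters, the following logs which records an import created. The names `post_import` and `describe_import` are hypothetical, and the attribute names `copy.file`, `file.acq` and `.name` are assumptions about the models, used here only with stand-in objects.

```python
import logging
from types import SimpleNamespace

log = logging.getLogger(__name__)


def describe_import(copy, new_file, new_acq):
    """Summarise which records an import created (illustrative helper)."""
    # Assumed attribute names: copy.file -> ArchiveFile, file.acq -> ArchiveAcq.
    file = copy.file
    acq = file.acq
    acq_state = "new" if new_acq is not None else "existing"
    file_state = "new" if new_file is not None else "existing"
    return f"{acq_state} acq {acq.name!r}, {file_state} file {file.name!r}"


def post_import(copy, new_file, new_acq, node):
    """Hypothetical post-import callback; its return value is ignored."""
    log.info("imported on %s: %s", node, describe_import(copy, new_file, new_acq))


# Stand-in objects in place of real database records.
acq = SimpleNamespace(name="2024/run1")
file = SimpleNamespace(name="data.dat", acq=acq)
copy = SimpleNamespace(file=file)

# The acquisition already existed, so only the file is passed as "new".
post_import(copy, file, None, "node1")
```

A real extension would typically use the same check to decide whether to add rows to its own tables for a newly created acquisition or file.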

The "pattern-importer" example extension

The regex/glob-based example extension formerly found in alpenhorn/generic.py (which was an example of an "acq_types"/"file_types" extension) has been moved to examples/pattern_importer.py and updated to be an "import-detect" extension.

It's not used yet, but this example extension will eventually also be used in the end-to-end test in tests/test_service.py.

Changes to `auto_import`

The changes here are somewhat performative: they are the code changes needed to use the new import-detect system, but the auto_import code doesn't really work yet within the new framework. A subsequent PR in this series will transition the import code to use the new task queue, and this code will get fixed as part of that. Even so, it's good to make this change here to show how the removal of info classes affects the calling code.
