spacepy / dbprocessing

Automated processing controller for heliophysics data

Support inspector writing to file #61

Open jtniehof opened 3 years ago

jtniehof commented 3 years ago

It would be nice to have some support for an inspector updating the file, particularly when the file has been freshly created by dbprocessing and the inspector has access to the verbose provenance and other details for the file. Since the inspector is file-format-aware, and is basically the point of interface between dbp and the file format, it's a great way to get the dbp information into the file instead of just the database.

Proposed enhancement

Explicit support (and documentation) for the inspector placing dbp-related information into a file on inspection. We need to make sure that side effects are handled, e.g. the checksum implications of modifying the file.
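To illustrate the checksum concern, here is a minimal sketch, not dbprocessing code. It assumes the checksum is a hash of the file contents (SHA-1 is an arbitrary choice here), and the names `file_checksum`, `write_then_checksum`, and `append_provenance` are hypothetical. The point is only that any digest taken before the inspector writes to the file is stale afterwards, so it has to be computed (or recomputed) after the write.

```python
import hashlib
import os
import tempfile


def file_checksum(path, chunk_size=65536):
    """Return the SHA-1 hex digest of the file at ``path``.

    The hash algorithm is an assumption made for this sketch.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large data files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_then_checksum(path, write_metadata):
    """Let a hypothetical inspector hook modify the file, then re-checksum it.

    ``write_metadata`` is any callable that takes the file path and writes
    extra information into the file. Returns the (before, after) digests.
    """
    before = file_checksum(path)
    write_metadata(path)
    after = file_checksum(path)  # this is the value that should be recorded
    return before, after


def append_provenance(path):
    """Hypothetical stand-in for a format-aware metadata write."""
    with open(path, "ab") as handle:
        handle.write(b"\n# inputs: example_input_file\n")
```

For example, calling `write_then_checksum(path, append_provenance)` on a temporary file returns two different digests, and only the second one matches the file on disk.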

Alternatives

What we're doing right now is having the actual processing codes populate metadata with the names of all the input files, so this is duplicate code that has to be in every processing code.

OS, Python version, and dependency version information:

Linux-4.4.0-98-generic-x86_64-with-Ubuntu-16.04-xenial
sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0)
sqlalchemy=1.0.11

Version of dbprocessing

Current master from github (734f37b1bfb3540f5682edd6dbb2e590eb51a3ff)

Closure condition

This issue should be closed when an appropriate design is chosen, implemented, and merged.

jtniehof commented 3 years ago

When the inspector is called after a file is made by dbp (#12 relates), the inspector gets a Diskfile object which includes basically everything in the file record. As long as this is populated before the inspector call, there's quite a bit in there.

balarsen commented 3 years ago

This would be a good addition, but I'm not sure how to handle it, as the inspector doesn't get info from the chain, just the file. We could make an inspector temp file for each file created that could be looked at... The above comment on Diskfile is likely true if @jtniehof says it; I thought it was decoupled.

jtniehof commented 3 years ago

The inspector "reports back" by populating stuff in the Diskfile object, so we just need to check what order things are done in. It might be worth making sure the inspector can tell whether it's looking at a newly created file or ingesting one from scratch (verbose provenance might be enough information).
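A rough sketch of that idea, under stated assumptions: the file record is modelled as a plain dict standing in for the Diskfile's contents, the key name `verbose_provenance` and the function names below are hypothetical, and the rule "provenance present means dbp created the file" is only the heuristic suggested above, not confirmed behaviour.

```python
def is_newly_created(params):
    """Guess whether dbp just created this file.

    ``params`` is a plain dict standing in for the file record; the
    ``verbose_provenance`` key is an assumption made for this sketch.
    """
    return bool(params.get("verbose_provenance"))


def inspect(params, write_provenance):
    """Stand-in for an inspector step that optionally writes into the file.

    ``write_provenance`` is a hypothetical, format-aware callable that puts
    the provenance string into the file. Returns True if the file was
    written to, so the caller knows the checksum needs recalculating.
    """
    if is_newly_created(params):
        write_provenance(params["verbose_provenance"])
        return True
    # Ingested from scratch: nothing known about how the file was made.
    return False
```

Returning a flag from the inspection step is one possible way to signal that the file changed; whether that fits the existing call order is exactly what would need checking.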