At the moment we are gathering important information about inference/score runs based on specific locations in a directory structure, and parsing the dir name,
This will work well for us at the moment, but I think its inflexible in the medium term because it hard encodes the expected directory structure.
I'm personally against encoding information in dirnames, I'd prefer an autogenerated set of metadata for runs either as a .toml file that gets generated as the inference is done or handled by a work scheduling tool e.g. something like mlflow. I might well be missing something though.
At the moment we are gathering important information about inference/score runs based on specific locations in a directory structure, and parsing the dir name,
e.g.
https://github.com/CDCgov/pyrenew-hew/blob/b90dd9f82e323ecaf365753c0637a602f7ee7d50/hewr/R/directory_utils.R#L81-L93
This will work well for us at the moment, but I think its inflexible in the medium term because it hard encodes the expected directory structure.
I'm personally against encoding information in dirnames, I'd prefer an autogenerated set of metadata for runs either as a
.toml
file that gets generated as the inference is done or handled by a work scheduling tool e.g. something likemlflow
. I might well be missing something though.