CDCgov / pyrenew-hew

Models and infrastructure for forecasting COVID-19 and Flu hospitalizations using wastewater data with PyRenew

A metadata approach to fitting directories? #156

Open SamuelBrand1 opened 2 days ago

SamuelBrand1 commented 2 days ago

At the moment we gather important information about inference/score runs from their location in the directory structure and by parsing the directory name,

e.g.

https://github.com/CDCgov/pyrenew-hew/blob/b90dd9f82e323ecaf365753c0637a602f7ee7d50/hewr/R/directory_utils.R#L81-L93

This will work well for us at the moment, but I think it's inflexible in the medium term because it hard-codes the expected directory structure.

I'm personally against encoding information in directory names. I'd prefer an autogenerated set of metadata for each run, either as a .toml file generated when the inference is done or handled by a work-scheduling tool, e.g. something like MLflow. I might well be missing something, though.

damonbayer commented 17 hours ago

This could help with some of the naming collisions I mentioned in https://github.com/CDCgov/pyrenew-hew/issues/126

damonbayer commented 17 hours ago

Related to https://github.com/CDCgov/pyrenew-hew/issues/62

dylanhmorris commented 16 hours ago

I agree and favor a .toml file