hubverse-org / hubUtils

Utility functions for Infectious Disease Modeling Hubs
https://hubverse-org.github.io/hubUtils/

function to load model metadata #111

Closed elray1 closed 1 year ago

elray1 commented 1 year ago

It would be nice to have a function to load model metadata. There is some code here that could be borrowed or adapted for this.

elray1 commented 1 year ago

Some suggestions:

Inputs:

Returns:

Logic:

elray1 commented 1 year ago

to test, add some example model metadata files to one of the test hubs in inst/testhubs. Would be good to get some complicated examples:

May be able to pull some from here

annakrystalli commented 1 year ago

A good function to base this on would be hubUtils::read_config(), adapted to read YAML: https://github.com/Infectious-Disease-Modeling-Hubs/hubUtils/blob/main/R/read_config.R

It consists of two methods, one default and one that works with cloud file systems like S3 buckets.
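As a rough illustration of that default/cloud split, here is a minimal sketch of an S3 generic with only the default (local path) method filled in. The function name `read_model_metadata()` and the `model-metadata` folder name are assumptions made for this example, not the actual hubUtils API, and the cloud method is left out entirely. It relies on the `yaml` package being installed.

```r
# Hypothetical generic; the name is illustrative, not part of hubUtils.
read_model_metadata <- function(hub_path, ...) {
  UseMethod("read_model_metadata")
}

# Default method: hub_path is a path to a local hub directory.
read_model_metadata.default <- function(hub_path, ...) {
  files <- list.files(
    file.path(hub_path, "model-metadata"), # assumed folder name
    pattern = "\\.ya?ml$",
    full.names = TRUE
  )
  # Parse each YAML file into a list, named by file name without extension
  metadata <- lapply(files, yaml::read_yaml)
  names(metadata) <- tools::file_path_sans_ext(basename(files))
  metadata
}

# Usage: build a throwaway hub with one made-up metadata file and read it back
hub <- file.path(tempdir(), "example-hub")
dir.create(file.path(hub, "model-metadata"), recursive = TRUE, showWarnings = FALSE)
writeLines(
  c("team_abbr: teamA", "model_abbr: baseline"),
  file.path(hub, "model-metadata", "teamA-baseline.yml")
)
meta <- read_model_metadata(hub)
```

A second method for cloud file systems would dispatch on the class of the file system object passed as `hub_path`; that part would need to follow whatever read_config() does.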

annakrystalli commented 1 year ago

Quick note on part of the suggested code too: https://github.com/reichlab/covidHubUtils/blob/7258bc1b146906b31e9d31d19fd13cf73259b5a0/R/get_model_metadata.R#L56-L65

purrr::map_dfr() is now superseded in favour of purrr::map() %>% purrr::list_rbind()
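For reference, a minimal sketch of the replacement pattern, assuming purrr >= 1.0.0 and R >= 4.1 (for the native pipe and `\(x)` lambda). The `metadata` list is made-up stand-in data, not real hub metadata.

```r
library(purrr)

# Made-up stand-in for parsed metadata files: one named list per model
metadata <- list(
  list(team_abbr = "teamA", model_abbr = "baseline"),
  list(team_abbr = "teamB", model_abbr = "ensemble")
)

# Older pattern: map_dfr(metadata, as.data.frame)
# Newer pattern: map to a list of one-row data frames, then row-bind them
metadata_df <- metadata |>
  map(\(x) as.data.frame(x)) |>
  list_rbind()
```

The result should be a two-row data frame with `team_abbr` and `model_abbr` columns. If files have different fields, list_rbind() fills the missing columns with NA.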

lshandross commented 1 year ago

One thing I was thinking of including was automatically merging the team_abbr and model_abbr fields, or splitting the model_id field, using the functions from hubUtils, but I wanted both of your input on some of the specifics. I could see this functionality being implemented in one of three ways:

  1. Merging the team_abbr and model_abbr fields when applicable and only keeping the single model_id field in the resulting table of metadata.
  2. Splitting the single model_id field when applicable and only keeping the team_abbr and model_abbr fields
  3. Keeping all three fields but filling in any null values by either merging or splitting the appropriate field(s)

The third option might be a little redundant but there is an argument for it in order to preserve all the fields in the original metadata files. Or we could not include this functionality. What are each of your thoughts?
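To make the third option concrete, here is a minimal base R sketch. `fill_model_ids()` is a hypothetical helper written for this example rather than one of the hubUtils functions mentioned above, and it assumes model_id has the form `<team_abbr>-<model_abbr>` with a single `-` separator.

```r
# Hypothetical helper: keep all three fields and fill in whichever are missing.
# Assumes model_id is "<team_abbr><sep><model_abbr>" with one separator.
fill_model_ids <- function(meta, sep = "-") {
  # Add any of the three columns the hub did not specify
  for (col in c("model_id", "team_abbr", "model_abbr")) {
    if (!col %in% names(meta)) meta[[col]] <- rep(NA_character_, nrow(meta))
  }

  # Merge: build model_id where it is missing but both parts are present
  need_id <- is.na(meta$model_id) & !is.na(meta$team_abbr) & !is.na(meta$model_abbr)
  meta$model_id[need_id] <- paste(
    meta$team_abbr[need_id], meta$model_abbr[need_id], sep = sep
  )

  # Split: derive the parts where one is missing but model_id is present
  need_parts <- !is.na(meta$model_id) &
    (is.na(meta$team_abbr) | is.na(meta$model_abbr))
  parts <- strsplit(meta$model_id[need_parts], sep, fixed = TRUE)
  meta$team_abbr[need_parts] <- vapply(parts, `[`, character(1), 1)
  meta$model_abbr[need_parts] <- vapply(parts, `[`, character(1), 2)

  meta
}

# Usage: one row has only model_id, the other only the two abbreviations
meta <- data.frame(
  model_id = c("teamA-baseline", NA),
  team_abbr = c(NA, "teamB"),
  model_abbr = c(NA, "ensemble")
)
filled <- fill_model_ids(meta)
```

After the call, both rows should have all three fields populated. Options 1 and 2 would amount to the same merge or split step followed by dropping the redundant column(s).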

elray1 commented 1 year ago

I also see an option 4:

  4. Keep whatever fields were specified by the hub.

I vote for either option 3 or option 4. In favor of option 3, there's something to be said for just standardizing outputs across hubs, and there are situations where it is more helpful to be able to grab the model_id field and situations where it is more helpful to be able to grab the team_abbr field.

If we went with option 4, any function that needed access to one of these fields could call whatever function we have to standardize outputs before trying to access it, but that seems like making extra work for ourselves down the line. So in the end, I think I vote for option 3.