cmu-delphi / epidatr

Delphi Epidata API R Client
https://cmu-delphi.github.io/epidatr/

Convert geo abbreviations to/from names to/from codes #143

Open dshemetov opened 1 year ago

dshemetov commented 1 year ago

One of the original PRD feature requests in #13 was to have a little bit of geomapping in this package. Should we? @lcbrooks @dsweber2

brookslogan commented 1 year ago

What all does geomapping include?

  • Included population data for msa, county, state
  • Convert geo abbreviations to/from names to/from codes

One/both of these two? Anything else?

Generally it feels like geomapping should be in its own package (delphi-utils port/wrapper, gtsibble, or something else), and incorporated here only when it's not going to pollute the namespace. However, there are a couple of things that seem like they belong here:

  1. Information on county FIPS definitions we use.
  2. Population data we use.

Things that don't belong in epidatr unless we can incorporate them into objects for the above:

  1. Conversions from and to Hub geo codes. (A frequent annoyance if forecasters use our package?)

The complication is that how we provide the FIPS definitions and population data (1. and 2. in the first list above) would change depending on the interface of the geomapping package we would work with. E.g., do we need to define a custom subclass? Conversions to and from some standard class? Something else? And a really nice, general geomapping package would be a ton of work; I'm not sure if one exists.


Musings on use cases.

A few situations to think about:

  1. I just want Delphi <-> Hub state & national code conversion functions.
    • Separate package would work, avoid polluting the epidatr namespace, and be useful on its own, but would be less discoverable by users.
  2. I want to know what geomapping Delphi used, in situations where there's not a stable standard.
    • County FIPS codes. I think our definitions didn't quite match JHU-CSSE's and we had to do some conversion, right? And the codes' meanings change over time as counties change; are we keeping in sync?
    • Population data. I don't know when we're using which population estimates. Providing these might belong in the API itself rather than in a client package.
    • This seems the hardest to house in another package. A package dedicated to providing geomappings used by stuff in the Delphi Epidata API... seems like it should be part of the API package. We could implement something more general allowing definitions of various possible geomappings, and have epidatr have an object/class specifying its geomapping. That might pare down the number of objects/functions we need to provide.
  3. Converting to and from lower-case state abbreviations (which nothing else uses?) is annoying. It'd be nice to have the client input and output whatever format I want automatically. (This may be impossible/inadvisable for fips <-> county names though.)
    • Automatic stuff, if possible, could be through another package which we import.
    • Again, a more general geomapping package might be useful here, and we could support its classes as input.
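
For situations 1 and 3, a minimal sketch of what such conversion functions could look like in R. It assumes Hub codes are two-digit state FIPS codes plus `US`, and Delphi codes are lower-case postal abbreviations plus `us`. The table is a deliberately tiny illustrative subset, and `geo_lookup`, `delphi_to_hub`, and `hub_to_delphi` are hypothetical names, not part of epidatr:

```r
# Illustrative lookup table only: the nation plus a handful of states.
# Names here are hypothetical and not part of the epidatr API.
geo_lookup <- data.frame(
  delphi = c("us", "al", "ak", "ca", "ny", "pa", "tx"),
  hub    = c("US", "01", "02", "06", "36", "42", "48"),
  stringsAsFactors = FALSE
)

# Convert Delphi-style codes to Hub-style codes.
# Input case is ignored; codes missing from the table become NA.
delphi_to_hub <- function(x) {
  geo_lookup$hub[match(tolower(x), geo_lookup$delphi)]
}

# Convert Hub-style codes back to Delphi-style codes.
hub_to_delphi <- function(x) {
  geo_lookup$delphi[match(toupper(x), geo_lookup$hub)]
}

delphi_to_hub(c("pa", "CA", "us", "zz"))  # returns c("42", "06", "US", NA)
hub_to_delphi(c("42", "06", "US"))        # returns c("pa", "ca", "us")
```

Returning `NA` for unknown codes keeps the functions vectorized and lets the caller decide whether a miss is an error; a real implementation would need the full table and a decision on territories.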

Musings on long-term solution: this feels like it will be facilitated through some custom classes representing geo vectors, which we would use in this package and in epiprocess.
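
One way the custom-class idea could be sketched is with a bare-bones S3 class that tags a character vector with the coding scheme it uses. Everything here (`new_geo_vec`, `convert_geo`, the `scheme` attribute, and the two-row table) is a hypothetical illustration, not an existing epidatr or epiprocess interface:

```r
# Hypothetical constructor: a character vector tagged with its coding scheme.
new_geo_vec <- function(x, scheme) {
  stopifnot(is.character(x), is.character(scheme), length(scheme) == 1L)
  structure(x, scheme = scheme, class = "geo_vec")
}

# Default subsetting drops the class and scheme, so re-attach them.
`[.geo_vec` <- function(x, i) {
  new_geo_vec(as.character(unclass(x))[i], attr(x, "scheme"))
}

# Convert between schemes using a lookup table with one column per scheme.
# Codes missing from the table become NA.
convert_geo <- function(x, to, table) {
  from <- attr(x, "scheme")
  stopifnot(from %in% names(table), to %in% names(table))
  idx <- match(as.character(unclass(x)), table[[from]])
  new_geo_vec(table[[to]][idx], to)
}

# Tiny illustrative table; a real one would cover every geo value.
tbl <- data.frame(
  delphi = c("pa", "ca"),
  hub    = c("42", "06"),
  stringsAsFactors = FALSE
)

g <- new_geo_vec(c("ca", "pa"), "delphi")
h <- convert_geo(g, "hub", tbl)
attr(h, "scheme")            # returns "hub"
as.character(unclass(h))     # returns c("06", "42")
```

Carrying the scheme with the data is what would let a client accept or return whichever format the user asks for; the lookup table itself is where the FIPS and population definitions would plug in.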