cmu-delphi / delphi-epidata

An open API for epidemiological data.
https://cmu-delphi.github.io/delphi-epidata/
MIT License

Find all signals for a location #1471

Open melange396 opened 3 months ago

melange396 commented 3 months ago

We may want to be able to answer the question: "What [other] signals are available for this geographic location?"

That is, when someone is looking at data for a particular location, they may be interested in finding other signals available for the same place, for example to compare and contrast them. Another use case is a public health official, responsible for some region, who wants to see all the data available and relevant to them.

The answer to the question should not just be "other signals that also use the same geographic type/level/resolution (like 'county')", but more specifically, "other signals that have actual data points for the exact same geographic entity (like 'Allegheny County')". Many signals have coverage across most or all locations of their respective geo_types, but some signals are specific to certain regional subsets, and some signals do not have data for some geo_values due to thresholding cutoffs or reporting practices.

The opposite question ("What [other] locations are available for this signal?") should already be easy[-ish] to answer by querying the database in a way that uses the (signal, geo, ...) prefix of an existing index: https://github.com/cmu-delphi/delphi-epidata/blob/c1551afdbfaf105c760edbffe3c7c767f6562016/src/ddl/v4_schema.sql#L57

Answering the question at hand may be as simple as adding a new (geo, signal) index, or we might consider maintaining another (new) table that stores these relations and which is kept up to date during acquisition. In either case, the information should be made available through a new API endpoint, or possibly by adding a special argument to an existing endpoint.
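As a rough illustration of the index-based option, here is a minimal sketch using an in-memory SQLite database. The table `demo_metric`, its columns, the index name, and the source/signal names are all hypothetical stand-ins and do not reflect the actual v4 schema; the point is only that an index leading with the geo columns lets a "signals for this location" lookup filter on the index prefix.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a toy table plus a geo-first index (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE demo_metric (
            source     TEXT NOT NULL,
            signal     TEXT NOT NULL,
            geo_type   TEXT NOT NULL,
            geo_value  TEXT NOT NULL,
            time_value INTEGER NOT NULL,
            value      REAL
        )
        """
    )
    # The proposed (geo, signal) ordering: geo columns first, then signal columns.
    conn.execute(
        "CREATE INDEX demo_geo_signal_idx "
        "ON demo_metric (geo_type, geo_value, source, signal)"
    )
    rows = [
        ("src_a", "sig_1", "county", "42003", 20230101, 1.0),
        ("src_a", "sig_2", "county", "42003", 20230101, 2.0),
        ("src_a", "sig_1", "county", "42007", 20230101, 3.0),
        ("src_b", "sig_3", "county", "42007", 20230101, 4.0),
    ]
    conn.executemany("INSERT INTO demo_metric VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def signals_for_location(conn: sqlite3.Connection, geo_type: str, geo_value: str):
    """Return distinct (source, signal) pairs that have at least one row for the location."""
    cur = conn.execute(
        "SELECT DISTINCT source, signal FROM demo_metric "
        "WHERE geo_type = ? AND geo_value = ? "
        "ORDER BY source, signal",
        (geo_type, geo_value),
    )
    return cur.fetchall()


conn = build_demo_db()
allegheny_signals = signals_for_location(conn, "county", "42003")
```

The alternative mentioned above, a separate relation table maintained during acquisition, would hold the same distinct (geo, signal) pairs precomputed, trading extra write-time bookkeeping for cheaper reads.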

I believe this can be done across "all time" fairly easily, but another layer of usefulness (and complexity) would be answering the question for a time range, or perhaps just for a particular point in time.
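The time-restricted variant could look something like the sketch below, again against a hypothetical in-memory SQLite table rather than the real schema. It assumes `time_value` is stored as a `YYYYMMDD` integer; a point-in-time query is just a range whose start and end are equal.

```python
import sqlite3


def make_time_demo_db() -> sqlite3.Connection:
    """Toy table with a few dated rows for one location (names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_metric_t ("
        "source TEXT, signal TEXT, geo_type TEXT, geo_value TEXT, time_value INTEGER)"
    )
    rows = [
        ("src_a", "sig_1", "county", "42003", 20230101),
        ("src_a", "sig_1", "county", "42003", 20230201),
        ("src_a", "sig_2", "county", "42003", 20230301),
    ]
    conn.executemany("INSERT INTO demo_metric_t VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def signals_in_range(conn, geo_type, geo_value, start, end):
    """Return (source, signal, first_time, last_time) for rows within [start, end]."""
    cur = conn.execute(
        "SELECT source, signal, MIN(time_value), MAX(time_value) "
        "FROM demo_metric_t "
        "WHERE geo_type = ? AND geo_value = ? AND time_value BETWEEN ? AND ? "
        "GROUP BY source, signal "
        "ORDER BY source, signal",
        (geo_type, geo_value, start, end),
    )
    return cur.fetchall()


tconn = make_time_demo_db()
early_2023 = signals_in_range(tconn, "county", "42003", 20230101, 20230215)
on_march_1 = signals_in_range(tconn, "county", "42003", 20230301, 20230301)
```

Returning the first and last matching dates alongside each signal is one possible design choice; it would let a caller see coverage within the requested window without a second query.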