Visualizing the spread and relative risk of Covid19 at the local level. Almost all the Covid19 maps I've seen show just the absolute number of cases in a location, but that's not the most important metric. 50 cases in Kentucky isn't the same as 50 cases in NYC.
We pull in data on population/age/sex, # of hospital/ICU beds, comorbidity prevalence, and neighboring counties/states to show the relative risk level so that people can better understand what their community is up against and plan accordingly.
let API_KEY_MAPBOX = '<yourMapboxApiKey>'
All the code for pulling the latest Covid19 data is stored in the build folder. You can update the data by manually running the Jupyter Notebook 00_add_dynamic_data.ipynb, or by converting it to a Python script with jupyter nbconvert --to script 00_add_dynamic_data.ipynb, which creates a script that can be run by calling python 00_add_dynamic_data.py (Python 3 only). This grabs the latest data and writes it to data/states.json and data/counties.json, where it can be read and used.
Running the script will both display information about the data being pulled/processed, and store it to build/logs. The output messages are sent to build/logs/message_logs to be used for debugging later. A copy of the geoJSON/covid data, with timestamp, is output to build/logs/data_logs.
counties.json and states.json are geoJSON files that include both the geometries of state/county boundaries, and statistics such as covid19 cases, population, and time_series data. The format can be a bit confusing and we will probably separate them out in a future version. For now, assuming you read counties.json into a variable countyData and read states.json into statesData, it will be structured as follows.
countyData["features"] is an array of all "features", in this case counties. Each feature has an attribute properties, where the data is contained, and an attribute geometry, where the boundaries are contained. countyData["features"][0]["properties"] will be the properties of the 1st county and include keys like cases, deaths, population, risk_local, risk_total, etc. Each county also has a geo_id such as 0500000US36081, but the relevant portion is the last 5 digits (commonly called the FIPS code), in this example 36081: 36 represents the state "New York" and 081 the county "Queens". Since counties in different states can have the same name, it is important to use geo_id or fips. Here is a list of FIPS codes for the US, although you shouldn't need to use it as county and state names are included in the properties.
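As a rough illustration of the layout described above, here is a minimal Python sketch that loads one of the geoJSON files and indexes counties by FIPS code. The helper names (load_geojson, fips_from_geo_id, split_fips, index_by_fips) are hypothetical, and it assumes geo_id is stored inside each feature's properties, which you should check against the actual files. The sample feature uses placeholder values only to mirror the structure.

```python
import json


def load_geojson(path):
    """Read a geoJSON file such as data/counties.json into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def fips_from_geo_id(geo_id):
    """Return the 5-digit county FIPS, i.e. the last five characters of a geo_id."""
    return geo_id[-5:]


def split_fips(fips):
    """Split a 5-digit county FIPS into (state_code, county_code)."""
    return fips[:2], fips[2:]


def index_by_fips(county_data):
    """Map FIPS -> properties for every feature.

    Assumption: each feature's properties dict carries a "geo_id" key.
    """
    return {
        fips_from_geo_id(feature["properties"]["geo_id"]): feature["properties"]
        for feature in county_data["features"]
    }


# Placeholder feature that only mimics the structure; the numbers are not real data.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"geo_id": "0500000US36081", "cases": 0, "population": 1},
            "geometry": None,
        }
    ],
}

by_fips = index_by_fips(sample)
state_code, county_code = split_fips(fips_from_geo_id("0500000US36081"))
print(state_code, county_code)  # 36 081
```

With the real files, the same index could be built with index_by_fips(load_geojson("data/counties.json")), provided the geo_id assumption holds.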
The risk calculation algorithm is currently quite simple. We plan to add to it while still keeping it fully explainable. Currently:
- Local Risk is just cases per capita for the given region.
- Nearby Risk is a function of the number of cases and population of nearby counties, exponentially decayed with distance. For every 50km farther away a county is, its risk is halved, up to a max of 100km.
We would like to expand this to use transportation flows instead of distance, but as they are changing rapidly due to lockdowns and social distancing, we have not yet found a way to do this, so we use distance as a simple proxy.
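The two risk measures can be sketched in Python as follows. This is not the project's actual implementation: the function names are hypothetical, "up to a max of 100km" is read here as ignoring counties farther than 100km away, and combining neighbors as distance-weighted cases divided by distance-weighted population is an assumption, since the exact combination is not spelled out above.

```python
HALVING_DISTANCE_KM = 50.0   # risk is halved for every 50km of distance
MAX_DISTANCE_KM = 100.0      # assumption: counties beyond this are ignored


def local_risk(cases, population):
    """Cases per capita for a region (0.0 if the population is unknown or zero)."""
    return cases / population if population > 0 else 0.0


def distance_weight(distance_km):
    """Exponential decay: 1.0 at 0km, 0.5 at 50km, 0.25 at 100km, 0.0 beyond."""
    if distance_km > MAX_DISTANCE_KM:
        return 0.0
    return 0.5 ** (distance_km / HALVING_DISTANCE_KM)


def nearby_risk(neighbors):
    """Distance-weighted cases per capita over nearby counties.

    neighbors: iterable of (cases, population, distance_km) tuples.
    Assumption: cases and population are both weighted by the same decay factor.
    """
    weighted_cases = 0.0
    weighted_population = 0.0
    for cases, population, distance_km in neighbors:
        weight = distance_weight(distance_km)
        weighted_cases += weight * cases
        weighted_population += weight * population
    return weighted_cases / weighted_population if weighted_population > 0 else 0.0
```

For example, distance_weight(50) gives 0.5, so a neighbor 50km away counts half as much as a county at distance zero.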
We are looking for contributors of all skill levels. CovidCompare is built with Leaflet and vanilla JavaScript, with a bit of data massaging using Python.
We are in need of
The best way to get started is to open an issue about something you would like to work on or to comment on an existing issue. If you want to help but aren't sure exactly where to start, email me using the link in the bottom-left corner of the CovidCompare website.
All of our data is, and will remain, open-source and free to use. Here are the sources we use: