CoronaNetDataScience / corona_tscs

This is the raw data repository (policy record format) of the CoronaNet project on government responses to the COVID-19 pandemic.

Suspicious frequency distribution of the init_country_level variable #17

Closed ingonader closed 4 years ago

ingonader commented 4 years ago

The init_country_level variable seems to have five levels:

init_country_level | n
Municipal | 744
National | 8347
No, it is at the national level | 118
Yes, it is at another governmental level (e.g. county) | 2
Yes, it is at the province/state level | 4680

To me, these seem to stem from two different sets of levels: "Municipal" vs. "National" (mun/nat), and a distinct second set where the related question in the RA questionnaire was a yes/no question. If so, the distribution looks odd. In the mun/nat set, the vast majority of policies are at the national level (which makes sense: in many countries, local outbreaks with a few early policies were followed by a large number of national policies). In the yes/no set, however, the majority of policies are at the province/state level rather than the national level. So, depending on the answer format, the pattern is reversed.
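The reversal can be checked directly from the counts in the table above. The following is a minimal sketch, not the project's own code: it assumes that labels starting with "Yes," or "No," come from the yes/no question format, and that "National" and "No, it is at the national level" both mean a national-level policy. The function names are made up for illustration.

```python
# Counts copied from the frequency table in this issue.
counts = {
    "Municipal": 744,
    "National": 8347,
    "No, it is at the national level": 118,
    "Yes, it is at another governmental level (e.g. county)": 2,
    "Yes, it is at the province/state level": 4680,
}

# Assumption: both of these labels denote a national-level policy.
NATIONAL_LABELS = {"National", "No, it is at the national level"}


def label_set(label):
    """Assign a label to the assumed answer format it came from."""
    return "yes/no" if label.startswith(("Yes,", "No,")) else "mun/nat"


def national_share_by_set(level_counts):
    """Percentage of national-level policies within each answer format."""
    totals, national = {}, {}
    for label, n in level_counts.items():
        fmt = label_set(label)
        totals[fmt] = totals.get(fmt, 0) + n
        if label in NATIONAL_LABELS:
            national[fmt] = national.get(fmt, 0) + n
    return {fmt: round(100 * national.get(fmt, 0) / totals[fmt], 1) for fmt in totals}


shares = national_share_by_set(counts)
print(shares)
```

Under these assumptions, the national share is about 91.8% in the mun/nat set but only about 2.5% in the yes/no set.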

This pattern can also be found within some type categories of policies, here are some examples:

type | init_country_level | n | perc
Closure and Regulation of Schools | Municipal | 97 | 6.7
Closure and Regulation of Schools | National | 722 | 49.7
Closure and Regulation of Schools | No, it is at the national level | 18 | 1.2
Closure and Regulation of Schools | Yes, it is at the province/state level | 617 | 42.4
Health Monitoring | Municipal | 19 | 5.4
Health Monitoring | National | 232 | 66.5
Health Monitoring | No, it is at the national level | 10 | 2.9
Health Monitoring | Yes, it is at the province/state level | 88 | 25.2
Health Resources | Municipal | 82 | 3.1
Health Resources | National | 1655 | 61.7
Health Resources | No, it is at the national level | 6 | 0.2
Health Resources | Yes, it is at the province/state level | 940 | 35
Health Testing | Municipal | 14 | 4.1
Health Testing | National | 205 | 60.5
Health Testing | No, it is at the national level | 2 | 0.6
Health Testing | Yes, it is at the province/state level | 118 | 34.8
Other Policy Not Listed Above | Municipal | 41 | 4.3
Other Policy Not Listed Above | National | 678 | 71.9
Other Policy Not Listed Above | No, it is at the national level | 3 | 0.3
Other Policy Not Listed Above | Yes, it is at the province/state level | 221 | 23.4
Public Awareness Measures | Municipal | 27 | 4.1
Public Awareness Measures | National | 418 | 63.1
Public Awareness Measures | No, it is at the national level | 3 | 0.5
Public Awareness Measures | Yes, it is at the province/state level | 214 | 32.3
Quarantine | Municipal | 73 | 6.3
Quarantine | National | 746 | 64.4
Quarantine | No, it is at the national level | 22 | 1.9
Quarantine | Yes, it is at the province/state level | 318 | 27.4
Restriction and Regulation of Businesses | Municipal | 106 | 5.8
Restriction and Regulation of Businesses | National | 853 | 46.3
Restriction and Regulation of Businesses | No, it is at the national level | 18 | 1
Restriction and Regulation of Businesses | Yes, it is at the province/state level | 866 | 47
Restriction and Regulation of Government Services | Municipal | 11 | 2.5
Restriction and Regulation of Government Services | National | 201 | 45.5
Restriction and Regulation of Government Services | No, it is at the national level | 7 | 1.6
Restriction and Regulation of Government Services | Yes, it is at the province/state level | 223 | 50.5
Restrictions of Mass Gatherings | Municipal | 44 | 6.5
Restrictions of Mass Gatherings | National | 380 | 56.4
Restrictions of Mass Gatherings | No, it is at the national level | 2 | 0.3
Restrictions of Mass Gatherings | Yes, it is at the province/state level | 248 | 36.8
Social Distancing | Municipal | 48 | 8.2
Social Distancing | National | 312 | 53.6
Social Distancing | No, it is at the national level | 3 | 0.5
Social Distancing | Yes, it is at the province/state level | 219 | 37.6
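For reference, the perc column in the table above appears to be each row's share of the policies within its type. Here is a small sketch that recomputes it for the school-closure rows, assuming perc is n divided by the per-type total and rounded to one decimal place. The helper name is hypothetical.

```python
# Counts for "Closure and Regulation of Schools", copied from the table above.
school_counts = {
    "Municipal": 97,
    "National": 722,
    "No, it is at the national level": 18,
    "Yes, it is at the province/state level": 617,
}


def within_type_percentages(level_counts):
    """Share of each level within one policy type, in percent (one decimal)."""
    total = sum(level_counts.values())
    return {level: round(100 * n / total, 1) for level, n in level_counts.items()}


school_perc = within_type_percentages(school_counts)
for level, perc in school_perc.items():
    print(f"{level}: {perc}")
```

The recomputed values match the reported ones for this type (6.7, 49.7, 1.2, 42.4), which supports that reading of the column.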

To me, this seems strange. Please decide if this is worth investigating.

ingonader commented 4 years ago

Another indication that something is amiss comes from the target_region, target_province, and target_city variables. The table below shows the percentage of non-missing data for each of these variables across the whole dataset, by init_country_level. The "National" level has a very low percentage of non-missing data in target_province (which I think is expected), but the "No, it is at the national level" level has a relatively high percentage of non-missing data in that variable:

init_country_level | target_region | target_province | target_city
Municipal | 0.1% | 1.9% | 21.4%
National | 1.6% | 2.9% | 1.9%
No, it is at the national level | 1.7% | 14.4% | 2.5%
Yes, it is at another governmental level (e.g. county) | 0% | 0% | 0%
Yes, it is at the province/state level | 0.5% | 21.9% | 1.6%
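A table like the one above can be produced by counting, per init_country_level, how many rows have a value in each target_* column. The sketch below shows one way to do that. It is an illustration only: the rows are made-up toy records, not CoronaNet data, and it assumes missing values appear as None or an empty string.

```python
def non_missing_share(rows, group_key, columns):
    """Percentage of rows per group in which each column is non-missing."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row)
    result = {}
    for level, members in groups.items():
        result[level] = {}
        for col in columns:
            # Assumption: None and "" both count as missing.
            present = sum(1 for r in members if r.get(col) not in (None, ""))
            result[level][col] = round(100 * present / len(members), 1)
    return result


# Toy records for illustration only (hypothetical values).
toy_rows = [
    {"init_country_level": "National", "target_province": None, "target_city": None},
    {"init_country_level": "National", "target_province": "Example Province", "target_city": None},
    {"init_country_level": "Municipal", "target_province": None, "target_city": "Example City"},
    {"init_country_level": "Municipal", "target_province": "", "target_city": None},
]

toy_shares = non_missing_share(
    toy_rows, "init_country_level", ["target_province", "target_city"]
)
print(toy_shares)
```

Run on the real records, a high target_province share for a level labelled as national would flag the same inconsistency described here.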

(The municipal vs. province/state levels make a whole lot more sense to me now, looking at this table).

timothymodel commented 4 years ago

Should be corrected in the most recent data releases