GeoDaCenter / opioid-policy-scan

The Opioid Environment Policy Scan provides access to data at multiple spatial scales to help characterize the multi-dimensional risk environment impacting opioid use in justice populations across the United States.

Fix variable name disagreements #69

Closed mradamcox closed 11 months ago

mradamcox commented 12 months ago

There are a few disagreements between the variable names in the data dictionaries and those in the CSV files themselves. Creating this ticket to track their resolution ahead of the v2 release.

Ran a validation script that compares the table definitions (which are generated directly from the data dictionaries) against the CSVs themselves. Here is the output, i.e. the variable names that this ticket should address. Note that some of these mismatches involve geometry fields and should be fixed via #68.

VALIDATE INPUT SOURCE: csv/C_1980.csv
WARNINGS ENCOUNTERED: 0

VALIDATE INPUT SOURCE: csv/C_1990.csv
WARNINGS ENCOUNTERED: 2
  1 source columns missing from schema: Age15_24P
  1 schema fields missing from source: A15_24P

VALIDATE INPUT SOURCE: csv/C_2000.csv
WARNINGS ENCOUNTERED: 2
  1 source columns missing from schema: Age15_24P
  1 schema fields missing from source: A15_24P

VALIDATE INPUT SOURCE: csv/C_2010.csv
WARNINGS ENCOUNTERED: 2
  1 source columns missing from schema: VacP
  1 schema fields missing from source: VacantP

VALIDATE INPUT SOURCE: csv/C_Latest.csv
WARNINGS ENCOUNTERED: 1
  1 source columns missing from schema: Unnamed: 0

VALIDATE INPUT SOURCE: csv/S_1980.csv
WARNINGS ENCOUNTERED: 2
  2 source columns missing from schema: STATEFP, Age15_24P
  4 schema fields missing from source: G_STATEFP, GEOID, STUSPS, A15_24P

VALIDATE INPUT SOURCE: csv/S_1990.csv
WARNINGS ENCOUNTERED: 2
  2 source columns missing from schema: STATEFP, Age15_24P
  4 schema fields missing from source: G_STATEFP, GEOID, STUSPS, A15_24P

VALIDATE INPUT SOURCE: csv/S_2000.csv
WARNINGS ENCOUNTERED: 2
  3 source columns missing from schema: STATEFP, Age15_24P, OccP
  4 schema fields missing from source: G_STATEFP, GEOID, STUSPS, A15_24P

VALIDATE INPUT SOURCE: csv/S_2010.csv
WARNINGS ENCOUNTERED: 1
  4 schema fields missing from source: G_STATEFP, STUSPS, ChildrenP, Age18_64

VALIDATE INPUT SOURCE: csv/S_Latest.csv
WARNINGS ENCOUNTERED: 2
  3 source columns missing from schema: TotPopE, NoHSP, PrMisuse20
  3 schema fields missing from source: TotPop, NoHsP, PrMsuse20P

VALIDATE INPUT SOURCE: csv/T_1980.csv
WARNINGS ENCOUNTERED: 2
  3 source columns missing from schema: NoHSP, ChildrenP, OccP
  4 schema fields missing from source: TRACTCE, COUNTYFP, STATEFP, NoHsP

VALIDATE INPUT SOURCE: csv/T_1990.csv
WARNINGS ENCOUNTERED: 2
  4 source columns missing from schema: Age15_24P, NoHsp, ChildrenP, OccP
  5 schema fields missing from source: TRACTCE, COUNTYFP, STATEFP, A15_24P, NoHsP

VALIDATE INPUT SOURCE: csv/T_2000.csv
WARNINGS ENCOUNTERED: 2
  4 source columns missing from schema: Age15_24P, NoHsp, ChildrenP, OccP
  6 schema fields missing from source: TRACTCE, COUNTYFP, STATEFP, A15_24P, NoHsP, PciE

VALIDATE INPUT SOURCE: csv/T_2010.csv
WARNINGS ENCOUNTERED: 2
  2 source columns missing from schema: GiniCoeff, VacP
  7 schema fields missing from source: TRACTCE, COUNTYFP, STATEFP, AgeOv18, NonRelFhhP, NonRelNfhhP, VacantP

VALIDATE INPUT SOURCE: csv/T_Latest.csv
WARNINGS ENCOUNTERED: 2
  3 source columns missing from schema: TotPopE, NoHSP, FqhcMinDis
  3 schema fields missing from source: TotPop, NoHsP, MinDisFqhc

VALIDATE INPUT SOURCE: csv/Z_1980.csv
WARNINGS ENCOUNTERED: 2
  4 source columns missing from schema: ZCTA, Age55_59, Ov65P, PacIsP
  4 schema fields missing from source: GEOID, PacISP, HispP, Ovr65P

VALIDATE INPUT SOURCE: csv/Z_1990.csv
WARNINGS ENCOUNTERED: 2
  3 source columns missing from schema: ZCTA, Ov65P, PacIsP
  4 schema fields missing from source: GEOID, PacISP, HispP, Ovr65P

VALIDATE INPUT SOURCE: csv/Z_2000.csv
WARNINGS ENCOUNTERED: 2
  3 source columns missing from schema: ZCTA, Ov65P, PacIsP
  4 schema fields missing from source: GEOID, PacISP, HispP, Ovr65P

VALIDATE INPUT SOURCE: csv/Z_2010.csv
WARNINGS ENCOUNTERED: 2
  3 source columns missing from schema: PacIsP, MedInc, VacP
  2 schema fields missing from source: PacISP, VacantP

VALIDATE INPUT SOURCE: csv/Z_Latest.csv
WARNINGS ENCOUNTERED: 0
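
For anyone who wants to reproduce this kind of check locally, here is a minimal sketch in Python. It is not the actual validation script (which isn't included in this ticket); the function names `compare_columns`, `format_report`, and `read_header` are hypothetical, and the sample CSV content is a placeholder. Only the `Age15_24P` / `A15_24P` pair comes from the output above.

```python
import csv
import io


def read_header(csv_text):
    """Return the header row (list of column names) from CSV text."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def compare_columns(source_columns, schema_fields):
    """Return (extra, missing): columns in the CSV but not in the schema,
    and schema fields not present in the CSV. Matching is exact and
    case-sensitive, so NoHSP and NoHsP count as different names."""
    schema = set(schema_fields)
    source = set(source_columns)
    extra = [c for c in source_columns if c not in schema]
    missing = [f for f in schema_fields if f not in source]
    return extra, missing


def format_report(path, extra, missing):
    """Build a report in the same shape as the output in this ticket."""
    warnings = []
    if extra:
        warnings.append(
            f"  {len(extra)} source columns missing from schema: {', '.join(extra)}"
        )
    if missing:
        warnings.append(
            f"  {len(missing)} schema fields missing from source: {', '.join(missing)}"
        )
    lines = [
        f"VALIDATE INPUT SOURCE: {path}",
        f"WARNINGS ENCOUNTERED: {len(warnings)}",
    ]
    return "\n".join(lines + warnings)


if __name__ == "__main__":
    # Placeholder data: GEOID and the data row are illustrative only.
    sample_csv = "GEOID,Age15_24P\n01001,0.13\n"
    schema_fields = ["GEOID", "A15_24P"]  # as if read from a data dictionary

    extra, missing = compare_columns(read_header(sample_csv), schema_fields)
    print(format_report("csv/C_1990.csv", extra, missing))
```

Run as a script, this prints a report in the same shape as the `csv/C_1990.csv` entry above. Because the comparison is case-sensitive, case-only differences such as `NoHSP` vs `NoHsP` or `PacIsP` vs `PacISP` are reported as both an extra column and a missing field.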