Open cernst122 opened 1 year ago
For the current list of 'Confirmed Uploads' in #21, it appears that all of the DCIDs are essentially composed of arrays of mutually exclusive categorical variables, i.e., <prefix>_<category A>_<category B>_..., where each category can have 0 or more variables.
I'm not seeing DCIDs regarding intersections of multiple categories. Does this mean we're summing multiple DCIDs? For instance, if we want to see all genders graphed, I'm assuming we'll make a call for each and sum the counts for the visualization. Is this accurate?
That's right, we won't see intersections across e.g. GenderEnums because they are mutually exclusive. The DC team recommends the series API call on entity country/USA; we'll make separate calls for each separate permutation of variables, as you said.
More info on the limitations of the series API is in #14. I don't think you should run into the bug referenced there because we'll be working entirely with our new dcids.
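The "one call per variable, then sum" approach described above could look roughly like the sketch below. It does not call the DataCommons API: `fetch_series` is a hypothetical callable standing in for whatever wrapper we write around the series call, and the dcids and numbers are placeholders, not real data.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable

Series = Dict[str, float]  # maps an observation date to a value


def sum_series(dcids: Iterable[str], fetch_series: Callable[[str], Series]) -> Series:
    """Fetch one series per dcid and add the values date by date."""
    totals: Dict[str, float] = defaultdict(float)
    for dcid in dcids:
        for date, value in fetch_series(dcid).items():
            totals[date] += value
    return dict(sorted(totals.items()))


# Placeholder dcids and values for illustration only (hypothetical, not real data).
PLACEHOLDER_SERIES: Dict[str, Series] = {
    "Example_Dcid_Female": {"2019": 10.0, "2020": 12.0},
    "Example_Dcid_Male": {"2019": 9.0, "2020": 11.0},
}

# A dict lookup stands in for the real (assumed) API wrapper.
combined = sum_series(PLACEHOLDER_SERIES, PLACEHOLDER_SERIES.__getitem__)
print(combined)
```

Summing like this is only valid because the categories are mutually exclusive; overlapping variables would be double counted.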
Each data set from our final list supports a number of overlapping filters. For example, table 1 supports race, sex, and degree level. Therefore the user may query Total_{EducationalAttainment}_{Gender}_{Race} data where any variable may be null.
We need to associate all possible filter combos with the corresponding DataCommons dcid. Most of the dcids we care about for the dashboard are not yet present in DataCommons; we're adding them as part of this project. The new dcids can be found in the processed_csv tab of a completed sheet (example).
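To get a feel for how many filter combos need a dcid, here is a minimal sketch that enumerates every selection where any variable may be null. The filter names and values are placeholders chosen for illustration; the real lists would come from the processed sheets.

```python
from itertools import product
from typing import Dict, Iterator, List, Optional

# Placeholder filter values (hypothetical, not the project's real lists).
FILTERS: Dict[str, List[str]] = {
    "educationalAttainment": ["BachelorsDegree", "DoctorateDegree"],
    "gender": ["Female", "Male"],
    "race": ["ExampleRace"],
}


def filter_combinations(
    filters: Dict[str, List[str]],
) -> Iterator[Dict[str, Optional[str]]]:
    """Yield every selection, using None for a filter that is left unset."""
    names = sorted(filters)
    options = [[None, *filters[name]] for name in names]
    for choice in product(*options):
        yield dict(zip(names, choice))


combos = list(filter_combinations(FILTERS))
print(len(combos))
```

The count grows as the product of (number of values + 1) over all filters, so a full lookup table stays small for a handful of filters.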
Typically our new variables follow deterministic patterns using English words associated with the variables, e.g. Count_Person_ScienceAndEngineeringRelatedMajor_EducationalAttainmentDoctorateDegree_HispanicOrLatino_Male_Tenured. However, other variables already in DataCommons sometimes use unique identifiers such as dc/t7403chwvspm (Bachelors Degree or Higher, Female, Black or African American Alone). For this reason we may need to map filter selections to hardcoded dcids rather than always dynamically generating dcids based on filter selections.
DataCommons enforces alphabetical order when constructing dcids (e.g. bachelorsDegreeMajor < educationalAttainment < ethnicity < gender < tenureStatus in the example above).
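One way to combine the two approaches is a lookup that checks a hardcoded table first and otherwise builds the dcid from the selected values, ordered alphabetically by property name. This is a sketch under assumptions: the function name, the Count_Person default prefix, and the property/value labels used as keys in the override table are hypothetical; only the two dcids themselves come from the examples above.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Selection = Dict[str, Optional[str]]

# Hardcoded overrides for variables whose dcids do not follow the pattern.
# The property and value labels used as keys here are hypothetical.
HARDCODED_DCIDS: Dict[FrozenSet[Tuple[str, str]], str] = {
    frozenset({
        ("educationalAttainment", "BachelorsDegreeOrHigher"),
        ("gender", "Female"),
        ("race", "BlackOrAfricanAmericanAlone"),
    }): "dc/t7403chwvspm",
}


def resolve_dcid(selection: Selection, prefix: str = "Count_Person") -> str:
    """Return a hardcoded dcid if one exists, else build one from the pattern."""
    chosen = {prop: value for prop, value in selection.items() if value is not None}
    override = HARDCODED_DCIDS.get(frozenset(chosen.items()))
    if override is not None:
        return override
    # Order the value parts alphabetically by property name.
    parts = [chosen[prop] for prop in sorted(chosen)]
    return "_".join([prefix, *parts])


example = resolve_dcid({
    "tenureStatus": "Tenured",
    "gender": "Male",
    "ethnicity": "HispanicOrLatino",
    "educationalAttainment": "EducationalAttainmentDoctorateDegree",
    "bachelorsDegreeMajor": "ScienceAndEngineeringRelatedMajor",
})
print(example)
```

Keeping the overrides in a single table means a dcid that later turns out not to follow the pattern can be fixed without touching the generation logic.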