UrbanInstitute / education-data-package-r

https://urbaninstitute.github.io/education-data-package-r/

Zeros instead of NA #116

Open dcaud opened 2 months ago

dcaud commented 2 months ago

The NCES release notes say that the 2022 CCD doesn't include enrollment by race for LAUSD. I believe the NCES source files code those values as missing (NA or similar). However, this API returns them as zeros. Should they be NA?

See below for a comparison between LAUSD and San Francisco, as an example of values that I think should be missing rather than zero.

library(educationdata)
library(dplyr)  # provides %>%, filter(), and summarize() used below
ca.3rd.grade.race <- get_education_data(level = "schools",
                           source = "ccd",
                           topic = "enrollment",
                           filters = list(year = 2022, grade = 3,
                                          fips = 6),
                           subtopic = list("race"))

# LAUSD
ca.3rd.grade.race %>%
  filter(leaid == "0622710") %>%
  filter(race != 99) %>%
  summarize(enrollment.sum = sum(enrollment, na.rm = TRUE),
            num.na = sum(is.na(enrollment), na.rm = TRUE))

# San Francisco
ca.3rd.grade.race %>%
  filter(leaid == "0634410") %>%
  filter(race != 99) %>%
  summarize(enrollment.sum = sum(enrollment, na.rm = TRUE),
            num.na = sum(is.na(enrollment), na.rm = TRUE))
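
One way to tell suspicious zeros apart from real ones is to check whether every race-specific row for a district is zero while the total row (race == 99) is not. The sketch below runs on a small made-up data frame rather than the API result, so the enrollment values are illustrative only. The recoding step rests on the assumption that an all-zero district means "not reported"; a zero for a single race category can be a genuine count.

```r
library(dplyr)

# Toy stand-in for the API result; enrollment values are invented for illustration.
toy <- data.frame(
  leaid      = c(rep("0622710", 4), rep("0634410", 4)),
  race       = c(1, 2, 3, 99, 1, 2, 3, 99),
  enrollment = c(0, 0, 0, 120, 30, 0, 45, 75)
)

# Per district: count zero and NA rows among the race-specific records.
zero.check <- toy %>%
  filter(race != 99) %>%
  group_by(leaid) %>%
  summarize(n.rows   = n(),
            n.zero   = sum(enrollment == 0, na.rm = TRUE),
            n.na     = sum(is.na(enrollment)),
            all.zero = n.zero == n.rows)

# Assumption: a district whose race-specific rows are all zero did not report by race.
suspect <- zero.check %>% filter(all.zero) %>% pull(leaid)

# Recode only the race-specific rows of those districts; totals are left alone.
toy.recoded <- toy %>%
  mutate(enrollment = if_else(leaid %in% suspect & race != 99,
                              NA_real_, enrollment))
```

Applied to the real data, the same check would be run per school (grouping by ncessch as well) before deciding whether to recode anything.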

erika-tyagi commented 1 month ago

@dcaud - thanks for the note! Flagging this for @JCarterUI and @LRURBAN.