UrbanInstitute / education-data-package-r

https://urbaninstitute.github.io/education-data-package-r/

Wrong lat/lon for some schools #112

Closed dcaud closed 8 months ago

dcaud commented 8 months ago

The lat/lon is incorrect in the 2004 data for QUEEN CITY ACADEMY CHARTER SCHOOL in NJ.

queenAnn2004 <- get_education_data(level = "schools",
                                   source = "ccd",
                                   topic = "directory",
                                   filters = list(year = 2004,
                                                  ncessch = '340006100378'),
                                   add_labels = TRUE)
queenAnn2004$longitude
queenAnn2005 <- get_education_data(level = "schools",
                                   source = "ccd",
                                   topic = "directory",
                                   filters = list(year = 2005,
                                                  ncessch = '340006100378'),
                                   add_labels = TRUE)
queenAnn2005$longitude

It is correct in 2005 and for most other years.

I believe this is not the only school that has incorrect lat/lon, but I haven't identified a pattern yet.

Is this a problem with the source data from the Dept of Ed? Can it be fixed?

erika-tyagi commented 8 months ago

Hi @dcaud - thanks for flagging! I'll report this back to the team responsible for geocoding and have them circle back with more information.

LRURBAN commented 8 months ago

Hi @dcaud

I think this is an issue with the underlying CCD data. I checked the raw data file and saw that it matches the incorrect address you flagged. I don't think we currently have plans to overwrite this error in this endpoint (CCD directory files), but we do have geo-coded coordinates for all schools in the NHGIS endpoint, which you can call in R using the following command:

library(educationdata)
data <- get_education_data(level = "schools",
                           source = "nhgis",
                           topic = "census-2010",
                           filters = list(ncessch = "340006100378"))

This call returns a suite of geo-matched information (census place, region, etc.) and, most importantly, includes both the CCD-reported longitude/latitude and our geo-coded longitude and latitude (geo_longitude and geo_latitude, respectively). You should be able to select the variables of interest and merge 1:1 using ncessch and year.
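The 1:1 merge on ncessch and year could look roughly like the sketch below. It is only an illustration under stated assumptions: `attach_geocoded` is a hypothetical helper (not part of educationdata), the toy data frames stand in for the results of the two `get_education_data()` calls, and the coordinate values are made-up placeholders rather than real school locations.

```r
# Hypothetical helper (not part of educationdata): attach the geo-coded
# coordinates from the NHGIS result to the CCD directory rows.
attach_geocoded <- function(ccd, nhgis) {
  keys <- c("ncessch", "year")
  geo <- nhgis[, c(keys, "geo_longitude", "geo_latitude")]
  # A 1:1 merge needs each (ncessch, year) pair at most once on each side.
  stopifnot(!anyDuplicated(ccd[, keys]), !anyDuplicated(geo[, keys]))
  # Left join: keep every CCD row, add geo-coded columns where available.
  merge(ccd, geo, by = keys, all.x = TRUE)
}

# Toy stand-ins for the two API results; all coordinates are placeholders.
ccd <- data.frame(
  ncessch = c("340006100378", "340006100378"),
  year = c(2004L, 2005L),
  longitude = c(1, 2),
  latitude = c(10, 20),
  stringsAsFactors = FALSE
)
nhgis <- data.frame(
  ncessch = c("340006100378", "340006100378"),
  year = c(2004L, 2005L),
  geo_longitude = c(-3, -4),
  geo_latitude = c(30, 40),
  stringsAsFactors = FALSE
)

merged <- attach_geocoded(ccd, nhgis)
```

With real data, `ccd` and `nhgis` would be the data frames returned by the directory and NHGIS calls. The duplicate check is a design choice: it fails loudly if either side has repeated keys, rather than silently multiplying rows in the merge.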

dcaud commented 8 months ago

Thanks for checking the raw data! And thanks for offering an alternative. At least for this school, those geo-codes are better!

LRURBAN commented 8 months ago

Can we close this issue?