UrbanInstitute / education-data-package-stata

MIT License

ncessch_num in Stata package #57

Closed: mchingos closed this issue 4 years ago

mchingos commented 5 years ago

For CCD school directory data, the Stata package loads the ncessch_num variable by default, and that variable is always missing.

grahamimac commented 5 years ago

Thanks for finding this, Matt. I can view it in the API here as an example: https://educationdata.urban.org/api/v1/schools/ccd/directory/1988/. It looks like ncessch_num exists and is a float/double in the API (the values end with .0), while it is marked with an integer data_type in the varlist: https://educationdata.urban.org/api/v1/api-endpoint-varlist/?endpoint_id=24.

@ddorio and @VivianSihanZHENG can you look into this for the next release if you have a chance, and see if it's something we need to add to our automated checks, or already have in place?
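
For illustration, an automated check along these lines could compare the declared data_type of each variable against the values actually returned. This is only a minimal sketch: the function name `find_type_mismatches`, the shape of the `declared` mapping, and the sample records are all hypothetical and are not taken from the real API or varlist responses.

```python
def find_type_mismatches(declared_types, records):
    """Return sorted names of variables declared 'integer' but observed as floats.

    declared_types: dict mapping variable name -> declared data_type string
                    (assumed shape, not the real varlist format).
    records:        list of dicts, one per row returned by an endpoint.
    """
    mismatched = set()
    for record in records:
        for name, value in record.items():
            if declared_types.get(name) != "integer":
                continue
            # Missing values (None) are skipped; only genuine floats are flagged.
            if isinstance(value, float):
                mismatched.add(name)
    return sorted(mismatched)


# Hypothetical sample data, not real API output.
declared = {"ncessch_num": "integer", "year": "integer", "school_name": "string"}
sample = [
    {"ncessch_num": 10000200277.0, "year": 1988, "school_name": "Example School"},
    {"ncessch_num": None, "year": 1988, "school_name": "Another School"},
]

print(find_type_mismatches(declared, sample))
```

A check like this only flags the disagreement; it does not say whether the fix belongs in the source file, the transform step, or the varlist.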

ddorio commented 5 years ago

@VivianSihanZHENG @grahamimac @mchingos @ericatheresa

It is stored as a double in the database, so it's either coming over from Stata as a float or Python is translating it into one. I'll change the API so it's always formatted as an integer, but we should make sure it's stored that way as well. I'll move this issue over to the API issue queue as well.

ddorio commented 5 years ago

@VivianSihanZHENG @grahamimac @mchingos @ericatheresa

I just checked this, and it is a double in the Stata file. We can transform it with Python and set the endpoint to display it as an integer, but it should probably be changed to an integer in the Stata file too. Also, I think year is a double in the Stata file as well. I think our goal should be to make sure the Stata files (i.e., the source) have the types we want the data to have, so that the Python and API code can ensure the data is displayed that way.
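
As a rough sketch of the Python-side transform described above, the snippet below converts integer-valued doubles to integers and refuses to silently truncate anything else. The helper name `coerce_integer_fields` and the sample record are hypothetical; this is not the project's actual pipeline code.

```python
import math


def coerce_integer_fields(record, integer_fields):
    """Return a copy of record with the named fields converted from float to int.

    Missing values (None or NaN) become None. A float with a fractional part
    raises ValueError rather than being truncated.
    """
    converted = dict(record)
    for name in integer_fields:
        value = converted.get(name)
        if value is None:
            continue
        if isinstance(value, float):
            if math.isnan(value):
                converted[name] = None
                continue
            if not value.is_integer():
                raise ValueError(f"{name}={value!r} is not integer-valued")
            converted[name] = int(value)
    return converted


# Hypothetical record, mimicking a double-typed ID and year.
row = {"ncessch_num": 10000200277.0, "year": 1988.0, "school_name": "Example School"}
clean = coerce_integer_fields(row, ["ncessch_num", "year"])
print(clean)
```

Converting at this layer only changes how the values are displayed; the source file would still hold doubles unless it is changed as well.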

grahamimac commented 5 years ago

Note that the updates to the Stata package and the recent update to switch from double to int did not fix this issue. Marking as open to fix in the next release of the Stata package.

VivianSihanZHENG commented 4 years ago

Resolved by changes made in the new release of the package (version 0.3.6).