Closed by ericatheresa 5 years ago
Thanks @ericatheresa. Tagging @VivianSihanZHENG to see if she can replicate. Three questions:
1) Do you know in which versions of the package these errors occurred?
2) Were you able to replicate these errors?
3) How many schools do we expect the second one to return if it's correct?
Thanks! Vivian, please test these cases in the meantime and look into the issues.
Hi @ericatheresa @grahamimac, I tested in version 0.3.1 of the Stata package (the latest version), and I was able to return the results without errors. The command
educationdata using "school ccd enrollment race", sub(grade=99 year=2015) csv
returns 136080 observations with 5670 unique ncessch values.
I think we caught this issue when running tests for the previous release, and Graham fixed it in version 0.3.1. I would recommend updating the package to the latest version and running the examples again. Thanks!!
Hi @VivianSihanZHENG -- that's what I get as well. The call should return about 100,000 schools and millions of observations. It seems to only be pulling the first five states, and quits halfway through California.
@VivianSihanZHENG I updated the package and that fixed the scorecard example (thanks!), but not the ccd example.
Hi @ericatheresa, I checked the API at https://educationdata.urban.org/api/v1/schools/ccd/enrollment/2015/grade-99/race/, and it returns 774488 records. Does this record count look correct to you?
@VivianSihanZHENG Yep, that sounds about right!
Hi @ericatheresa, I checked the original CSV file, filtering with
filter(grade==99, year==2015, sex==99)
and it returned 774488 records with 96811 unique ncessch IDs, which matches what the API returns. So I believe the underlying CSV file is correct. I will check the Stata package programs next. @grahamimac I will get back to you ASAP!
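For anyone who wants to repeat the same comparison from inside Stata, here is a minimal sketch. It assumes the educationdata package is installed and up to date; the helper variable name first_row is made up for illustration. The distinct-school count it prints can be compared with the 96811 unique ncessch IDs found in the source CSV.

```stata
* Sketch only: assumes the educationdata package is installed and current.
clear
educationdata using "school ccd enrollment race", sub(grade=99 year=2015) csv

* Total observations returned by the call
count

* Distinct schools: flag exactly one row per ncessch, then count the flags
* (first_row is a hypothetical helper variable, not part of the data)
egen byte first_row = tag(ncessch)
count if first_row == 1
drop first_row
```

A distinct-school count far below the CSV figure would point to a truncated download rather than a problem in the underlying data.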
Hi @ericatheresa, I found that the CSV file in the S3 bucket (where the Stata package's csv option grabs the data) was incomplete, which caused the Stata package to return fewer rows. The issue was resolved once I re-uploaded the CSV file to S3. I think I was looking at the wrong bucket earlier. Sorry about the confusion! You can test it in Stata now. Please let me know if you see anything else!
The other thing I noticed is that the results from
educationdata using "school ccd enrollment race", sub(grade=99 year=2015) csv
still keep all values for the variable sex. Does this look correct to you?
Thanks!!
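A quick way to see which values of sex come back is to tabulate the variable after the download. This is only a sketch: it assumes the package is installed, and it assumes that sex == 99 is the "total" category, which is suggested by the CSV filter quoted earlier in the thread but not confirmed here.

```stata
* Sketch only: assumes the educationdata package is installed and current.
clear
educationdata using "school ccd enrollment race", sub(grade=99 year=2015) csv

* Show every value of sex present in the result, including missing values
tabulate sex, missing

* Assumption: sex == 99 is the "total" category; count those rows so the
* figure can be compared with the record count reported by the API
count if sex == 99
```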
Thanks, @VivianSihanZHENG! Works great now. Following up separately on issue #2.
A few people have recently encountered (possibly unrelated) errors using the csv option. @grahamimac @ddorio
One example is: educationdata using "college scorecard student-characteristics aid-applicants", sub(year=2014) csv
Error message:
variable count_total_FAFSA_applicants not found
stata(): 3598 Stata returned error
labelcsv(): - function returned error
downloadcsv(): - function returned error
getalldata(): - function returned error