barronh / pyrsig

Python interface to RSIG Web API
GNU General Public License v3.0

FAQSD Data for Census Tracts #6

Open paigea58 opened 1 month ago

paigea58 commented 1 month ago

Hi, I compared the faqsd.pm25_daily_average 2011 data (downloaded from https://www.epa.gov/hesc/rsig-related-downloadable-data-files#input) to the data retrieved from the API. I noticed that the API data omits the last digit of the census tract IDs, whereas the website data includes it. For example, census tract 4820124110 is subdivided into 48201241101, 48201241102, and 48201241103. The website data has values for each of these; the API data does not. How can I use the API to get these values?

Code used:

import pyrsig

years = range(2011, 2012)
dataframes = []

# Bounding box in degrees: west, south, east, north
w_lon = -106.6
s_lat = 25.8
e_lon = -93.5
n_lat = 36.5

name = 'faqsd.pm25_daily_average'

for year in years:
    bdate = f'{year}-01-01'
    edate = f'{year}-12-31'
    rsigapi = pyrsig.RsigApi(bdate=bdate, edate=edate, bbox=(w_lon, s_lat, e_lon, n_lat))
    df = rsigapi.to_dataframe(name, withmeta=True)
    dataframes.append(df)

barronh commented 4 weeks ago

Thanks for posting the issue. This is an RSIG issue that you're noticing through pyrsig. Querying the RSIG URL API directly shows the same behavior, which demonstrates that it is independent of pyrsig.[1]

It looks like the FIPS code is being both truncated and used as an ID. Across the whole US, the downloaded files contain 72,283 unique FIPS prediction rows per day, whereas the RSIG API coverage returns just 47,455 prediction rows per day. Interestingly, 47,455 is also the number of unique FIPS codes in the downloaded files if you first truncate them to 9 characters.
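To illustrate how truncation collapses distinct tracts into one ID, here is a minimal sketch. It uses the three tract IDs from the original post plus one hypothetical ID (marked in the code) and simply counts unique values before and after cutting each ID to 9 characters. It is not the RSIG implementation, only a toy version of the comparison described above.

```python
# Tract IDs from the original post, plus one hypothetical ID added for contrast.
fips_ids = [
    "48201241101",
    "48201241102",
    "48201241103",
    "48201241200",  # hypothetical, not taken from the data
]

# Count unique IDs as they appear in the downloaded files.
n_full = len(set(fips_ids))

# Truncate each ID to its first 9 characters, then count unique values again.
truncated_ids = {fips[:9] for fips in fips_ids}
n_truncated = len(truncated_ids)

print(n_full, n_truncated, sorted(truncated_ids))
```

Applied to a full day of the downloaded files, the same two counts would be the ones to compare against 72,283 and 47,455.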

I am checking with the RSIG developer and then will get back to you with a resolution when available.

[1] https://ofmpub.epa.gov/rsig/rsigserver?SERVICE=wcs&VERSION=1.0.0&REQUEST=GetCoverage&FORMAT=ascii&TIME=2011-01-01T00:00:00Z/2011-01-01T23:59:59Z&BBOX=-106.6,25.8,-93.5,36.5&COVERAGE=faqsd.pm25_daily_average&COMPRESS=0
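For anyone who wants to vary the request in [1], the sketch below rebuilds that same URL from its query parameters using only the Python standard library. The parameter names and values are copied from the URL above; the variable names are just illustrative, and no request is actually sent.

```python
from urllib.parse import urlencode

base_url = "https://ofmpub.epa.gov/rsig/rsigserver"

# Query parameters exactly as they appear in the URL above.
params = {
    "SERVICE": "wcs",
    "VERSION": "1.0.0",
    "REQUEST": "GetCoverage",
    "FORMAT": "ascii",
    "TIME": "2011-01-01T00:00:00Z/2011-01-01T23:59:59Z",
    "BBOX": "-106.6,25.8,-93.5,36.5",
    "COVERAGE": "faqsd.pm25_daily_average",
    "COMPRESS": "0",
}

# Keep ':', '/' and ',' unescaped so the result matches the URL as written.
url = base_url + "?" + urlencode(params, safe=":/,")
print(url)
```

Changing `TIME` or `BBOX` gives the equivalent request for another day or region.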

barronh commented 3 weeks ago

The expert is on the case. It has been tracked down to a data type issue (32-bit int vs 64-bit int). The fix has to be thoroughly tested for impacts on other parts of the system before it is rolled out. I'll keep you posted.
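As a quick check of why the integer width matters, the sketch below tests whether the 11-digit tract IDs from the original post fit in a signed 32-bit or 64-bit integer. It only demonstrates the range limits; how RSIG stores or truncates the values internally is not shown here and is not assumed.

```python
# Largest values representable by signed 32-bit and 64-bit integers.
INT32_MAX = 2**31 - 1   # 2,147,483,647 (10 digits)
INT64_MAX = 2**63 - 1

# 11-digit census tract IDs from the original post.
tract_ids = [48201241101, 48201241102, 48201241103]


def fits(value, max_value):
    """Return True if value lies within the signed range ending at max_value."""
    return -max_value - 1 <= value <= max_value


fits_int32 = [fits(t, INT32_MAX) for t in tract_ids]
fits_int64 = [fits(t, INT64_MAX) for t in tract_ids]

print(fits_int32, fits_int64)
```

Any 11-digit ID exceeds the 32-bit limit, so a 64-bit type (or a string) is needed to keep the full code.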

barronh commented 1 week ago

Sorry about the wait... For now, you should just use the downloaded files. The RSIG update may take a bit. If you download the file, you can use it along with other pyrsig data by using geopandas to open it.
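As a rough sketch of working from a downloaded file in the meantime, the example below reads CSV rows with the standard library, keeps the FIPS code as a string so no digits are lost, and filters to the bounding box from the original post. The column names (`Date`, `FIPS`, `Longitude`, `Latitude`, `pm25`) and all sample values are assumptions for illustration; check the header of the actual file. The filtered rows could then be passed to pandas or geopandas.

```python
import csv
import io

# Stand-in for a downloaded file. Header names, coordinates, and values
# are illustrative assumptions, not real data.
sample = io.StringIO(
    "Date,FIPS,Longitude,Latitude,pm25\n"
    "2011-01-01,48201241101,-95.50,29.90,8.1\n"
    "2011-01-01,48201241102,-95.51,29.91,8.3\n"
    "2011-01-01,48201241103,-95.52,29.92,8.0\n"
    "2011-01-01,06037101110,-118.29,34.26,10.2\n"
)

# Bounding box from the original post: west, south, east, north.
w_lon, s_lat, e_lon, n_lat = -106.6, 25.8, -93.5, 36.5

rows = []
for row in csv.DictReader(sample):
    # csv yields strings, so FIPS keeps all 11 characters and any leading zero.
    lon, lat = float(row["Longitude"]), float(row["Latitude"])
    if w_lon <= lon <= e_lon and s_lat <= lat <= n_lat:
        rows.append(row)

fips_in_box = sorted({row["FIPS"] for row in rows})
print(fips_in_box)
```

For a real file, replace `sample` with `open(path, newline="")`, where `path` points to the downloaded file.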