OCHA-DAP / ocha-anticipy

Python package to support the development of anticipatory action frameworks
https://github.com/OCHA-DAP/ocha-anticipy
GNU General Public License v3.0

Longitude Offset in CHIRPS Data #158

Open PaulineNimo opened 1 year ago

PaulineNimo commented 1 year ago

@castledan, as we had discussed.

When working with the monthly CHIRPS data in the NetCDF files, the longitude values have an offset of 0.025. For example, in the Ethiopia file, the cell centroids should be at x.x25/x.x75, but they are rounded down to something like x.x5.
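A quick way to check which pattern a file follows is to measure how far each coordinate sits from the nearest multiple of the grid step. This is only a minimal sketch in plain Python: the helper name `offset_from_step` and the sample longitudes are made up for illustration, and the 0.05 step is an assumption taken from the values discussed here.

```python
def offset_from_step(coords, step=0.05):
    """Distance of each coordinate from the nearest multiple of `step`.

    A result near step / 2 (0.025 here) means the centroids sit at
    x.x25/x.x75; a result near 0 means they sit at x.x0/x.x5.
    """
    offsets = []
    for value in coords:
        cells = value / step
        # Fractional distance to the nearest whole number of steps.
        frac = abs(cells - round(cells))
        offsets.append(round(frac * step, 3))
    return offsets


# Hypothetical longitudes, not read from any CHIRPS file.
sample_lon = [38.725, 38.75]
offsets = offset_from_step(sample_lon)
print(offsets)
```

With real data, the coordinate values would come from the loaded NetCDF file rather than a hard-coded list.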

Also, when naming the variables in the NetCDF files, would it be advisable to use a single name for precipitation? The monthly files use precipitation while the daily files use prcp.

caldwellst commented 1 year ago

Nice catches Pauline. Regarding the precipitation names, definitely think it would be nice to have coherent variable names across the module!

castledan commented 1 year ago

Nice catch Pauline, I'll check this soon.

turnerm commented 1 year ago

Thanks for flagging this Pauline. @castledan do you think this could be due to querying the API with incorrectly rounded coordinates? I had the same problem with GloFAS.

caldwellst commented 1 year ago

Okay, so I've investigated this, and I don't think there is any issue. According to the documentation, the monthly Y coordinates don't actually have centroids at x.x25/x.x75; they fall at every 0.05 instead.

image

That's in contrast to the X coordinates for the monthly data, and both the X and Y coordinates for daily.

image

Which, to be honest, is quite confusing. I did notice that there are floating point differences (seemingly coming from the raw download) in the coordinates of both datasets, but I'm not sure whether this is something we need to worry about. See below the monthly floating point errors for the X coordinate:

image

And daily floating point errors for the Y coordinate (none in the X):

image

I generated this just by running some of the examples. I explored the code below with different formulations of the API call, including rounding and adjusting the coordinates, and the returned data was robust to the different attempts (such as passing the exact centroids). If we wanted to fix the floating point differences, we would just need to round the coordinates during processing. However, I've not used this before, so I'm wondering whether something could go wrong or whether rounding might otherwise be undesirable. Thoughts?

from ochanticipy import create_country_config, CodAB, GeoBoundingBox
from ochanticipy import ChirpsDaily, ChirpsMonthly
import datetime

country_config = create_country_config(iso3="bfa")
codab = CodAB(country_config=country_config)
codab.download()
admin0 = codab.load(admin_level=0)
geo_bounding_box = GeoBoundingBox.from_shape(admin0)

start_date = datetime.date(year=2007, month=10, day=23)
end_date = datetime.date(year=2007, month=10, day=24)
chirps_daily = ChirpsDaily(
    country_config=country_config,
    geo_bounding_box=geo_bounding_box,
    start_date=start_date,
    end_date=end_date,
)
chirps_daily.download()
chirps_daily.process()
chirps_da = chirps_daily.load()
chirps_da
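
For reference, rounding the coordinates during processing could look roughly like the sketch below. It is plain Python rather than anything from the package: `snap_to_grid`, the grid origin, the step, and the sample longitudes are all hypothetical values chosen for illustration.

```python
def snap_to_grid(coords, origin, step, decimals=3):
    """Snap each coordinate to the nearest nominal grid centre.

    `origin` is the assumed first cell centre and `step` the assumed
    grid spacing; both are illustrative, not read from CHIRPS metadata.
    """
    snapped = []
    for value in coords:
        # Index of the nearest nominal cell centre.
        index = round((value - origin) / step)
        snapped.append(round(origin + index * step, decimals))
    return snapped


# Made-up longitudes carrying float32-style noise.
raw_lon = [-5.474999904632568, -5.425000190734863, -5.375]
clean_lon = snap_to_grid(raw_lon, origin=-5.475, step=0.05)
print(clean_lon)
```

Snapping to a known grid rather than only rounding to a fixed number of decimals keeps the spacing regular, but it relies on the assumed origin and step being right, so a wrong assumption would silently move cells.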

caldwellst commented 1 year ago

Note that, as far as I can tell, the variable names are already taken care of in both the ChirpsDaily._process() and ChirpsMonthly._process() methods, so I can't reproduce that issue; see here and here.

PaulineNimo commented 1 year ago

Thanks for checking. I don't think the floating point differences are an issue, and they can easily be rounded up/down during processing if need be. The offset would be an issue for point coordinates: when assigning cells to points with that offset, there could be a shift in the cells assigned. It can also affect which cells are included during aggregation for administrative areas. I will check with this source, just to make sure that the monthly data has the same shift.
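
To illustrate the point-assignment concern, here is a minimal sketch in plain Python. The helper `nearest_cell_index`, the grid origins, and the point longitude are hypothetical; the only thing taken from this thread is the 0.025 offset on a 0.05 grid.

```python
def nearest_cell_index(point, first_centre, step=0.05):
    """Index of the grid cell whose centre is nearest to `point`."""
    return round((point - first_centre) / step)


# Hypothetical point longitude and first cell centre.
point_lon = 38.76
true_first_centre = 33.025                          # centres at x.x25/x.x75
labelled_first_centre = true_first_centre + 0.025   # labels shifted by 0.025

true_index = nearest_cell_index(point_lon, true_first_centre)
labelled_index = nearest_cell_index(point_lon, labelled_first_centre)

# The same point is matched to a different cell once the labels shift.
print(true_index, labelled_index)
```

Under these assumed values the point lands in the neighbouring cell when the coordinate labels are shifted, which is the kind of mismatch that could also change which cells fall inside an administrative boundary.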

castledan commented 1 year ago

Thank you for checking this @caldwellst, I've been wanting to do it forever. @PaulineNimo, let me know if you still find something wrong, both in the coordinates and in the variable names.