PyPSA / atlite

Atlite: A Lightweight Python Package for Calculating Renewable Power Potentials and Time Series
https://atlite.readthedocs.io

Add heuristic for ERA5 download chunk sizes #252

Open euronion opened 1 year ago

euronion commented 1 year ago

ERA5 cutouts are currently downloaded as time=yearly slices (after #236: as time=monthly slices) to avoid requesting overly large pieces of data from the ERA5 backend. Monthly retrieval could, in theory, slow down cutout preparation. We could employ a heuristic that checks the request size and then decides, based on that size, whether to use monthly or yearly retrieval.

See discussion here: https://github.com/PyPSA/atlite/issues/236#issuecomment-1185474539

johhartm commented 6 months ago

@euronion I stumbled upon the same issue and adapted the timeframe to optimize for my use case (small cutout but long timeframe). I added a heuristic that makes the requests as large as possible while staying within the 120.000 fields limit. However, I don't know how to account for the size limit with cutouts for large areas. If someone could help me with this information, I might be able to implement this feature.

euronion commented 6 months ago

Hi @johhartm, thanks for the initiative. I would assume that estimating the number of fields through

resolution * range latitude * range longitude * number of time steps * variables within the request

should be good for a heuristic.
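A rough Python sketch of that estimate could look like the following. The function name and parameters are hypothetical and not part of atlite. The 0.25° default assumes the regular ERA5 grid, and "resolution" is read here as grid spacing, so each range is divided by it to get a number of grid points.

```python
def estimate_request_points(
    lat_range_deg,
    lon_range_deg,
    n_time_steps,
    n_variables,
    resolution_deg=0.25,  # assumption: regular 0.25 degree ERA5 grid
):
    """Estimate the number of values in one request (hypothetical helper)."""
    # Grid points along each axis, counting both edges of the range.
    n_lat = round(lat_range_deg / resolution_deg) + 1
    n_lon = round(lon_range_deg / resolution_deg) + 1
    # Spatial points times time steps times requested variables.
    return n_lat * n_lon * n_time_steps * n_variables


# Example: a 10 x 20 degree cutout, one day of hourly data, two variables.
print(estimate_request_points(10, 20, n_time_steps=24, n_variables=2))
```

Whether this count is what the backend calls a "field" is not confirmed; it only gives a relative measure for comparing candidate request sizes.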

Where did you get the 120.000 fields number from? It is the first time I have heard a concrete number, and it seems a bit small, but that might depend on the definition of what a "field" is.

johhartm commented 6 months ago

I got this number from experimenting with larger requests and having them fail with an error message saying that the request was too large and that the maximum request size is 120.000 fields. For me, the heuristic number of time steps * variables within the request worked, but I only downloaded data for a fairly limited spatial extent. However, I am starting to think that the spatial extent does not affect the "field size", but that it might still need to be taken into account to prevent the file size per request from getting too large. I will test this hypothesis with some larger cutouts and will get back when I have some results.
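A minimal sketch of the number of time steps * variables heuristic, assuming hourly data and taking the 120.000 fields limit from the error message at face value. All names are hypothetical and not part of atlite, and the spatial extent is deliberately ignored because its effect is still untested.

```python
MAX_FIELDS = 120_000  # assumption: limit as reported in the error message

# Worst-case number of hourly time steps per candidate chunk size.
HOURS_PER_CHUNK = {
    "year": 366 * 24,   # leap year
    "month": 31 * 24,   # longest month
    "day": 24,
}


def choose_time_chunk(n_variables, max_fields=MAX_FIELDS):
    """Return the largest time chunk whose field count stays within the limit."""
    for chunk in ("year", "month", "day"):
        # fields = number of time steps * variables within the request
        if HOURS_PER_CHUNK[chunk] * n_variables <= max_fields:
            return chunk
    raise ValueError("Even a single day exceeds the field limit.")


print(choose_time_chunk(5))    # few variables: yearly requests fit
print(choose_time_chunk(14))   # more variables: fall back to monthly
```

If the spatial extent turns out to matter for the file size, a second check on the estimated file size could be added before accepting a chunk.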