PyPSA / atlite

atlite: A Lightweight Python Package for Calculating Renewable Power Potentials and Time Series
https://atlite.readthedocs.io
MIT License

Weather/climate data variable descriptions (for alternate model data use) #315

Open lfreese opened 1 year ago

lfreese commented 1 year ago

It would be helpful to have a more detailed description of the weather and climate data variables (in addition to the table here): [screenshot of the variable table, 2023-08-09]

Detailed Description and Context

I am trying to use CESM data with atlite. I noticed that the development page says the capacity to extend atlite beyond ERA5 is limited at the moment. Could more detail be provided on the table shown above: what each of these variables is used for, or a longer description of each? I ask because CESM uses different variable names. The ones I am trying to translate where the correspondence is unclear are:

  1. influx_direct could be a near IR direct solar flux or a visible direct solar flux
  2. influx_diffuse could be a near IR or a visible solar flux
  3. runoff could be a surface runoff or total liquid runoff
  4. height I am assuming to be the geopotential height in meters
  5. albedo could be a near IR or UV
  6. wnd100m I am using both U and V (zonal and meridional) winds

Additionally, how important is the time scale? I have some 3-hr, some 6-hr, and some daily data. My impression is that this will only influence the time scale for which I can get data output.

If you have any other advice on the attempt to use other weather data, I would certainly appreciate it, and am happy to discuss further!

euronion commented 1 year ago

Hi there!

Documentation of the variables we use from ERA5 can be found here: https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation

That is only half the truth, though, because we have to use harmonised variable names between datasets. For example, influx_direct in atlite refers to fdir ([Total sky direct solar radiation at surface](https://apps.ecmwf.int/codes/grib/param-db?id=228021)) in the ERA5 documentation.

Downloading and renaming of the variables happens in the dataset module for era5: https://github.com/PyPSA/atlite/blob/master/atlite/datasets/era5.py
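To illustrate what "renaming" means here, a minimal sketch of a name mapping applied to raw data. Only the fdir to influx_direct pair comes from this thread; the `harmonise` helper, the `RENAME` table and the sample values are hypothetical and are not atlite's actual code, which lives in the era5.py module linked above.

```python
# Hypothetical sketch: map dataset-specific names to harmonised names.
# Only "fdir" -> "influx_direct" is taken from this thread; a mapping for
# another dataset (e.g. CESM) would need its own, verified entries.
RENAME = {"fdir": "influx_direct"}


def harmonise(raw, rename):
    """Return a copy of `raw` with keys renamed; unmapped names are kept."""
    return {rename.get(name, name): values for name, values in raw.items()}


# Made-up sample values, purely for illustration.
raw = {"fdir": [0.0, 120.5, 310.2], "other_var": [1, 2, 3]}
harmonised = harmonise(raw, RENAME)
print(sorted(harmonised))
```

For a different dataset, the same idea applies with a different mapping table, once it is clear which source variable matches each harmonised name.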

You'll have to work your way through there, we don't have any other documentation right now. When working your way through the variables you will also notice that e.g. wnd100m is an aggregate from two ERA5 variables, namely the v100 and u100 components (like your CESM data :).
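A small sketch of combining the two wind components, assuming the aggregate is the magnitude of the horizontal wind vector (the usual way to get a wind speed from zonal and meridional components). The `wind_speed` helper and the sample values are illustrative only; check era5.py for what atlite actually does.

```python
import math


def wind_speed(u, v):
    """Element-wise wind speed sqrt(u**2 + v**2) from zonal (u) and meridional (v) components."""
    return [math.hypot(ui, vi) for ui, vi in zip(u, v)]


# Made-up component values in m/s, purely for illustration.
u100 = [3.0, -6.0, 0.0]
v100 = [4.0, 8.0, -2.5]
wnd100m = wind_speed(u100, v100)
print(wnd100m)
```

The sign of each component drops out, so only the speed is kept, not the direction.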

Time scale: I also think it shouldn't matter much for cutout creation and should only limit the resolution of the output you can get (emphasis on shouldn't: in principle, it is possible to create datasets with lower time resolution).
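If the inputs come at mixed resolutions (say 3-hourly and 6-hourly), one option is to bring them to a common, coarser step first. A minimal sketch, assuming plain block averaging is appropriate for the variable in question; accumulated quantities would more likely need sums instead. The `block_mean` helper is hypothetical and not part of atlite.

```python
def block_mean(values, factor):
    """Average consecutive blocks of `factor` samples (e.g. 3-hourly -> 6-hourly with factor=2)."""
    if factor < 1 or len(values) % factor != 0:
        raise ValueError("length of values must be a multiple of factor")
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values), factor)
    ]


# One made-up day of 3-hourly samples (8 values) reduced to 6-hourly (4 values).
three_hourly = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
six_hourly = block_mean(three_hourly, 2)
print(six_hourly)
```

Going the other way (daily to 6-hourly) would require interpolation and adds no real information, which is why the coarsest input tends to set the usable output resolution.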

Btw.: if you take notes while working your way through all the variables, maybe you can share them here so we can include them in the documentation? That would be fantastic!

I'm curious to learn more about the CESM dataset: Can you share some more information on it, like coverage (time, space) and what it offers that ERA5/SARAH don't offer, what it can do better than the other two?

lfreese commented 1 year ago

Thank you so much! This is exactly the type of information I was hoping for. I figured it might require a bit of digging, but couldn't quite pin down where to dig through, so this is appreciated.

I'll be documenting how I go through this, and eventually can share the scripts I create for processing the data to prep it for use in atlite.

CESM is a climate model that provides future projections rather than a reanalysis of historical data, which is the fundamental difference between the two. It has 1-degree spatial resolution, and the relevant variables seem to range from 6-hourly to daily resolution.