atrisovic / weather-panel.github.io

A practical guide to climate econometrics: Navigating key decision points in weather and climate data analysis.
https://climateestimate.net
Creative Commons Attribution Share Alike 4.0 International

JOSE Review - comments on Weather and Climate Data Chapter #77

Open kls2177 opened 1 year ago

kls2177 commented 1 year ago

Overall, very thorough but not too overwhelming. I appreciate the learning objectives at the outset and I feel that they align well with the content provided. I also appreciate that the code snippets are provided in several languages.

In the Introduction, the authors use a visual to motivate the section and engage students. I really liked this “engagement trigger” approach. Do you think a similar approach could be used to start off all chapters? For this chapter there are many different options for motivating visuals. One example could be a time series of the volume of weather and climate data available (e.g. from the NASA Earth Science and Data Systems: https://www.earthdata.nasa.gov/s3fs-public/2023-01/product-distribution-volume-discipline-2.jpg?VersionId=Tor97BJIz5dyuZofS5swA7RGwdccByVe ). This is just a suggestion.

Another general note about variability. When I have worked with students who are unfamiliar with weather and climate data, they are often surprised at how noisy the data is (even though they experience it every day!). In the Hands-On Exercise, Step 1 section, it might be useful to ask students to reflect on the components of variability - is there a trend? a clear seasonal cycle? other low- and high-frequency variability? This could be done by asking them to plot a time series of a single grid point. This might also lead nicely into the next chapter, where I believe you do touch on this somewhat.

One other general comment: I suggest a note about satellite data products. I have seen products like land surface temperature from MODIS or LANDSAT or NDVI used in some climate econometrics studies, so I think that there should be some mention of these in the Gridded Data section. Perhaps, just a warning that these can be highly uncertain, served on unconventional grids and that collaboration with a climate scientist is recommended. You could also mention that there are some blended satellite+ground-based observational products (e.g. CHIRPS that you mention).

Below are mostly minor comments:  

Section 1: Using Weather and Climate Data

ks905383 commented 1 year ago

Thank you for the helpful comments! Will take a look at this shortly.

kls2177 commented 1 year ago

The full review is taking me a bit longer than I thought, so I probably won't be finished until next week. Sorry about that!

ks905383 commented 1 year ago

Thank you for the comments! I have addressed them in merge #85.

Responses to major comments

One example could be a time series of the volume of weather and climate data available

Added.

In the Hands-On Exercise, Step 1 section, it might be useful to ask students to reflect on the components of variability - is there a trend? a clear seasonal cycle? other low- and high-frequency variability? Plotting the data is a good way for them to check that the steps they have taken make sense. Maybe asking them to plot a map of the time mean would be a useful exercise.

This was a great idea - I've now added a section asking them to plot and reflect on a sample time series and a sample map, to think about their data (and also to make sure it has been pre-processed correctly).
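A minimal sketch of what such a reflection exercise could look like, assuming a synthetic daily temperature series for a single grid point. The baseline, trend, seasonal amplitude, and noise level below are made up for illustration and are not taken from the tutorial data:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series for one grid point (illustrative values only)
rng = np.random.default_rng(0)
time = pd.date_range("2000-01-01", "2009-12-31", freq="D")
t_years = np.arange(time.size) / 365.25
doy = time.dayofyear.to_numpy()
values = (
    15.0                                                # baseline, deg C
    + 0.05 * t_years                                    # slow warming trend
    + 10.0 * np.sin(2 * np.pi * (doy - 100) / 365.25)   # seasonal cycle
    + rng.normal(0.0, 2.0, time.size)                   # day-to-day noise
)
tas = xr.DataArray(values, coords={"time": time}, dims="time", name="tas")

# Seasonal cycle: mean for each calendar day across all years
climatology = tas.groupby("time.dayofyear").mean()

# High-frequency variability: what is left after removing the seasonal cycle
anomalies = tas.groupby("time.dayofyear") - climatology

# Low-frequency variability: annual means, which expose any trend
annual_mean = tas.groupby("time.year").mean()

print(float(climatology.max() - climatology.min()))  # size of the seasonal cycle
print(float(anomalies.std()))                        # size of the daily noise
```

With matplotlib installed, calling `.plot()` on `tas`, `climatology`, `anomalies`, and `annual_mean` in turn gives one panel per component, which is probably the quickest way to spot a series that was pre-processed incorrectly.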

I suggest a note about satellite data products. I have seen products like land surface temperature from MODIS or LANDSAT or NDVI used in some climate econometrics studies, so I think that there should be some mention of these in the Gridded Data section. Perhaps, just a warning that these can be highly uncertain, served on unconventional grids and that collaboration with a climate scientist is recommended. You could also mention that there are some blended satellite+ground-based observational products (e.g. CHIRPS that you mention).

Good point. This has been added.

Responses to (major) minor comments

Overall, this section seems a bit too long.

Yeah, I think you're right - I've split up the section into more manageable chunks. I think it flows better now as well.

when you list "run" in your terminology bullet list, I don't agree with your statement "Don't worry about this".

Yes, fully agree, thanks for catching this. The wording has been changed, and I have also added a reference to the uncertainty ensembles that are sometimes provided with observational data products. If we expand this guide to include a section on future climate data, this will certainly be expanded on as well.

Further changes

In addition to the comments made above, I've also:

ks905383 commented 1 year ago

Detailed responses to minor comments:

Sub-section: The NetCDF Data Format

Is there an assumption that students use STATA specifically? This reference to STATA kind of came out of nowhere.

This has been clarified; STATA is commonly used in economic analysis.

  • Python(NetCDF4) link doesn't work

Fixed.

nco -> maybe direct students to use ncdump -h (metadata only) rather than plain ncdump, because the latter is usually way too much of a data dump.

Good point, this has been updated.

As an open-source alternative to MATLAB, Python(NetCDF4) + numpy works very well and was typically the way Python users worked before xarray was developed.

Agreed; we're definitely keeping the mention in this section - though I think, at least for the purposes of this tutorial, we'll stick with xarray for simplicity for the rest of the code chunks.

Sub-section: NetCDF Contents

it might be helpful to show a schematic of the netcdf data model: https://docs.unidata.ucar.edu/netcdf-c/current/netcdf_data_model.html

Added, thanks for the suggestion.

Sub-section: NetCDF Header

I realize that you want students to do most of the work themselves and not provide them with data sets as examples, but a picture is often worth a thousand words. It would be nice to show the file header - you can use the xarray sample data so that you don't have to rely on external data sources: https://tutorial.xarray.dev/fundamentals/04.1_basic_plotting.html. You could also use this sample data in the plotting section (which I would also recommend).

Agreed - showing an image of the header is definitely a good idea. Using xarray's built-in sample data
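As a rough illustration of what a header view looks like from Python, here is a sketch that prints an ncdump -h style summary with xarray's `Dataset.info()`. It assumes a tiny synthetic dataset built in memory (all names and values are placeholders) rather than the xarray tutorial data, which has to be downloaded:

```python
import io

import numpy as np
import pandas as pd
import xarray as xr

# Tiny synthetic dataset standing in for a real NetCDF file (hypothetical values)
ds = xr.Dataset(
    data_vars={
        "tas": (
            ("time", "lat", "lon"),
            np.zeros((3, 2, 2)),
            {"units": "K", "long_name": "near-surface air temperature"},
        )
    },
    coords={
        "time": pd.date_range("2000-01-01", periods=3, freq="D"),
        "lat": [34.0, 35.0],
        "lon": [-119.0, -118.0],
    },
    attrs={"title": "synthetic example"},
)

# Dataset.info() writes dimensions, variables, and attributes,
# but none of the data values
buffer = io.StringIO()
ds.info(buf=buffer)
header = buffer.getvalue()
print(header)
```

Calling `ds.info()` with no arguments prints straight to the terminal; the buffer is only used here so the text can be captured and inspected.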

Sub-section: Attributes

no_sleap -> no_leap

Fixed.

Sub-section: Basic Vis...

the correct title is "An Introduction to Earth and Environmental Data Science"

Fixed.

Cartopy section link not working

Fixed.

Sub-section: 2-D plotting

Definitely worth bringing up - I've changed the example to note this.

Sub-section: Maps

I've added the following text:

Note that the map projection you use will influence how you read the map. In the code examples below, we will use an equal-area projection, in which every grid cell in the gridded data is shown with its accurate relative area, to avoid visually overemphasizing data in regions with smaller geographic extent. To see which other projections are available, see the relevant parts of the documentation (here for cartopy/python, and here for Matlab).
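To make the point about relative area concrete, here is a small sketch (not part of the tutorial code) that computes how the true area of a regular latitude-longitude grid cell shrinks away from the equator. The function name `relative_cell_area` is hypothetical, and the calculation assumes a spherical Earth:

```python
import numpy as np


def relative_cell_area(lat_center, dlat=1.0):
    """Area of a lat-lon grid cell relative to an equatorial cell of the same size.

    On a sphere, the area between two latitudes is proportional to the
    difference of their sines; the longitude width cancels in the ratio.
    """
    lat = np.asarray(lat_center, dtype=float)  # accept scalars or sequences
    south = np.deg2rad(lat - dlat / 2)
    north = np.deg2rad(lat + dlat / 2)
    equator = 2 * np.sin(np.deg2rad(dlat / 2))
    return (np.sin(north) - np.sin(south)) / equator


for lat in (0, 30, 60, 80):
    print(lat, round(float(relative_cell_area(lat)), 3))
```

The ratio follows the cosine of latitude, so a cell at 60 degrees covers about half the area of an equatorial cell even though both occupy the same space on a plain latitude-longitude plot.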

Sub-section: Gridded Data

  • General note: it seems that the term “climate data” is being used to refer to “climate model output”. Climate data is not exclusively model output. For example, a 30-year climate normal from a weather station would be considered climate data.

I've changed references to 'data' to be clearer about what they refer to (specifying climate model output, or historical "observational" output).

Sub-section: Reanalysis Datasets

  • Products also differ by which assimilation scheme is used.

I've clarified:

Historical data products differ by how they “assimilate” (join observational with model data) or combine data, and how much “additional” information is added beyond (pre-processed) station data.

Sub-section: Warning, Station Data

  • GHCN link not working

Fixed.

kls2177 commented 9 months ago

@ks905383

I appreciate these updates. I particularly like the updates to the Hands-On Exercise.

Lots of python issues:

geo_lims = {'lat':slice(23,51),'lon':slice(-126,-65)}

to

geo_lims = {'latitude':slice(23,51),'longitude':slice(-126,-65)}

and

ds.tas.sel(lon=-118.2,lat=34.1,method='nearest').plot()

to

ds.tas.sel(longitude=-118.2,latitude=34.1,method='nearest').plot()

and

# Plot the day-of-year average
(ds.tas.sel(lon=-118.2,lat=34.1,method='nearest').groupby('time.dayofyear').mean()).plot()

to

# Plot the day-of-year average
(ds.tas.sel(longitude=-118.2,latitude=34.1,method='nearest').groupby('time.dayofyear').mean()).plot()
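One possible way to sidestep this kind of name mismatch, sketched here on a synthetic dataset rather than the tutorial's files, is to rename the coordinates once after loading so that later code can rely on a single convention. The grid and the `wanted` mapping below are assumptions for illustration only:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic dataset that uses the short coordinate names (hypothetical values)
lat = np.arange(20.0, 56.0, 1.0)
lon = np.arange(-130.0, -59.0, 1.0)
time = pd.date_range("2000-01-01", periods=4, freq="D")
ds = xr.Dataset(
    {"tas": (("time", "lat", "lon"), np.zeros((time.size, lat.size, lon.size)))},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Rename only the names that are actually present, so the same line works
# whether the file uses lat/lon or latitude/longitude
wanted = {"lat": "latitude", "lon": "longitude"}
ds = ds.rename({old: new for old, new in wanted.items() if old in ds.coords})

# The long names now work for both the bounding box and the point lookup
geo_lims = {"latitude": slice(23, 51), "longitude": slice(-126, -65)}
ds_sub = ds.sel(**geo_lims)
point = ds_sub.tas.sel(longitude=-118.2, latitude=34.1, method="nearest")

print(dict(ds_sub.sizes))
print(float(point.latitude), float(point.longitude))
```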

For the cartopy plotting section, there are a few issues:

First, the line that is supposed to compute the summer average is not:

# Get average summer temperatures
ds_summer = ds.isel(time=(ds.time.dt.season=='JJA'))

should be something like this, correct?

# Get average summer temperatures
ds_summer = ds.isel(time=(ds.time.dt.season=='JJA')).mean(dim='time')
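The difference between the two lines can be checked on a small synthetic field. This is a sketch with made-up values, using a DataArray in place of the tutorial's `ds`:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily field on a 2 x 2 grid (hypothetical values)
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
tas = xr.DataArray(
    np.random.default_rng(0).normal(288.0, 5.0, (time.size, 2, 2)),
    coords={"time": time, "latitude": [34.0, 35.0], "longitude": [-119.0, -118.0]},
    dims=("time", "latitude", "longitude"),
    name="tas",
)

# Boolean mask that is True for June, July, and August time steps
is_summer = tas.time.dt.season == "JJA"

# Subsetting alone keeps every summer day, so the time dimension survives
summer_days = tas.isel(time=is_summer)

# Averaging over time collapses it to a single latitude x longitude map
summer_mean = summer_days.mean(dim="time")

print(summer_days.dims, summer_days.sizes["time"])
print(summer_mean.dims)
```

Only the second object is two-dimensional, which is what a contour or map plot expects.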

Also, when plotting, you define a projection, ax, but then you don't reference it when you plot. I could only get a map to show up with this code:

ax.contourf(ds_summer.longitude,ds_summer.latitude,ds_summer,transform=ccrs.PlateCarree(),levels=21)

Finally, when you save the data in the python implementation, you save it to a directory called "sources". This should be called "data".

jrising commented 8 months ago

@kls2177 Thanks for identifying these. We meant to convert the variable names, but have now decided to leave the conversion to the end. In particular,

The relevant commits are 28ad4a9edd63775f8fe7cdc7808aadb3e26d6007 and 7815a99294b405a3c4983bf6f2be4897bcd230f4.