JGCRI / RCMIP5

R scripts for processing CMIP5 data

Include global means of key variables in package #129

Open bpbond opened 9 years ago

bpbond commented 9 years ago

We've gotten permission from LLNL's Karl Taylor (email excerpted below) to include summarized data with the next release of RCMIP5. Let's use this issue to come up with a list of variables, and @cahartin you and I can prepare and include them?

You may include the global means as part of the R package. You should be sure to do the following:

  • Update your global means, when necessary, to accurately reflect the CMIP5 archive. This will prevent known flawed data from being distributed.
  • Indicate to users that the original CMIP5 data (from which your global means are calculated) can be accessed through the ESGF data portals (see http://pcmdi-cmip.llnl.gov/cmip5/availability.html).
  • Provide some indication that the modeling groups themselves have not provided the global means but that these have been derived by you. You should also provide information about the algorithm used (for example, you would want to tell them that (I presume) you've weighted the grid cell values by the area of the cells).

I think only the first point will require much effort on your part. The archive is relatively stable now, but you might check a couple times each year that no data have been withdrawn/replaced.
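For illustration, the area weighting mentioned in the email can be sketched as follows. This is a minimal Python sketch, not RCMIP5 code: the function name `area_weighted_mean`, the toy grid, and the use of cos(latitude) as a stand-in for cell area on a regular latitude-longitude grid are all assumptions made for the example.

```python
import math

def area_weighted_mean(values, lats_deg):
    """Mean of a lat x lon grid, weighting each row by cos(latitude).

    Assumes a regular grid, where cell area is proportional to cos(latitude).
    """
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(values, lats_deg):
        w = math.cos(math.radians(lat))  # relative cell area for this row
        for v in row:
            total += w * v
            weight_sum += w
    return total / weight_sum

# Toy 3 x 2 grid (hypothetical values): two high-latitude rows, one equatorial.
lats = [-60.0, 0.0, 60.0]
tas = [[270.0, 270.0],
       [300.0, 300.0],
       [270.0, 270.0]]

global_mean = area_weighted_mean(tas, lats)
unweighted_mean = sum(sum(row) for row in tas) / 6
```

The weighted result sits closer to the equatorial value than the unweighted one, because the high-latitude rows cover less area.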

bpbond commented 9 years ago

Candidate variables to include (global and annual, by model and experiment, ensemble means):

Others?

cahartin commented 9 years ago

That sounds like a great idea. One more to add:

fgco2 - air-sea CO2 flux

ktoddbrown commented 9 years ago

pr - precipitation

I would prefer to put in the land carbon stocks (cVeg, cLitter, cSoil, cCwd) instead of nbp and the primary fluxes (gpp, ra, and rh) since LUC is included in some models but not others in their nbp/npp calculations. Although that will be showing our clear ecology biases in the variable selection :)

bpbond commented 9 years ago

Hey @cahartin I have added the necessary infrastructure for including this dataset with the package, following the instructions at http://r-pkgs.had.co.nz/data.html.

Next steps:

  • Fields should include: variable, model, experiment, year, value, value_sd...what else?
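As a rough picture of one row of such a dataset, here is a minimal Python sketch using the field names listed above. The class name `GlobalMeanRecord`, the model name, and all values are hypothetical placeholders; the actual package data would be an R data frame.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class GlobalMeanRecord:
    variable: str    # CMIP5 variable name, e.g. "tas"
    model: str       # model name
    experiment: str  # experiment name, e.g. "rcp85"
    year: int
    value: float     # global statistic for that year
    value_sd: float  # spread associated with the value

# Placeholder values only; not real CMIP5 output.
record = GlobalMeanRecord("tas", "ExampleModel", "rcp85", 2006, 288.1, 14.2)
row = asdict(record)
```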

cahartin commented 9 years ago

sounds good @bpbond. What about ensembles? do we want to average them? or get global means for each ensemble?

bpbond commented 9 years ago

If we have the individual ensembles already processed, sure, why not break them out separately.

cahartin commented 9 years ago

in the RCMIP5 package is there an option for calculating standard deviation?

bpbond commented 9 years ago

Between ensembles, during the loadCMIP5 process, you mean? No; mean, max, min, and sum are the only ones supported. You'd need to load the ensembles individually, combine the data, and calculate the sd.
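The "load individually, combine, calculate the sd" route could look roughly like this. It is a Python sketch with made-up numbers, not a call into RCMIP5; the ensemble labels and values are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption about which sd is wanted.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical global annual means, keyed by (ensemble member, year).
global_means = {
    ("r1i1p1", 2006): 287.0,
    ("r2i1p1", 2006): 288.0,
    ("r3i1p1", 2006): 289.0,
    ("r1i1p1", 2007): 287.5,
    ("r2i1p1", 2007): 288.5,
    ("r3i1p1", 2007): 289.5,
}

# Combine: collect all ensemble values for each year.
by_year = defaultdict(list)
for (ensemble, year), value in global_means.items():
    by_year[year].append(value)

# Summarize: mean and sample standard deviation across ensembles.
ensemble_stats = {
    year: {"mean": mean(vals), "sd": stdev(vals)}
    for year, vals in sorted(by_year.items())
}
```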

cahartin commented 9 years ago

I guess I'm not 100% sure what standard deviation we want to calculate: Fields should include: variable, model, experiment, year, value, value_sd...what else?

bpbond commented 9 years ago

Oh, for me that was just the global s.d. (i.e. between grid cells). So it's just a second call to makeGlobalStat.
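A between-grid-cell standard deviation, computed alongside the mean, might be sketched like this in Python. Whether the spread should be area-weighted, and how makeGlobalStat handles it, is not stated in this thread; the weighting and the helper name `weighted_mean_and_sd` are assumptions for illustration.

```python
import math

def weighted_mean_and_sd(values, weights):
    """Weighted mean and weighted (population-style) standard deviation."""
    wsum = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / wsum
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / wsum
    return mean, math.sqrt(var)

# Hypothetical grid cell values and relative cell areas.
cell_values = [270.0, 300.0, 300.0, 270.0]
cell_areas = [1.0, 2.0, 2.0, 1.0]

value, value_sd = weighted_mean_and_sd(cell_values, cell_areas)
```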

cahartin commented 9 years ago

that makes sense. I'll add that in now.

cahartin commented 9 years ago

I am beginning to reprocess some of the variables. Here is my complete list of variables that will go into v1.2:

  • ocean: ph, tos, spco2, fgco2
  • land: nbp, npp, cVeg, cLitter, cSoil, cCwd*
  • atmosphere: pr, co2, tas

ktoddbrown commented 9 years ago

What do people think of using gpp, ra, rh instead of npp/nbp? I understand that npp/nbp are inconsistently defined between the models with some including luc and some not. cCwd is minor (only CLM models have it) but required for C-balance.

You would also need to download the areacella and sftlf files to get the correct land area.
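To make the role of the two files concrete: areacella gives the area of each atmosphere grid cell and sftlf gives the land fraction of each cell, in percent. Their product gives the land area per cell. The Python sketch below uses toy numbers and assumes a per-area carbon stock such as cVeg in kg m-2; none of it is RCMIP5 code.

```python
# Toy 2 x 2 grids (hypothetical values).
areacella = [[100.0, 100.0],
             [50.0, 50.0]]    # cell area, m^2
sftlf = [[100.0, 25.0],
         [0.0, 50.0]]         # land fraction, percent
cveg = [[2.0, 4.0],
        [9.0, 1.0]]           # carbon stock per unit land area, kg m^-2

# Land area per cell: cell area times land fraction (percent -> fraction).
land_area = [
    [a * f / 100.0 for a, f in zip(area_row, frac_row)]
    for area_row, frac_row in zip(areacella, sftlf)
]
total_land_area = sum(sum(row) for row in land_area)

# Global stock: per-area stock times land area, summed over all cells.
total_cveg = sum(
    c * la
    for c_row, la_row in zip(cveg, land_area)
    for c, la in zip(c_row, la_row)
)
```

The all-ocean cell (sftlf of 0) contributes nothing to the total, however large its per-area value.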

@cahartin I can process the land variables but I would also need to double check that I have all of them. I've focused on a subset of 11 'representative' models. If you commit the code to the repository here (@bpbond where do you think it should go? I'm tempted to say in the data directory but that feels wrong.) I can sync and run everything.

cahartin commented 9 years ago

@ktoddbrown if it's only a few files that's not a problem to download. I have areacella and sftlf already downloaded. That brings up a good point. Do we want all models or just 11 representative models? (I used 10-11 models in my analyses as well)

bpbond commented 9 years ago

Re code location, I'd say for now put it in unused/ so it doesn't mess up the package build. Slightly longer term, probably let's make it into an internal RCMIP5 function, not normally accessible to the user, but there for our convenience and to document how these data were generated.

Re models, we're providing model-specific data, so might as well provide everything we have on hand, not just 11.

cahartin commented 9 years ago

@ktoddbrown and @bpbond to keep things simple and in time for the next release, I decided to just include the major variables (tas, tos, co2, and pr). I have all of the data and the time to process these. We can always add more variables to later releases. Does this work for you both?

bpbond commented 9 years ago

Yes. You (we) have lots of things going on, so let's keep it simple. We can expand in the future.

cahartin commented 9 years ago

pr is a flux. we want global sums of pr, not global averages, correct?

bpbond commented 9 years ago

Ideally. But we could just do all means, that's fine too, and leave the multiplication to the user. Simpler for us.

On Jun 15, 2015, at 3:29 PM, Corinne Hartin wrote:

pr is a flux. we want global sums of pr, not global averages, correct?


ktoddbrown commented 9 years ago

I would suggest using a global sum where we can to avoid confusions over ocean vs land vs surface area.
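The difference between the two options for a flux like pr (kg m-2 s-1) can be sketched in Python as follows. The numbers are hypothetical. The point is that converting a mean back to a sum requires the same total area that was used to compute the mean, which is where the land/ocean/surface ambiguity would come in.

```python
# Hypothetical per-cell precipitation flux and cell areas.
pr = [2.0e-5, 4.0e-5, 1.0e-5]    # kg m^-2 s^-1
area = [1.0e10, 2.0e10, 1.0e10]  # m^2

# Global sum: flux times area, summed over cells (kg s^-1).
global_sum = sum(p * a for p, a in zip(pr, area))

# Area-weighted global mean (kg m^-2 s^-1).
total_area = sum(area)
global_mean = global_sum / total_area

# A user given only the mean must multiply by the matching total area.
recovered_sum = global_mean * total_area
```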

cahartin commented 9 years ago

thanks.

bpbond commented 8 years ago

As noted in my last commit message, I'm assuming we're putting this off until 1.3.