PecanProject / pecan

The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.
www.pecanproject.org

PEcAn issue with reading ED2 output #478

Closed Viskari closed 8 years ago

Viskari commented 9 years ago

When running pecan with the current git version of ED2, I get the following error message:

model2netcdf.ED2('/data/tviskari/pecan/out/ENS-00001', 45.92, -90.45, '2004/01/01', '2004/12/31')
[1] "2003<2004"
[1] "----- Processing year: 2004"
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_BDEAD in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_PLANT_RESP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_CO2CAN in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_GPP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_HTROPH_RESP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_GPP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_PLANT_RESP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_HTROPH_RESP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_GPP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_PLANT_RESP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_HTROPH_RESP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_PLANT_RESP in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_FSC in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_STSC in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_SSC in ed hdf5 output.
2015-05-11 16:01:36 WARN [getHdf5Data] : Could not find AVG_SOIL_TEMP in ed hdf5 output.
Error in array(0, ncol(soiltemp)) : 'dims' cannot be of length 0
Calls: model2netcdf.ED2 -> array
Execution halted

Comparing ed_statevars.f90 in an older version of ED2 with the new version, a lot of the variables related to these warnings are missing in the new version. Most of the AVG_* variables are gone, especially the NACP comparison variables.

I can start working on this, but first I need confirmation: should we try to have those added back to ED2, or should we alter pecan?

dlebauer commented 9 years ago

@mdietze is this something that should be fixed in PEcAn or a feature in ED2 that should be re-implemented, or other?

mdietze commented 9 years ago

It appears that the variable names have shifted from AVG_* to FMEAN_*_PY (e.g. FMEAN_GPP_PY). @Viskari do these variables exist in the -T- outputs? If so we should be able to update the pecan ED module to check which version of the variable names exists and then read either (we don't want to lose the backward compatibility).
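A minimal sketch of that fallback, assuming the HDF5 file can be treated as a mapping from variable name to data (as an h5py file object can). The helper names `candidate_names` and `read_ed_var` and the plain dicts standing in for files are hypothetical; only the `AVG_GPP` / `FMEAN_GPP_PY` pairing comes from this thread.

```python
def candidate_names(old_name):
    """Return the names to try for an old-style AVG_* variable, newest first."""
    if not old_name.startswith("AVG_"):
        return [old_name]
    stem = old_name[len("AVG_"):]
    # New-style polygon-level fast mean first, then the legacy name as a fallback.
    return [f"FMEAN_{stem}_PY", old_name]


def read_ed_var(h5, old_name):
    """Read a variable under whichever naming scheme the file uses.

    Returns None when neither name is present, so the caller can skip
    the variable instead of substituting a made-up value.
    """
    for name in candidate_names(old_name):
        if name in h5:
            return h5[name]
    return None


# Plain dicts stand in for an old-format and a new-format -T- file (assumption).
old_file = {"AVG_GPP": [1.0, 2.0]}
new_file = {"FMEAN_GPP_PY": [3.0, 4.0]}

gpp_old = read_ed_var(old_file, "AVG_GPP")
gpp_new = read_ed_var(new_file, "AVG_GPP")
missing = read_ed_var(new_file, "AVG_BDEAD")
```

Not every name maps this mechanically (the PAR variables discussed further down gain an extra `_L_`, for instance), so a real implementation would probably need an explicit lookup table rather than a string rule.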

Viskari commented 9 years ago

The variables don't come up when I list the contents of the -T- outputs, although according to the ED2 variables list they should be written to file somewhere. I am still checking this out.

Viskari commented 9 years ago

FMEAN_*_PY variables are written out in the -I- outputs instead of the -T- outputs. How do you wish to proceed with this?

mdietze commented 9 years ago

Hmm. After poking around an older version of the ED code, it looks like almost all the :opti variables have been dropped from ed_state_vars. It's probably worth moving this issue to the ED2 GitHub in order to discuss with the larger community why those variables were dropped and whether it makes more sense to reinstate all of them or whether they all have FMEAN equivalents. We definitely need them back in. As a test you might try adding :opti to their output flag list to see if that fixes the problem or if the sub-daily time averaging is being done wrong. Also it would be good to check that ALL dropped :opti variables have FMEAN equivalents, or whether there are some that need to be added.

Viskari commented 9 years ago

So, progress on this. After bringing this up in the ED2 git, it turns out that the first step was just adding the correct output key in ED2, which I have now done, and it writes the FMEAN equivalents to the T-analysis file.

The next two steps will be to check that everything is still there and to change the pecan output reading file.

Viskari commented 9 years ago

Alright, after going through the FMEAN_ variables and trying my best to connect them with the AVG_ variables, the following were missing:

AVG_BDEAD AVG_BALIVE AVG_HTROPH_RESP AVG_FSC AVG_STSC AVG_SSC

I already asked on the ED2 issue whether they have suggestions for what to use instead, but thought I would also post the question here in case there are suggestions. Also, what is the difference between STSC and SSC? What I am able to find in ED2 seems to indicate that both represent Soil Structural Carbon.

Viskari commented 9 years ago

Got answers from Marcos, will do the corrections later today.

Viskari commented 9 years ago

So it turns out that there is a slight issue. There are currently no scalar outputs for BALIVE and BDEAD, which the pecan output reading code relies on, and they have seemingly been removed from the ED2 code.

Some of the ED2 developers are hesitant to write out a variable more frequently than it is actually updated. So I thought I would also check here whether there are thoughts on other ways to solve this issue.

mdietze commented 9 years ago

PEcAn doesn't actually need AVG_BDEAD and AVG_BALIVE specifically, it just needs to be able to calculate total aboveground wood and total aboveground biomass, which you might be able to get from other patch-level variables.

Viskari commented 9 years ago

The total aboveground biomass wasn't a problem, at least from what I could tell, as it was something already being read by pecan, and ED2 is not protesting about adding it to the tower data.

The total aboveground wood is proving to be a bit more problematic and I am currently looking into it.

Viskari commented 9 years ago

While figuring out, and waiting for thoughts or a decision on, how to handle the biomass/aboveground wood mass, I've moved on to the reading of the climate forcing output.

Concerning that, I wanted to check: should all the radiation terms read by pecan be at the leaf level? That is how the FMEAN variables seem to be presented.

mdietze commented 9 years ago

Toni, could you clarify exactly what variables you are referring to?

By the time they get to PEcAn all variables should be on a polygon (grid cell) basis. If something is really on a per m2 of leaf area basis it will need to be scaled up to a patch level by summing the product of the flux * LAI * cohort density over all cohorts. Then patches need an area-weighted sum to get up to sites, and sites need an area-weighted sum to get up to polygons.
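The scaling chain described above can be sketched as follows. This is only an illustration under stated assumptions: the nested dict layout and the names `patch_flux`, `area_weighted_sum`, and `polygon_flux` are hypothetical and are not ED2's or PEcAn's data structures; patch and site areas are taken to be fractions that sum to 1.

```python
def patch_flux(cohorts):
    """Scale a per-leaf-area flux to patch level: sum of flux * LAI * density."""
    return sum(c["flux"] * c["lai"] * c["density"] for c in cohorts)


def area_weighted_sum(values, areas):
    """Area-weighted sum; equals the weighted mean when the areas sum to 1."""
    return sum(v * a for v, a in zip(values, areas))


def polygon_flux(sites):
    """Aggregate cohorts -> patches -> sites -> polygon."""
    site_values = []
    for site in sites:
        patch_values = [patch_flux(p["cohorts"]) for p in site["patches"]]
        patch_areas = [p["area"] for p in site["patches"]]
        site_values.append(area_weighted_sum(patch_values, patch_areas))
    return area_weighted_sum(site_values, [s["area"] for s in sites])


# Made-up illustrative numbers: one site with two patches.
sites = [
    {
        "area": 1.0,
        "patches": [
            {"area": 0.25,
             "cohorts": [{"flux": 2.0, "lai": 1.0, "density": 1.0}]},
            {"area": 0.75,
             "cohorts": [{"flux": 4.0, "lai": 0.5, "density": 1.0},
                         {"flux": 1.0, "lai": 2.0, "density": 1.0}]},
        ],
    }
]

result = polygon_flux(sites)  # 0.25 * 2.0 + 0.75 * 4.0 = 3.5
```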

Viskari commented 9 years ago

I am talking about AVG_PAR_BEAM and AVG_PAR_DIFFUSE. In the Git version of ED2, the fmean variables are FMEAN_PAR_L_BEAM_PY and FMEAN_PAR_L_DIFF_PY and are defined by the direct and diffuse radiation absorbed by leaves. Because of the small difference in naming, I wanted to check that these are the variables in question.

Viskari commented 9 years ago

Alright, going through the climate forcing variables read in model2netcdf.ED.R I managed to match most of the variables with FMEAN_*_PY output variables, but I wanted to check/comment on some of the variables:

AVG_SNOWDEPTH/SNOWFRACLIQ/SNOWTEMP/SNOWMASS: As far as I can tell, these do not have a corresponding FMEAN variable.

AVG_PAR_*: As mentioned in the comment above, the FMEAN for these is defined as absorption by the leaves. Is this also what AVG_PAR_* is supposed to represent?

AVG_EVAP/AVG_TRANSP: There are FMEAN variables for these, but those are defined as the evaporation and transpiration from the leaf. Is this also what AVG_EVAP/AVG_TRANSP are supposed to represent?

AVG_VEG_TEMP: There is a leaf temperature in FMEAN variables. Does this correspond with AVG_VEG_TEMP?

AVG_SOIL_FRACLIQ: As far as I can tell, there is no FMEAN variable for soil liquid fraction, but there is a variable called FMEAN_LEAF_FLIQ_PY that is defined as a liquid fraction. I just want to check that this is not the variable we are interested in here, correct?

Viskari commented 9 years ago

Just a reminder on this issue, as I would like to have answers to these before going back to the ED2 issue to discuss this. For example, if the snow variables are not needed, then I'll just skip those.

mdietze commented 9 years ago

Toni, it probably makes sense to get this running in two passes, one for the core variables and the second for other ancillary variables. In reality we'd like to support all the variables in the union of the MsTMIP (http://nacp.ornl.gov/MsTMIP_variables.shtml) and PalEON protocols, but we weren't up to that previously so it won't kill us to lose a few others temporarily.

Also, in the model2netcdf.ED2 you'll see that a few variables are calculated from other variables in the ED output, which will probably have to occur again. In terms of the snow variables, we do want to get at least depth or mass working since it really is important at northern sites to know if we've got the timing and amount of snow correct, otherwise we'll get winter soil temperatures wrong, which can cause summer rates of GPP, Ra, and Rh to go wonky once you start doing data assimilation (I've seen this happen before at both Bartlett and Toolik)

For PAR, the MsTMIP protocol has fPAR (Absorbed fraction incoming PAR) while the PalEON protocol also has the incident PAR as a sanity check that the model is actually reading the inputs correctly.

EVAP and TRANSP should be on a per ground area basis, not per leaf area.

VEG_TEMP is leaf temp.

ED definitely has the equivalent of SOIL_FRACLIQ as an internal state variable. It's probably fine to output the instantaneous value rather than the 30 min mean. You definitely want the soil variable, not the leaf variable.

Viskari commented 9 years ago

Okay, so after checking through the code again and discussing with Marcos in the ED2 issue, I think we now have all the other variables read in except BDEAD. The snow terms have to be read in as the surface water terms, since ED2 has combined them into those.

Viskari commented 9 years ago

I've been tinkering with model2netcdf for ED2 and ED2 outputs and there are still a few things causing issues.

BDEAD is still an issue, but based on looking through the code and the discussion here, we essentially have two options. The first is that I alter ED2 so that it writes out BDEAD, or whatever value we end up choosing, into the tower file. The second is that we create a new function which reads the value from the monthly files. At the moment I have set a constant value for BDEAD in order to get past it.

The second issue is with LAI. There is no FMEAN_LAI_PY, at least I haven't seen any, but there is LAI_PY. However, that is written for each PFT and thus has four dimensions, which the pecan code can't handle. Again, I can create a variable in ED2 for this or create a function in pecan. Which one would be preferable?

mdietze commented 9 years ago

"At the moment I have set a constant value for BDEAD in order to get past it."

@Viskari it is far preferable to NOT output a variable than to have it set to a value that wasn't calculated by the model.

For BDEAD and LAI, there's no need for a FMEAN variable since neither is changing on a subdaily timescale. Are there Daily or Monthly timescale variables that you can add to the Tower file? If not then I favor the first option (alter ED2 to generate a polygon-level mean BDEAD that we can then output in the Tower file).

If LAI_PY is written out by PFT then the PEcAn code can easily handle a simple sum of those values as part of the processing. You say there are four dimensions however -- what are the other 3 after PFT?
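As a rough illustration of that "simple sum", assuming (hypothetically, since the actual dimensions of LAI_PY are exactly what is being asked about here) that the values have already been reduced to a time-by-PFT table, and using made-up numbers:

```python
def total_lai(lai_by_pft):
    """Collapse the PFT axis: one total LAI value per time step."""
    return [sum(per_pft) for per_pft in lai_by_pft]


# Rows are time steps, columns are PFTs (assumed layout, made-up values).
lai_by_pft = [
    [0.5, 1.5, 0.0],
    [1.0, 2.0, 0.25],
]

lai_total = total_lai(lai_by_pft)  # [2.0, 3.25]
```

Any extra axes in the real output would have to be collapsed or selected first, and how to do that depends on what they represent.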

Viskari commented 9 years ago

After tinkering with ED2 and pecan, I think I've now gotten all the relevant outputs into the tower file and pecan reading them in. There are still some problems/things I want to check:

1) The way I set BDEAD and LAI_PY is by summing the values over each cohort. I need to change that a little bit so that it weights by NPLANT, but I wanted to check whether that is still the way we want pecan to treat these values.

2) To get those values out, I had to make more changes in pecan than I wished, and had to add the new variables in ED2. After finishing, I'm going to bring this up in the ED2 github discussion, but I don't know how enthusiastic they will be about these additions, as they are solely for the benefit of pecan.

3) pecan is still giving me an error message, but I don't quite know how to fix this. The error message is:

2015-06-10 17:24:05 INFO [mstmipvar] : Don't know dimension for 2 for variable CarbPools

mdietze commented 9 years ago

@Viskari in response to your previous questions

1) to go from a cohort-level variable, X, to a polygon-level one you first need to sum X * NPLANT over the cohorts in a patch to get the patch variable. Then you need to sum over the patches within a site, weighting by patch area. Then you need to sum over the sites within a polygon, weighting by site area. All the patch and site areas are fractional (between 0 and 1) where the sum of the areas = 1 (so you don't have to divide by the sum of the weights to get a weighted average)

2) I wouldn't worry about adding variables BACK IN to ED2 that were there previously and are required for PEcAn.

3) that's odd, did you change the dimensions? In the old code the second dimension was lon, which needed to be defined for the previous variables as well. As a temporary fix you could drop it completely since it's just being set to -999 (NA).
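Written out, the aggregation in point 1 amounts to the following. The symbols are introduced here only for illustration and are not ED2 variable names: $n_c$ is NPLANT for cohort $c$, $a_p$ is the fractional area of patch $p$ within its site, and $A_s$ is the fractional area of site $s$ within the polygon.

```latex
\begin{aligned}
X_p &= \sum_{c \in p} X_c \, n_c
  && \text{(cohorts to patch)} \\
X_s &= \sum_{p \in s} a_p \, X_p, \qquad \sum_{p \in s} a_p = 1
  && \text{(patches to site)} \\
X_{\mathrm{poly}} &= \sum_{s} A_s \, X_s, \qquad \sum_{s} A_s = 1
  && \text{(sites to polygon)}
\end{aligned}
```

Because each set of weights sums to 1, the weighted sums are already weighted averages and need no further normalization.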

Viskari commented 9 years ago

1) Tweaked this; as far as I understand, it should now be correct in the code. If you wish to check it, the calculation is done in src/io/average_utils.f90. If you search for cpatch%fmean_lai, you should see the equation I am using.

2) Made a pull request for ED2 containing the adjusted files.

3) I haven't changed the dimensions at all, at least as far as I know. I've been looking through the code to figure out what is happening, but couldn't find anything so far. I'll check about dropping it, but I would rather find a way to fix it.

Viskari commented 9 years ago

3) Continuing. It turns out I can't comment it out, as the rest of the workflow is expecting a variable for var(3).

serbinsh commented 9 years ago

Hi @Viskari Looks like I am having a similar issue using the latest PEcAn libs and the latest Git version of ED2.

 - Simulating:   12/30/2004 00:00:00 UTC
 - Simulating:   12/31/2004 00:00:00 UTC
Total count in node        1 for grid        1 : POLYGONS=        1 SITES=        1 PATCHES=        1 COHORTS=        9
 === Time integration ends; Total elapsed time=      100.4  ===
 ------ ED-2.2 execution ends ------

R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> require (PEcAn.ED2)
Loading required package: PEcAn.ED2
Loading required package: PEcAn.utils
Loading required package: abind
Loading required package: plyr
Loading required package: XML
Loading required package: ggplot2
Loading required package: randtoolbox
Loading required package: rngWELL
This is randtoolbox. For overview, type 'help("randtoolbox")'.
Loading required package: ncdf4
Loading required package: udunits2
Loading required package: RCurl
Loading required package: bitops
Loading required package: coda
Loading required package: stringr
> model2netcdf.ED2('/data/sserbin/Modeling/ED2_Modeling/US-WCr/PEcAn.US-WCr/pecan/out//SA-median', 45.92, -90.45, '2004/01/01', '2005/01/01')
[1] "2003<2004"
[1] "----- Processing year:  2004"
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_BDEAD in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_PLANT_RESP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_CO2CAN in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_GPP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_HTROPH_RESP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_GPP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_PLANT_RESP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_HTROPH_RESP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_GPP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_PLANT_RESP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_HTROPH_RESP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_PLANT_RESP in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_FSC in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_STSC in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_SSC in ed hdf5 output. 
2015-06-24 09:51:28 WARN   [getHdf5Data] : 
   Could not find AVG_SOIL_TEMP in ed hdf5 output. 
Error in array(0, ncol(soiltemp)) : 'dims' cannot be of length 0
Calls: model2netcdf.ED2 -> array
Execution halted

serbinsh commented 9 years ago

Actually, it seems that the output HDF5 files are missing many, many output variables:

[screenshot, 2015-06-24 10:12:36 AM]

Again, I am using the Git version of ED2. I suspect this is an ED2 bug?

@robkooper @mdietze

serbinsh commented 9 years ago

@Viskari

FYI - Your version of ED2 does contain the full suite of variable outputs in the HDF5 files, but I am getting 0 GPP from the runs. Leaf resp, for example, looks normal but there is no productivity. I will try using a more comprehensive css/pss.

So at this time I think there is an output issue with the mainline ED2 code. At least I am finding a lot of missing output vars.

S

mdietze commented 9 years ago

Yes, that's an ED2 problem

@Viskari is your fork of ED2 on github up to date with the changes you made (output variables you added)? @serbinsh have you tried pulling those changes from Toni's branch? Also, are there any cohorts in your css and are they surviving?

Viskari commented 9 years ago

My version of ED2 is up to date with variable names, but I don't think the mainline ED2 has accepted my push which included those variables.

Viskari commented 8 years ago

This should now all be included in the mainline and functions as it should.