DrylandEcology / STEPWAT2

Biomass values in year 1 with spinup requested often 0 #363

Open kpalmqui opened 5 years ago

kpalmqui commented 5 years ago

@chaukap

The purpose of the spinup is to start each cell with vegetation established. However, with spinup requested as initialization, some of the functional types in some cells start with biomass = 0 in year 1. Presumably, this is because 0 was the value in the last year of the spinup. This is not the desired functionality.

Some potential fixes:

1) pass the mean biomass across all years of the spinup
2) pass the mean biomass of all non-zero biomass years of the spinup
3) pass the last non-zero observation of spinup

I am leaning towards 2). What do you think?
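
For reference, the three candidate values can be computed from one functional type's yearly spinup biomass in a single pass. This is only a sketch: `SpinupSummary` and `summarize_spinup` are hypothetical names, not part of STEPWAT2, and it assumes biomass is never negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one functional type's spinup biomass series. */
typedef struct {
    double mean_all;      /* option 1: mean over every spinup year */
    double mean_nonzero;  /* option 2: mean over years with biomass > 0 */
    double last_nonzero;  /* option 3: most recent year with biomass > 0 */
} SpinupSummary;

/* Summarize biomass[0..n_years-1]. All fields stay 0 if no year has biomass.
 * Assumes biomass values are never negative. */
SpinupSummary summarize_spinup(const double *biomass, size_t n_years)
{
    SpinupSummary s = {0.0, 0.0, 0.0};
    double sum = 0.0;
    size_t n_nonzero = 0;

    assert(biomass != NULL || n_years == 0);

    for (size_t i = 0; i < n_years; i++) {
        if (biomass[i] > 0.0) {
            sum += biomass[i];
            n_nonzero++;
            s.last_nonzero = biomass[i];  /* overwritten by later non-zero years */
        }
    }
    if (n_years > 0)
        s.mean_all = sum / (double)n_years;
    if (n_nonzero > 0)
        s.mean_nonzero = sum / (double)n_nonzero;
    return s;
}
```

With a series that ends in a zero year, options 2) and 3) both stay positive, while simply passing the last year would give 0.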

chaukap commented 5 years ago

@kpalmqui The only problem I can see with 2) is how will we know how many individuals to establish based on the mean biomass?

If we can come up with an algorithm to determine how many individuals should take the biomass I am all for it!

chaukap commented 5 years ago

@kpalmqui Now that I think about it, I'm pretty sure we discussed a similar issue over the phone a while ago. You were wondering if we could run multiple iterations of spinup, then average them, and the only concern I raised was how we would translate these averaged values into actual individuals. For example, if we know that the average sagebrush biomass is 232 grams, how many individuals do we establish on the plot? How old are they? These are questions that would be difficult to answer by just averaging over iterations.

We would run into similar issues trying to average over a single iteration. It can definitely be done, but we need to create a function that translates the averaged values into a number of individuals.

By the way, could you upload the grid_initSpecies.csv file you used so I can make sure there isn't a bug in the code? Thanks!

kpalmqui commented 5 years ago

@chaukap I am a little confused about your question - what does the spinup have to do with how many individuals to establish in year 1 of the actual simulations?

I believe you are pointing out it will be difficult to accumulate biomass across multiple years during 1 iteration of spinup?

kpalmqui commented 5 years ago

grid_setup.txt

chaukap commented 5 years ago

@kpalmqui accumulating the biomass wouldn't be a problem thanks to the accumulators in ST_stats.c. I suppose my confusion comes from your use of the term "passing". Functional types do not store their biomass as a variable. Biomass is only stored at the individual level, so passing mean biomass of non-zero years to the simulation would mean translating the value into multiple individuals.

I'll try to show you the problem with a scenario. Let me know if there's anything I'm misunderstanding:

Let's say, at the end of spinup, sagebrush has a biomass of 102.4 with 5 individuals established, but on average there was a mean non-zero biomass of 255.3 with 14.32 individuals established. We would want to bump the number of sagebrush to be more reflective of the average values.

My main question is how do we use these two values to initialize the simulation? The RGroup and Species variables would be fine because they contain mostly constants, but the list of sagebrush individuals would need to be modified. We would need to start year 1 of the simulation with 14 sagebrush individuals, each with a biomass such that sum(individual biomass) = 255.3. Do we divide the biomass evenly between these individuals so each has a biomass of 18.2357? This is most likely not reflective of the state of the program during spinup.

The next question is the age of the individuals. If the only things we pass from spinup to simulation are the mean biomass and the mean number of individuals, how do we determine how old the simulation plants need to be? We could start them off at 1 year old, but it would be unrealistic for them to have so much biomass and only be a year old. We could randomly assign them an age, but imagine the user requesting spinup for 50 years and at the start of the simulation there is a 99 year old sagebrush with bmass = 18.2357. This wouldn't make any sense.

Hopefully this helps you understand my concern. If you have any more questions I'm open to a video or voice call to clear things up.

chaukap commented 5 years ago

Maybe I should describe a little better how the program goes from spinup to simulation. That might help get us on the same page.

ST_grid.h declares one CellType** called gridCells. It stores all of the information for all cells in a 2d array.

ST_initialization.h declares another CellType** called initializationCells.

When spinup ends we do the following:

initializationCells = gridCells
reallocate gridCells

This saves the entire state of the program to initializationCells, then wipes gridCells clean.

When we get to the actual simulation we call loadInitializationConditions() to reload initializationCells into gridCells. This is the only point in the program where we have no choice but to deep copy. loadInitializationConditions() deep copies Species (and the list of individuals), RGroup, Env, Plot, and Succulent from initializationCells into gridCells. In this way we pass the end state of initialization to the start of the simulation.
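
To make the handoff concrete, here is a heavily simplified sketch of the two steps described above: moving the live state aside at the end of spinup, then deep copying it back at the start of the simulation. The `Individual` and `Cell` structs and all function names are hypothetical stand-ins. The real `CellType` holds much more (Species, RGroup, Env, Plot, Succulent), and this is not the actual `loadInitializationConditions()` code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the real structures. */
typedef struct {
    int    age;      /* years */
    double biomass;  /* grams */
} Individual;

typedef struct {
    Individual *individuals;   /* array owned by the cell */
    size_t      n_individuals;
} Cell;

/* End of spinup: hand the state over to `saved` (pointer move, no copy)
 * and leave `live` empty, ready to be rebuilt. */
void save_spinup_state(Cell *saved, Cell *live)
{
    assert(saved != NULL && live != NULL);
    *saved = *live;
    live->individuals = NULL;
    live->n_individuals = 0;
}

/* Start of simulation: deep copy, so dst owns its own array and later
 * changes to dst never touch src. dst must not own memory on entry.
 * Returns 0 on success, -1 on allocation failure. */
int cell_deep_copy(Cell *dst, const Cell *src)
{
    assert(dst != NULL && src != NULL);
    dst->individuals = NULL;
    dst->n_individuals = 0;
    if (src->n_individuals == 0)
        return 0;

    dst->individuals = malloc(src->n_individuals * sizeof *dst->individuals);
    if (dst->individuals == NULL)
        return -1;
    memcpy(dst->individuals, src->individuals,
           src->n_individuals * sizeof *dst->individuals);
    dst->n_individuals = src->n_individuals;
    return 0;
}

/* Release a cell's individuals and reset it to empty. */
void cell_free(Cell *c)
{
    free(c->individuals);
    c->individuals = NULL;
    c->n_individuals = 0;
}
```

The deep copy is what lets every iteration start from the same saved state: each iteration modifies its own copy, and the saved spinup state is never touched.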

kpalmqui commented 5 years ago

The purpose of the spinup is to begin the actual simulation runs with established vegetation. Our goal is to first get this working within gridded mode, but then also make spinup available within non-gridded mode.

Currently, when spinup is run as initialization, we run spinup for 1 iteration and 300 years. We then initialize the actual simulation run in year 1 with the number of established individuals in year 300 of spinup. We use the same "spinup" as the starting point for every iteration within the actual simulation run.

This causes some problems because occasionally, there are no individuals established for certain functional groups in year 300 of spinup, so we are essentially starting the simulations with no established individuals. This is not the desired functionality of the spinup.

@chaukap and I have been discussing how to proceed and because this is a very important decision, we are hoping to start a discussion here and get your opinion @dschlaep on the way to move forward.

Some options (and there may be more, but this is my initial list):

  1. Continue to initialize year 1 of the actual simulation runs with year 300 of spinup, but modify some of the processes that occur within the spinup so that individuals are established in most years. For example, turn off all disturbances (already in place), force establishment of all rgroups each year via input parameter estann in rgroup.in, etc.

  2. Instead initialize with "average" number of individuals established across all years of the spinup or all years of the spinup when biomass is non-zero. This will take some careful thought as we will have to make some decisions about how large each individual is and how old each individual is to properly initialize year 1 of the actual simulation runs. We could use the average and SD of ages, relsizes, and number of individuals across the years of the spinup to inform these decisions. I will let @chaukap add additional detail here if that is necessary.

  3. Rather than running spinup for 1 iteration, run spinup for 10 iterations and then initialize with those conditions. This would be ideal, but will be slower than the current implementation. In addition, this will result in a lot of the same challenges/decisions outlined in 2. @chaukap there may be more detail to add here.
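
As a sketch of what option 2 would need to track, the accumulator below keeps a running mean and variance of any yearly quantity (number of individuals, age, relsize) and can skip years with zero biomass. It uses Welford's method. All names are hypothetical and this is not the ST_stats.c implementation. The SD is the square root of the returned variance.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical running accumulator (Welford's method). */
typedef struct {
    size_t n;     /* number of values added so far */
    double mean;  /* running mean */
    double m2;    /* running sum of squared deviations from the mean */
} Accumulator;

void acc_init(Accumulator *a)
{
    assert(a != NULL);
    a->n = 0;
    a->mean = 0.0;
    a->m2 = 0.0;
}

/* Add one value, updating mean and m2 without storing the series. */
void acc_add(Accumulator *a, double x)
{
    double delta = x - a->mean;
    a->n++;
    a->mean += delta / (double)a->n;
    a->m2 += delta * (x - a->mean);
}

/* Sample variance; 0 when fewer than two values have been added. */
double acc_variance(const Accumulator *a)
{
    return (a->n > 1) ? a->m2 / (double)(a->n - 1) : 0.0;
}

/* Add one spinup year. When skip_zero is non-zero, years in which the
 * functional type had no biomass are left out of the statistics. */
void acc_add_year(Accumulator *a, double value, double biomass, int skip_zero)
{
    if (skip_zero && biomass <= 0.0)
        return;
    acc_add(a, value);
}
```

One accumulator per quantity per species would be updated once per spinup year, and the means and SDs would then be read out when spinup ends.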

@dschlaep I realize that you are not very familiar with gridded mode or the gridded_code branch, but if you have time, we would love to have your feedback on this issue. Perhaps we can have a conversation in the near future to discuss how to proceed with spinup. We can also answer any questions you have and explain further during that call.

dschlaep commented 5 years ago

> This causes some problems because occasionally, there are no individuals established for certain functional groups in year 300 of spinup, so we are essentially starting the simulations with no established individuals.

I think that this is only a problem because we currently restrict establishment to be a function of last year's biomass (for perennials), i.e., we don't simulate a seed bank. Are there any thoughts on incorporating a seed bank for every species?

kpalmqui commented 5 years ago

@dschlaep perennial establishment at the moment is not related to last year's biomass - it is a function of pestab and a random number draw.

We have been discussing incorporating a seed bank for perennials for the seed dispersal code within gridded mode, but we have yet to decide the level of detail we want to implement there. When I last spoke with Bill, he seemed inclined to keep the implementation as simple as possible, and not necessarily track the seed bank for each species in each year. However, that could change as we really begin those discussions.

Do you think it wise to hold off on any additional changes to the spinup until we decide the logic and level of detail needed for seed dispersal? See the start of that discussion here: https://github.com/DrylandEcology/STEPWAT2/issues/309

chaukap commented 5 years ago

@dschlaep It seems to me like this issue comes down to the stochastic elements of the program causing spinup to vary too much. By this I mean that 1 iteration of spinup is not reflective of all possible outcomes of the program. The only way for us to get a well-rounded picture of all plots is to run multiple iterations and average.

As I mentioned above, the only issue is deciding how to translate average values into real individuals. We currently have functionality for recording the average number of individuals and average biomass across iterations. However, this information was never intended to be used to seed another iteration. In my last comment I outlined a scenario that would need to be addressed if we are to run multiple iterations of spinup.

Here's a diagram of what our function would need to do.

average biomasses ---------------|                             |--> (different) biomasses for each individual.
                                 |   |----------------------|  |
average number of individuals ------>| Our mapping function |-----> Concrete (rounded?) number of individuals.
                                 |   |----------------------|  |
average age of individuals ------|                             |--> (different) Ages for each individual.

It would be convenient if we could just directly translate the values:

207.4 ---|                            |--> 18.9, 18.9, 18.9, 18.9, 18.9, 18.9, 18.9, 18.9, 18.9, 18.9, 18.9
         |  |----------------------|  |
11.3 ------>| Our mapping function |-----> 11
         |  |----------------------|  |
12.6 ----|                            |--> 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13

But having 11 individuals that are all 13 years old and 18.9 grams is not reflective of any state that the program would ever be in.

We need something like

207.4 ---|                            |--> 12.4, 22.5, 18.6, 20.0, 17.1, 14.2, 22.2, 30.0, 7.7, 28.1, 5
         |  |----------------------|  |
11.3 ------>| Our mapping function |-----> 11
         |  |----------------------|  |
12.6 ----|                            |--> 8, 14, 11, 12, 10, 9, 13, 20, 3, 19, 2

To reflect a state the program could really be in.

Keep in mind that I made these values up. They are most similar to sagebrush averages but the point is that we would have to do this for each species.

@kpalmqui suggested using the standard deviations to determine variable ages and biomasses for each individual centered around the averages. This seems like the best option to me.
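
A minimal sketch of such a mapping function, under several assumptions that would need discussion: the SDs are per-individual SDs, draws come from an approximate normal (the sum of 12 uniforms minus 6, chosen only to avoid extra dependencies), biomasses are rescaled so they sum exactly to the mean biomass, and ages are floored at 1 year. All names are hypothetical, and the small LCG is there only to make the example reproducible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an individual plant. */
typedef struct {
    int    age;      /* years */
    double biomass;  /* grams */
} Individual;

/* Minimal LCG so the sketch is reproducible; any RNG could be swapped in. */
static unsigned long rng_state = 1UL;

void rng_seed(unsigned long seed) { rng_state = seed % 2147483648UL; }

static double rng_uniform(void)  /* uniform in [0, 1) */
{
    rng_state = (rng_state * 1103515245UL + 12345UL) % 2147483648UL;
    return (double)rng_state / 2147483648.0;
}

static double rng_normal(void)   /* approximate N(0, 1): 12 uniforms minus 6 */
{
    double s = 0.0;
    for (int i = 0; i < 12; i++)
        s += rng_uniform();
    return s - 6.0;
}

/* Fill out[0..n-1], where n is mean_count rounded to nearest (capped at
 * max_n). Biomasses vary around mean_biomass / n and are rescaled to sum
 * to mean_biomass; ages vary around mean_age. Returns n. */
size_t map_averages_to_individuals(double mean_biomass, double sd_biomass,
                                   double mean_count,
                                   double mean_age, double sd_age,
                                   Individual *out, size_t max_n)
{
    size_t n = (mean_count > 0.0) ? (size_t)(mean_count + 0.5) : 0;
    double per_ind, min_bmass, total = 0.0;

    assert(out != NULL || max_n == 0);
    if (n > max_n)
        n = max_n;
    if (n == 0)
        return 0;

    per_ind = mean_biomass / (double)n;
    min_bmass = 0.05 * per_ind;  /* arbitrary floor to keep draws positive */

    for (size_t i = 0; i < n; i++) {
        double b = per_ind + sd_biomass * rng_normal();
        double a = mean_age + sd_age * rng_normal();
        if (b < min_bmass) b = min_bmass;
        if (a < 1.0) a = 1.0;
        out[i].biomass = b;
        out[i].age = (int)(a + 0.5);
        total += b;
    }
    if (total > 0.0) {
        for (size_t i = 0; i < n; i++)
            out[i].biomass *= mean_biomass / total;  /* sum == mean_biomass */
    }
    return n;
}
```

Note that this draws age and biomass independently, so an old individual can end up small. Pairing the largest biomasses with the oldest ages, or deriving biomass from age, would be a more realistic refinement.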

dschlaep commented 5 years ago

Ok, sorry, I'm slow here, but why is it a problem at all that some species are not established at the end of the spinup?

An absent species either means that the conditions are not suitable or that the species fluctuates a lot. Either is representative of that site. And since recruitment is not restricted, the simulation run will add individuals of those species that started out with 0 at some time if resources allow.

I guess that I don't quite understand the purpose of this spinup. If we want the spinup to represent the range of possible vegetation states, then the ultimate solution is to run a spinup for each iteration -- in other words, we simulate 600 years and declare the first 300 years as "spinup" for the second set of 300 years. This is what we are currently doing, we run the first 150-250 years as "spinup" and then take the average across the last 50-150 years as the "simulation" run.
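
In code terms, the per-iteration approach amounts to discarding the first part of each iteration's yearly output as burn-in and averaging the rest. A minimal sketch follows; the function name and signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of series[spinup_years .. n_years-1]. The first spinup_years
 * entries are treated as burn-in and ignored. Returns 0 if no years
 * remain after the burn-in. */
double mean_after_spinup(const double *series, size_t n_years,
                         size_t spinup_years)
{
    double sum = 0.0;

    assert(series != NULL || n_years == 0);
    if (spinup_years >= n_years)
        return 0.0;

    for (size_t i = spinup_years; i < n_years; i++)
        sum += series[i];
    return sum / (double)(n_years - spinup_years);
}
```

Because every iteration carries its own burn-in, no state has to be translated from averages back into individuals.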


kpalmqui commented 5 years ago

I see @dschlaep's point, but the issue I was trying to highlight is that since spinup is only run for 1 iteration and since we only pass the final year of that 1 iteration, it is not necessarily representative of the vegetation that a given site can support. For example, if only 1 sagebrush individual is established and it happens to die in the final year of spinup, then the simulation will start with sagebrush biomass of 0, despite sagebrush being the dominant species at the site.

The purpose of the spinup is to allow vegetation to establish before the actual simulation runs, with the ultimate goal of decreasing the number of years we actually need to run the simulations.

@chaukap what do you think about moving this issue into the upcoming milestone/branch that will overhaul seed dispersal initialization? We could expand/reframe that milestone to include all gridded mode initialization. I don't anticipate we will be able to resolve the logic of this issue within the next two weeks, due to a conference Daniel and I are attending next week and a week of fieldwork for me that will follow the conference.

chaukap commented 5 years ago

@kpalmqui I think moving this issue and expanding the next milestone is a good idea.

I do have one thing to add regarding your comment:

> The purpose of the spinup is to allow vegetation to establish before the actual simulation runs, with the ultimate goal of decreasing the number of years we actually need to run the simulations.

150 years of spinup + 150 years of simulation is faster than 300 years of simulation. So if you are hoping to be able to use spinup to save time on production runs we could still see some benefits from running spinup every iteration. The reason being that spinup removes some of the time consuming I/O functionality and the accumulators.

Input/output is a huge resource drain on computers, and because spinup doesn't have to write any files out we save quite a bit of time.

The accumulators are also a big resource drain because multiplication and division are the most time-consuming atomic operations a processor can do. By not having accumulators we save precious CPU operations.

There is no denying that running one spinup for all iterations would be ideal. If you can find a solid way to sew the iterations together it would be optimal. But, if that turns out to be more trouble than it's worth, we will still see some of the benefits of spinup by running it every iteration.

kpalmqui commented 5 years ago

@chaukap thanks for the thoughtful comments and documentation! You make some good points.

Let's continue this discussion in the near future!

chaukap commented 4 years ago

This issue has been resolved with the creation of the Initialization module. Biomass values are now non-zero with multiple iterations of simulation.

However, running multiple iterations of initialization would be a nice feature to add. Therefore, I opened an issue for it in the future functionality milestone. Refer to that issue for all future discussion.

kpalmqui commented 4 years ago

Another potential solution here is to identify the number of spinup years that would result in most functional types being established in the final year of spinup (for example, 50 years of spinup). Testing could be done to see if this would be a simple solution.