traitecoevo / plant

Trait-Driven Models of Ecology and Evolution :evergreen_tree:
https://traitecoevo.github.io/plant
53 stars 20 forks source link

Exporting environment drivers into run_scm_collect #347

Closed devmitch closed 1 year ago

devmitch commented 1 year ago

Finally managed to fix the bug that was blocking me from exporting the extrinsic drivers in #340. Now, env is structured in the following way:

# setup parameters, environment, control objects as usual
out <- run_scm_collect(p1, env, ctrl)
out$env[[t]]$canopy  # canopy data for some timestep t
out$env[[t]]$rainfall  # rainfall amount at some timestep t (only if the environment is FF16w)

The manner in which the drivers are exported are generic enough such that if another environment driver is added, it should also be exported with no extra work.

There was also another variable discussed that should be exported but I forgot what it was called (theta...?) and where it lived. If there are any other variables we wish to export we should discuss them on this PR and I can add them.

codecov-commenter commented 1 year ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (develop@fc2a44c). Click here to learn what that means. The diff coverage is n/a.

@@            Coverage Diff             @@
##             develop     #347   +/-   ##
==========================================
  Coverage           ?   79.78%           
==========================================
  Files              ?       97           
  Lines              ?     8916           
  Branches           ?        0           
==========================================
  Hits               ?     7114           
  Misses             ?     1802           
  Partials           ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

devmitch commented 1 year ago

Well, it looks like all the tests are passing, however some of the vignettes are not buildable since tidy_patch now has a different (and probably incorrect) output for env:

res <- run_scm_collect(p1, env, ctrl)
tidy <- tidy_patch(res)
> filter(tidy$env, step==99)
# A tibble: 121 × 5
    step  time patch_density rainfall canopy[,1]  [,2]
   <int> <dbl>         <dbl>    <dbl>      <dbl> <dbl>
 1    99  20.0        0.0235     3.14      0     0.239
 2    99  20.0        0.0235     3.14      0.225 0.239
 3    99  20.0        0.0235     3.14      0.450 0.239
 4    99  20.0        0.0235     3.14      0.674 0.239
 5    99  20.0        0.0235     3.14      0.899 0.239
 6    99  20.0        0.0235     3.14      1.01  0.239
 7    99  20.0        0.0235     3.14      1.12  0.240
 8    99  20.0        0.0235     3.14      1.18  0.240
 9    99  20.0        0.0235     3.14      1.24  0.240
10    99  20.0        0.0235     3.14      1.29  0.240
# … with 111 more rows

The two last columns should be labelled height and canopy_openness. I think the solution involves changing tidy_env() in tidy_outputs.R to deal with the fact that the canopy data is now a level deeper in the env output.

dfalster commented 1 year ago

Nice work @devmitch .

@itowers1 - can you try this out, and advise on anything else needed?

itowers1 commented 1 year ago

Hey Mitch, great work, really exciting to have this available! The other variable is indeed theta, its the amount of soil moisture available in the soil. It's calculated in ff16_environment.h and is in the environment object. It might also be good to collect psi_soil. psi_soil is calculated directly from theta, but is dependent on soil parameters, so would change depending on what the inputted soil parameters are.

dfalster commented 1 year ago

Also, currently theta is a single number, but we're set the model up to have multiple soil layers, so in future theta would be a vector of points, each one at a different soil depth. So best to implement with that ability from the outset.

devmitch commented 1 year ago

I think simple_use_water_model is the latest branch with all the new water code. I've updated that branch and added theta to the env output, but it is difficult to test since full runs seem to take forever and something is off when setting soil_number_of_depths to be > 1.

itowers1 commented 1 year ago

Hey Mitch,

Yep, tests will be a bit tricky on that branch at the moment, still a lot of stuff to work out, but thanks for doing that! I'll test it out now.

itowers1 commented 1 year ago

Hey Mitch, just ran test-strategy-ff16w.R on the simple_water_use_model branch, unfortunately it looks like make_environment("FF16w") is broken on this branch, not sure if you found the same thing?

Received this error Error in FF16_Environment__ctor(canopy_rescale_usually, soil_number_of_depths, : argument "PPFD_" is missing, with no default

Looking into it, I can see that we have added PPFD_ as an argument to the environment constructor without a default, and it is not receiving an argument when running FF16w_make_environmnet, which I think is causing the issue. I'm wondering if the best course of action here is to remove PPFD as an argument from the constructor, or to provide it with a value. Bit unclear on how to approach this.

itowers1 commented 1 year ago

The default PPFD value (0) was brought in on commit e547ce8f70c263e52a27ede63443149c78d4bf59. I'm a bit confused about what it's doing at the moment, sorry. @aornugent was just wondering if you could give me a hand with this? Was this just to set up for a future scenario where we could define PPFD from FF16_make_environment?

dfalster commented 1 year ago

The environment objects are used to run single individuals outside of a patch context. So they need to have all the variables that one would expect to find in the environment of a patch.

itowers1 commented 1 year ago

The environment objects are used to run single individuals outside of a patch context. So they need to have all the variables that one would expect to find in the environment of a patch.

Sure, that makes sense. I'm just trying to get to the bottom of why FF16w_make_environment is failing, which I've traced back to the the fact that FF16Environment needs a PPFD value. I don't believe that this PPFD_ value is currently being used in ff16_environment.h in any case, even as above-canopy downward radiation, instead being hard coded as double PPFD = 1000 which is then read by FF16w_strategy.cpp.

aornugent commented 1 year ago

Hey @itowers1 - the fast fix for this is to comment PPFD out of the RcppR6 API #L1042.

A more complete solution is to make the necessary change to FF16_Environment and the associated make_environment_* functions - this would allow you to move your hard-coded value into R as a default function argument and override it on a whim.

aornugent commented 1 year ago

Nice work @devmitch! Given that theta doesn't exist yet on develop are you happy we merge this and open an issue to track the extension you've made on the simple_water_use_model branch?

I can't see anywhere this is tested - could you add an extra test for FF16 (canopy) and FF16w (canopy + rainfall) in test-scm-support.r. Currently it returns NULL if the requested parameter isn't in the run_scm_collect output which I think is fine.

devmitch commented 1 year ago

are you happy we merge this and open an issue to track the extension you've made on the simple_water_use_model branch?

Well the CI actions aren't passing because the tidy functions are still incorrect (funnily enough the tidy functions aren't actually tested in the test suite, but they are used when building the documentation in the github actions), though yeah we could probably isolate the work on the other branch to another issue.

I can't see anywhere this is tested - could you add an extra test for FF16 (canopy) and FF16w (canopy + rainfall) in test-scm-support.r. Currently it returns NULL if the requested parameter isn't in the run_scm_collect output which I think is fine.

Sure, I'll add some now.

aornugent commented 1 year ago

Ahh - it appears this chance also breaks some of the tidy_patch workflow described in the vignettes e.g. #L213-219. EDIT: comment ninja'd by @devmitch.

This can be fixed with something like:

res %>%
 pluck("env", 99, "canopy") %>% 
 data.frame() %>%
   ggplot() +
     geom_line(aes(x=height, y = canopy_openness), size=1.5) +
     theme_classic()

The pluck step isn't quite so intuitive and I'm not quite sure why we now need to cast to a data.frame but nevertheless it now works.

EDIT2: Oh I see - I tested my code without running tidy_patch first. Will suggest a better patch shortly.

aornugent commented 1 year ago

@dfalster - extending tidy_env is actually rather tricky. The current behaviour is to concatenate the canopy lists such that we have a long data.frame for each timestep, describing the canopy openness along the height distribution. Rainfall on the other hand is a single value for each time step. Theta will have many values for each timestep but not as many as canopy. Trying to force this into a dense dataframe will result in a lot of NAs or duplicate values.

This almost gets there:

tidy_env <- function(env) {
  # get list of variables
  env_items = names(env[[1]])

  # extract over each variable, concatenating by time, 
  # then join all variables by step 
  env_long <- env_items %>%
    purrr::map(., function(e) map_dfr(env, ~ pluck(., e) %>% data.frame, .id = "step")) %>%
    purrr::reduce(left_join, by = "step") %>%
    dplyr::mutate(step = as.integer(.data$step))
}

This has two shortcomings: 1) rainfall isn't correctly labelled because it's a scalar (i.e. doesn't have column names like canopy) 2) it duplicates the rainfall value for each height entry

str(env_long)
'data.frame':   10766 obs. of  4 variables:
 $ step           : int  1 1 1 1 1 1 1 1 1 1 ...
 $ .              : num  45 45 45 45 45 45 45 45 45 45 ...
 $ height         : num  0 0.0123 0.0245 0.0368 0.049 ...
 $ canopy_openness: num  1 1 1 1 1 1 1 1 1 1 ...

This second point will be exacerbated when we have multiple soil values.

An alternative is to return a list of tidy dataframes, one for each env. variable. It's not quite so puritanically tidy but maybe easier to work with downstream?

tidy_env <- function(env) {
  env_items = names(env[[1]])

  env_long <- env_items %>%
    purrr::map(., function(e) map_dfr(env, ~ pluck(., e) %>% 
                                        data.frame, .id = "step") %>%
                 dplyr::mutate(dplyr::across(step, as.integer)))

  names(env_long) <- env_items
}

This would allow users to filter, plot and join, but needs an extra step of unnesting the data out of the list. Rainfall however is still not correctly labelled.

str(env_long)
List of 2
 $ rainfall:'data.frame':   142 obs. of  2 variables:
  ..$ step: int [1:142] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ .   : num [1:142] 45 45 45 45 45 45 45 45 45 45 ...
 $ canopy  :'data.frame':   10766 obs. of  3 variables:
  ..$ step           : int [1:10766] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ height         : num [1:10766] 0 0.0123 0.0245 0.0368 0.049 ...
  ..$ canopy_openness: num [1:10766] 1 1 1 1 1 1 1 1 1 1 ...

Appreciate your opinion on the best way forward!

dfalster commented 1 year ago

Thanks @aornugent, I like the second approach. There's limited usage of the env component of tidy patch outputs so far, so there's limited cost of changing the interface.