Open KRNA01 opened 2 years ago
Hi @KRNA01
Could we make `coralScenario()` return just the raw data (i.e., what is currently kept inside `all`), and calculate the other aspects after the fact?
The current data structure is a little problematic for larger sets of runs and increases file sizes when storing results to disk. One implication is larger transfers to/from a remote computer (e.g., workstation or HPC).
My current envisioned approach is shown in #58 but repeated here for convenience. Here, `Y` would be equivalent to `all`.
% Run a set of simulations with sampled values of size N*D, where N is number of rows,
% with each row holding values for a simulation.
% As `nreps=8` here, this would produce a result set of N*8 in total
Y = ai.run(sampled_values, sampled_values=true, nreps=8)
% Collect a set of metrics by passing in a cell array of function handles
a_struct_of_separate_metric_values = collectMetrics(Y, ...
    {@some_metric_function, @another_metric_function, @yet_another_metric})
The other concern I have currently is that there are a lot of hardcoded values. Are we absolutely definitely 100% sure we won't be adding/removing coral groups/species? Okay if we're not, but otherwise I would prefer that we generalize as much as possible.
Hi @ConnectedSystems,
Thanks for this, and sorry if I missed #58 before posting this.
Yes, very happy for coralScenario to just return the raw data.
Will wrap my head around #58 now to figure out how we can shift to your approach. I am not wedded to the data structure I have used for the metrics, so happy to shift.
Agree re hardcoded values. I'm not 100% sure we won't be adding species later this year, so a good time to agree on a generalised format.
Hi @KRNA01, that's great. Is there interest in adding these metrics to a BBN for inference? If so, I can start writing a script.
> Yes, very happy for coralScenario to just return the raw data.
> Will wrap my head around #58 now to figure out how we can shift to your approach. I am not wedded to the data structure I have used for the metrics, so happy to shift.
Great, thanks @KRNA01 . I think once we have the metrics finalized I can wrap up the distributed data read functions and then get cracking with analyses, at least for Moore Reef, using the example DHW/wave data.
> % Collect a set of metrics by passing in a cell array of function handles
> a_struct_of_separate_metric_values = collectMetrics(Y, ...
>     {@some_metric_function, @another_metric_function, @yet_another_metric})
@ConnectedSystems, can you point me to an example (or a MATLAB help page) that covers passing functions as inputs to other functions, please? I'd like to be sure I fully understand the rules before I code this.
As you're using '@' as a prefix, I'm thinking you're looking for anonymous functions for metrics rather than function m files? If so, or not, please advise.
Some of the derived metrics, especially the reef condition index, can probably only be implemented as function m-files, as they require data input, e.g. from expert knowledge elicitation.
If we can simply refer to the function m files in the input line (without the '@' prefix), then that's easier. If so, are we talking nested functions?
Hi @KRNA01
> I'm thinking you're looking for anonymous functions for metrics rather than function m files? If so, or not, please advise.
Not in this case. The '@' prefix is necessary to tell MATLAB that you're passing in references to functions (function handles); without it, MATLAB tries to evaluate the names as variables or call them with no arguments.
The idea I had is that inside `collectMetrics()` (or whatever we end up calling it) there will be a loop over `Y`, and we call the arbitrary number of functions in turn.
Or a loop over the list of functions, each applied to `Y` in turn...
Obviously I haven't thought this through completely.
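A minimal sketch of the second variant (a loop over the list of functions, each applied to `Y`), assuming the metrics arrive as a cell array of function handles and results are returned as a struct. The `names` argument, the placeholder metrics, and the 3-D shape of `Y` are illustrative assumptions, not the agreed design. Local functions in a script file require MATLAB R2016b or later.

```matlab
% Hypothetical stand-in for raw results: timesteps x taxa x sites
Y = rand(10, 4, 3);

% Placeholder metrics (not ADRIA's actual metric functions)
total_cover = @(Y) squeeze(sum(Y, 2));
mean_cover = @(Y) squeeze(mean(Y, 2));

metrics = collectMetrics(Y, {total_cover, mean_cover}, ...
    {'total_cover', 'mean_cover'});

function out = collectMetrics(Y, funcs, names)
    % Apply each function handle to Y and store the result under its name
    out = struct();
    for i = 1:numel(funcs)
        out.(names{i}) = funcs{i}(Y);
    end
end
```

Handles are held in a cell array (`{...}`) because MATLAB does not allow function handles to be concatenated into an ordinary array with `[...]`.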
Hi @ConnectedSystems,
Great - this helps - thank you.
@KRNA01
> Some of the derived metrics, especially the reef condition index, can probably only be implemented as function m-files, as they require data input, e.g. from expert knowledge elicitation.
Hmm... for this I would use a function handle to compose the desired metric. Something like:
metric_to_apply = @(Y) metric_function_defined_in_file(Y, other_parameters);
% metric_to_apply is already a function handle, so it is passed without an '@' prefix
collectMetrics(Y, {metric_to_apply})
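To illustrate how an anonymous function can bind extra parameters, here is a small sketch. `weighted_cover` and `weights` are hypothetical placeholders standing in for a metric defined in a file and its additional inputs.

```matlab
% Hypothetical two-argument metric; in practice this would live in its own m-file
weighted_cover = @(Y, w) Y * w;

% Extra parameters the metric needs (illustrative values only)
weights = [0.4; 0.3; 0.2; 0.1];

% Wrap the metric so that callers only need to supply Y
metric_to_apply = @(Y) weighted_cover(Y, weights);

% The value of `weights` was captured when the handle was created,
% so changing the variable afterwards does not affect the handle
weights = zeros(4, 1);

Y = rand(10, 4);               % hypothetical results: timesteps x taxa
result = metric_to_apply(Y);   % still uses the original weights
```

The wrapper gives every metric the same single-argument signature, which is what lets a collection function call them uniformly.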
@ConnectedSystems, The input data are simply a data table. Could we just refer to this as a file, or stick the data in the function m file?
How big of a table are we talking? If somewhat small and unlikely to change, I say stick it in the function definition.
Otherwise, if it might change, would we want to examine the influence of uncertainty in expert knowledge at some point?
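expert_knowledge1 = ... % load some data file
expert_knowledge2 = ... % load some alternate viewpoints
% Iterate over a cell array so that each dataset is kept whole
for ek = {expert_knowledge1, expert_knowledge2}
    func = @(Y) metric(Y, ek{1});
    Y_ek_i = collectMetrics(Y, {func});  % func is already a handle, so no '@' prefix
end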
Or is this unlikely to happen (or be a problem for the ADRIA team 5 years from now)?
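As a sketch of what comparing alternative expert inputs might look like, the snippet below runs one placeholder metric against two hypothetical sets of expert values and keeps both results. The metric, the weight vectors, and the shape of `Y` are all assumptions made for illustration.

```matlab
Y = rand(10, 4);             % hypothetical results: timesteps x taxa
metric = @(Y, ek) Y * ek;    % placeholder metric using expert-supplied weights

% Two hypothetical sets of expert values, kept whole inside a cell array
expert_sets = {[0.4; 0.3; 0.2; 0.1], [0.25; 0.25; 0.25; 0.25]};

results = cell(size(expert_sets));
for i = 1:numel(expert_sets)
    func = @(Y) metric(Y, expert_sets{i});   % bind this set of expert values
    results{i} = func(Y);
end

% Spread between the two viewpoints at each timestep
spread = abs(results{1} - results{2});
```

Storing each result separately, rather than overwriting one variable inside the loop, keeps the alternatives available for comparison afterwards.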
@ConnectedSystems, a couple of tables for the reef condition index, 7 by 4 each. Agree, let's stick them in the function def
Hey @ConnectedSystems and @KRNA01, Following this, will calculation of C1 to C4 eventually be removed from coralCovers? I'm just implementing calculating metrics like evenness as output in the optimisation (which needs C1 to C4) and the outputs containing C1 to C4 (Y.tab_acr etc) are not available because they are not saved in runCoralADRIA but still referred to in coralCovers. Should I assume that I'll need to calculate them just using the Y.all output from runCoralADRIA?
Hi @Rosejoycrocker
That would be my preference, yes.
To reiterate, I envision `Y` to be a matrix (that is, `Y` returned by `ai.run()` would be the current `Y.all`) and all metrics to be derived from that as and when needed.
If we are in agreement, we could write generic functions to extract taxa-specific subsets (e.g., `Y.tab_acr`), ideally without hardcoding array index positions.
Thanks @ConnectedSystems , I'll assume this then. To avoid hardcoding array positions, would you have the number of taxa as input or something?
Thinking through this for something else as well. `coralSpec()` produces a table that indicates the coral -> parameter mapping. The taxa_id positions should match up with what is recorded in the raw results. Let me know if you need any help/input.
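A rough sketch of index-free taxa selection, assuming the specification table has a `taxa_id` column whose row order matches the second dimension of the raw results. The table built here is a hypothetical stand-in, not the actual output of `coralSpec()`, and the layout of `Y` is also assumed.

```matlab
% Hypothetical stand-in for the spec table: six coral groups,
% each with six size classes, giving 36 rows in total
taxa_id = repelem((1:6)', 6);
spec = table(taxa_id);

% Hypothetical raw results: timesteps x (group, size class) x sites
Y = rand(10, height(spec), 3);

% Select every column belonging to one group by ID rather than by position
target = 1;
sel = spec.taxa_id == target;
Y_taxon = Y(:, sel, :);
```

Because the selection is driven by the table, adding or removing groups later would change `spec` but not the extraction code.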
Hi @Rosejoycrocker and @ConnectedSystems,
Good timing. I have just updated the coralCovers function along with coralEvenness and shelterVolume. I'm proposing we keep coralCovers as a function because we need it to calculate Evenness etc. You'll see that I have now attempted to align these functions with what Takuya proposed for #58: calculating a suite of metrics from one function and the output from 'coralScenario()', which is simplified to 'Y'. Still a bit of a way to go, but keen to hear your advice and comments at this point.
Please note that the Evenness function has an error (which I haven't found yet), because Evenness should range from 0 to 1. Would welcome sharp eyes and brains to ID the issue.
Also note that shelterVolume is now a proxy for structural complexity. When we compile the Reef Condition Index (RCI) to inform the scope of interventions to support existence values, we'll be drawing on (1) total coral cover, (2) shelter volume, (3) cover of juvenile corals, and (4) Evenness. RCI is next for me to complete. @Rosejoycrocker, this will give you four direct metrics and one derived metric (RCI) to report and analyse for BBNs, trade-off analyses etc.
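For reference, one common evenness measure that is bounded by 0 and 1 is Pielou's index, J = H / ln(S), where H is Shannon diversity and S the number of taxa. The sketch below is not the coralEvenness implementation, which may use a different definition; it only shows a formulation that respects the expected range. The cover values are made up.

```matlab
% Hypothetical cover per taxon at one site and time
cover = [0.2, 0.1, 0.05, 0.05];
J = pielouEvenness(cover);          % 0.875 for these values

% Equal cover across taxa gives the maximum value of 1
J_even = pielouEvenness([0.1, 0.1, 0.1, 0.1]);

function J = pielouEvenness(cover)
    % Convert absolute cover to relative abundance so the values sum to 1
    p = cover ./ sum(cover);
    % Drop empty taxa so that 0 * log(0) does not produce NaN
    p = p(p > 0);
    H = -sum(p .* log(p));          % Shannon diversity
    J = H / log(numel(cover));      % normalise by the maximum possible H
end
```

Two things worth checking in any evenness calculation are whether cover is converted to proportions before use and whether zero-cover taxa are handled, since either can push results outside the expected range.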
@ConnectedSystems and @Rosejoycrocker,
Just saw these last suggestions re not hardcoding reference position. Happy to simplify everything to Y. In this case we could abolish the 'coralCovers()' function and extract what we need from Y in each of the metric functions. Please let me know and I can update.
Thanks @KRNA01, that's great :)
To get the new functions, should I merge 40corals -> coral-optimisation-merge, @ConnectedSystems ? Just checking so I don't merge in the wrong direction or something
> @ConnectedSystems and @Rosejoycrocker,
> Just saw these last suggestions re not hardcoding reference position. Happy to simplify everything to Y. In this case we could abolish the 'coralCovers()' function and extract what we need from Y in each of the metric functions. Please let me know and I can update.
Hi @KRNA01, only saving Y may be good as then in the optimisation context the other metrics are only calculated if being optimised for, reducing computational expense if not all metrics are necessary for optimisation.
> Thinking through this for something else as well. `coralSpec()` produces a table that indicates the coral -> parameter mapping. The taxa_id positions should match up with what is recorded in the raw results. Let me know if you need any help/input.
Thanks @ConnectedSystems , I'll have a look at coralSpec and see if I can write an indexing function. It may not match other uses you intend for it so feel free to alter once I've pushed it.
> Thanks @KRNA01, that's great :)
> To get the new functions, should I merge 40corals -> coral-optimisation-merge, @ConnectedSystems ? Just checking so I don't merge in the wrong direction or something
Sure, go ahead @Rosejoycrocker :)
@KRNA01
> Just saw these last suggestions re not hardcoding reference position. Happy to simplify everything to Y. In this case we could abolish the 'coralCovers()' function and extract what we need from Y in each of the metric functions. Please let me know and I can update.

I think `coralCovers()` is fine, we can just replace those hardcoded indices with the function @Rosejoycrocker is writing up :)
@Rosejoycrocker and @ConnectedSystems,
Quick question about coralCovers() and coralEvenness().
How do you feel about applying the evenness measure to the six coral groups rather than the four coral taxa?
My rationale for suggesting the six is that it enables us to use evenness as an indicator of potential dominance by enhanced corals, in effect to detect invasive behaviour. Evenness will then not be true ecological evenness among ecological species, but a mixture of that and genotype evenness or dominance.
Hi @KRNA01, I don't really know enough about the ecology, but I'm guessing an indication of invasive behaviour would be useful in assessing overall reef health and would complement Total cover in terms of output representations. It could be a cool graphic to have an animated map of coral groups over time. Would the overall evenness still be used as a biodiversity indicator, and hence be usable in ES translations?
Hi @Rosejoycrocker,
> Would the overall evenness still be used as a biodiversity indicator, and hence be usable in ES translations?
It could confound things slightly. Let's stick with the four groups to keep it clean. We might then add another metric that helps us track how enhanced corals fare relative to unenhanced conspecifics.
Good plan @KRNA01 :)
Hi @KRNA01 and @Rosejoycrocker
Please see updated use of metrics in the `single_scenario` script.
I had to adjust things to get things working smoothly so if you've made any changes you may run into issues.
Hi @Rosejoycrocker
I've merged `coral-optimization-merge` into `40corals` now. If you have any potentially breaking changes incoming please commit those to a separate branch :)
@KRNA01 all changes made to the metrics should be available to you now. Please see the `single_scenario.m` example script.
Hey @ConnectedSystems, I was still working on the optimisation functions. Should I keep working on these in coral-optimisation-merge and then merge into 40corals, or work on them in 40corals? They shouldn't break anything, I just need to apply the changes you made to how parameter selection is implemented.
I'd just create a new branch off 40corals and commit your changes there 😺
@Rosejoycrocker : found the issue with total coral cover no longer being proportional/relative when optimizing or running batched simulations.
The metrics themselves (`coralTaxaCover()`, etc) expect `Y` to be results for a single simulation (i.e., one scenario/replicate pair). If you pass in a set of results (e.g., 1 scenario, 50 replicates) then it sums across all the simulations.
This explains why it works in `single_scenario` but not anywhere else.
I'll spend some time tomorrow thinking how to properly collate results in this case. I'm not currently sure if I should adjust the metrics to be agnostic to the number of simulations or if the adjustment should be made in the collection function... some pros/cons to consider.
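The two options mentioned can be sketched side by side. This assumes, purely for illustration, that batched results carry simulations in a fourth dimension; the actual layout of `Y` and the real metric functions may differ.

```matlab
% Hypothetical raw results: timesteps x taxa x sites x simulations,
% scaled so that cover summed over the 4 taxa cannot exceed 1
Y = rand(5, 4, 3, 8) / 4;

% Option 1: the metric names the dimension it reduces, so any
% trailing simulation dimension passes through untouched
totalCover = @(Y) sum(Y, 2);
tc_all = squeeze(totalCover(Y));   % timesteps x sites x simulations

% Option 2: the metric stays single-simulation and the
% collection step loops over simulations
n_sims = size(Y, 4);
tc_loop = zeros(size(Y, 1), size(Y, 3), n_sims);
for s = 1:n_sims
    tc_loop(:, :, s) = squeeze(totalCover(Y(:, :, :, s)));
end
```

Option 1 keeps the collection function simple but requires every metric to be written with explicit dimensions. Option 2 leaves existing metrics untouched at the cost of a loop in the collector.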
@ConnectedSystems, @Rosejoycrocker and @ryanheneghan ,
ReefConditionIndex (for existence value) now added to 40Corals branch. One thing to note here is that all metrics functions are currently only set with enough dimensions to handle single scenarios (interventions) and simulations. @ConnectedSystems, how do you propose we convert such that we can run for multiple interventions and simulations also?
Hi @KRNA01
I had a look at the RCI function today. We'd want an RCI value for every single run of ADRIA, so I have to think on how to do this without affecting your current structure too much.
In the worst case it will be a big giant loop that goes over all results to produce a vector of RCIs, one for each run.
Thanks @ConnectedSystems, If it helps then I'd be happy for RCI to be a continuous metric between 0 and 1 in all the simulations. That is, we can ignore the RCI step-scale in the intermediate calculations until we present synoptic results.
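The worst-case loop could look roughly like the sketch below. The unweighted mean used as `rci_func` is only a placeholder so the example runs; the real RCI draws on expert-elicited tables and is not reproduced here. The per-run metric values are random stand-ins.

```matlab
% Hypothetical per-run metric values scaled to [0, 1]: runs x 4
% (total cover, shelter volume, juvenile cover, evenness)
n_runs = 6;
metric_vals = rand(n_runs, 4);

% Placeholder only: an unweighted mean stands in for the real RCI calculation
rci_func = @(m) mean(m);

% One continuous RCI value per run
rci = zeros(n_runs, 1);
for r = 1:n_runs
    rci(r) = rci_func(metric_vals(r, :));
end
```

Keeping the index continuous during the runs means any step-scale classification can be applied once at the end, when results are summarised.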
@ConnectedSystems and @ryanheneghan, are we happy to close this now or do we have more to go? I can imagine we'd want to address evenness, shelter volume and other derived metrics (e.g. colour diversity) further later. Should we tackle those separately in new issues?
Dear @ConnectedSystems , @Rosejoycrocker, @veroniquelago and @BarbaraRobson,
We now have a small but robust subset of metrics we can use to inform ecosystem service provision from ADRIA simulations.
Jointly, these four metrics are a significant step towards building a Reef Condition Index, which speaks, in part, to existence, tourism and fisheries values. The new coral function in ADRIA allows us to produce these performance metrics. The advantage is that our outputs can now be translated to benefits for reef and people to a greater degree than coral cover can achieve on its own.
I'm adding this issue here mainly as a heads up, but also to start a discussion about how far we take these metrics. Metrics for ReefMod are richer because there are more variables to play with (e.g. crown-of-thorns starfish, algae and rubble), but also more uncertainty from the expert elicitation of derived metrics (fish species richness etc). My suggestion is we keep translating ReefMod data using these derived metrics, but then keep it more robust for ADRIA model simulations - i.e. smaller but less uncertain set of metrics. Keen to discuss.
For your interest, here's an example output of these four metrics for a single_scenario.m run in the 40Corals branch using default values. The climate scenario is RCP 4.5, enhanced tabular Acropora and enhanced corymbose Acropora are seeded, but without assisted adaptation. Trajectories are for individual reefs. Note how coral evenness (a proxy for functional diversity) peaks mid-way for a subset of sites. Shelter volume is a proxy for structural complexity and scope for habitat provision (think trees in forests).
And here are the trajectories for the six coral groups making up the metrics in the first four panels. The dynamics of the six size classes are not shown here. Note that the enhanced groups (enhanced tabular and corymbose Acropora) only start with seeded juvenile (2-5 cm diameter) corals, whereas the other groups start with a mixture of size classes.
Looking forward to working with you to produce runs for a range of interventions, scenarios, reef settings and R&D + intervention assumptions :-)