Have a canonical biome list defined in INI file

bpbond commented 5 years ago

that can then be iterated over generally. Proposed by @rplzzz here: https://github.com/JGCRI/hector/pull/285#discussion_r251205251

ashiklom commented 5 years ago

@rplzzz @bpbond I'm approaching a point in my Hector permafrost work where I'll need to rely more heavily on the multi-biome functionality (including adding a new methane emissions component). I can hack together a one-off solution on my own fork of Hector, but I would personally rather do it properly and in a way that makes it back into the Hector mainline.

I'd like to start a more formal brainstorming discussion with you about the right/best way to do multiple biomes in Hector. Here are two possibilities:

The simplest to implement would probably be to store all the biome names in a character vector, over which we iterate in any biome-specific code like this: https://github.com/JGCRI/hector/blob/81f1c0bcf34802695f1f220d581664c2acc53e93/src/simpleNbox.cpp#L910-L913

...we would have code along the lines of:
```
for (itd = biomes.begin(); itd != biomes.end(); itd++) ...
```
This lets us keep the existing variable structure, where each biome-specific parameter and state variable is a named map. We can check the validity of all parameters by adding a few more checks to this block: https://github.com/JGCRI/hector/blob/81f1c0bcf34802695f1f220d581664c2acc53e93/src/simpleNbox.cpp#L336-L346
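As a rough illustration of that kind of validity check, here is a minimal sketch assuming C++11 and a canonical biome list with no duplicate names. The `BiomeMap` typedef and the `check_biome_param` helper are hypothetical names made up for this example; they are not taken from the Hector source.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-biome named map (biome name -> value).
typedef std::map<std::string, double> BiomeMap;

// Hypothetical helper: throw unless `param` has exactly one entry per biome
// in the canonical list. Assumes `biomes` contains no duplicate names.
void check_biome_param(const std::vector<std::string>& biomes,
                       const BiomeMap& param,
                       const std::string& param_name) {
    // Every biome in the canonical list must have a value.
    for (const std::string& biome : biomes) {
        if (param.find(biome) == param.end()) {
            throw std::invalid_argument(
                param_name + " has no value for biome '" + biome + "'");
        }
    }
    // All listed biomes were found, so any size mismatch means there are
    // extra entries for biomes that are not in the canonical list.
    if (param.size() != biomes.size()) {
        throw std::invalid_argument(
            param_name + " has entries for unknown biomes");
    }
}
```

Calling this once per biome-specific parameter and pool would catch both missing and stray biome entries in one place.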

A new_biome message would initialize a biome with zero values for all biome-specific pools and the same parameters as the first biome in the current list of biomes. Those pools could be manipulated directly by setvar messages. We could have a few R helper functions that use multiple fetchvar/setvar statements for the common use case of partitioning the various pools between biomes.
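A minimal sketch of what the new_biome behaviour could look like, assuming C++11. The `BiomeState` container, which groups all per-biome maps by variable name, and the `new_biome` function are hypothetical and only illustrate the proposal; Hector's actual message handling and member layout are not shown here.

```cpp
#include <algorithm>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::map<std::string, double> BiomeMap;  // biome name -> value

// Hypothetical container for illustration only.
struct BiomeState {
    std::vector<std::string> biomes;         // canonical biome list
    std::map<std::string, BiomeMap> pools;   // pool name -> per-biome carbon
    std::map<std::string, BiomeMap> params;  // parameter name -> per-biome value
};

// Add a biome with zero-valued pools and parameters copied from the
// first biome in the current list.
void new_biome(BiomeState& s, const std::string& name) {
    if (s.biomes.empty()) {
        throw std::logic_error("no existing biome to copy parameters from");
    }
    if (std::find(s.biomes.begin(), s.biomes.end(), name) != s.biomes.end()) {
        throw std::invalid_argument("biome '" + name + "' already exists");
    }
    const std::string first = s.biomes.front();  // copy: list is modified below
    for (auto& p : s.params) {
        p.second[name] = p.second.at(first);  // throws if first biome lacks it
    }
    for (auto& p : s.pools) {
        p.second[name] = 0.0;
    }
    s.biomes.push_back(name);
}
```

Partitioning carbon into the new biome would then be a separate step on top of this, in line with the setvar-based helpers described above.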

A delete_biome message would remove a biome from the biome list, and remove the biome-specific versions of every parameter and state variable. An R wrapper around this would provide several options for what to do about any C that may be in those pools (e.g. a reasonable default might be to first transfer that C into a different biome, which might default to the first biome in the biome list).
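The deletion side could be sketched as follows, again assuming C++11 and the same hypothetical `BiomeState` layout. The post places the choice of what to do with leftover carbon in an R wrapper; this sketch folds it into a single C++ function with a `transfer` flag purely for illustration.

```cpp
#include <algorithm>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::map<std::string, double> BiomeMap;  // biome name -> value

// Hypothetical container for illustration only.
struct BiomeState {
    std::vector<std::string> biomes;         // canonical biome list
    std::map<std::string, BiomeMap> pools;   // pool name -> per-biome carbon
    std::map<std::string, BiomeMap> params;  // parameter name -> per-biome value
};

// Remove a biome and every biome-specific entry. If `transfer` is true,
// its carbon is first moved into the first remaining biome.
void delete_biome(BiomeState& s, const std::string& name, bool transfer = true) {
    auto it = std::find(s.biomes.begin(), s.biomes.end(), name);
    if (it == s.biomes.end()) {
        throw std::invalid_argument("no biome named '" + name + "'");
    }
    if (s.biomes.size() < 2) {
        throw std::logic_error("cannot delete the only biome");
    }
    s.biomes.erase(it);
    const std::string recipient = s.biomes.front();  // first remaining biome
    for (auto& p : s.pools) {
        BiomeMap& pool = p.second;
        if (transfer) {
            pool[recipient] += pool[name];
        }
        pool.erase(name);
    }
    for (auto& p : s.params) {
        p.second.erase(name);
    }
}
```

With `transfer` left at its default, total carbon across biomes is conserved by the deletion; passing `false` discards the deleted biome's carbon.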

This doesn't really address #19, but might make it easier to address when we do get around to it. However, this might be an opportunity to address #20, since we could check for the presence of a biome named "global" and throw an error if such a biome co-exists with any other biomes.
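The "global" check mentioned here is small enough to sketch directly. The function name `check_global_biome` is hypothetical, and the error type is an assumption rather than Hector's own exception mechanism.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical check: a biome named "global" may not co-exist with others.
void check_global_biome(const std::vector<std::string>& biomes) {
    const bool has_global =
        std::find(biomes.begin(), biomes.end(), "global") != biomes.end();
    if (has_global && biomes.size() > 1) {
        throw std::invalid_argument(
            "biome 'global' cannot co-exist with other biomes");
    }
}
```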

A more complex alternative would be to flip the code design, such that "biome" becomes a class that encompasses all of its parameters and state variables. All of the current functions that take "biome" as an argument would instead become methods of a "biome"-class object. The biome creation and deletion functions described above could be part of object initialization and garbage collection, respectively. The more I think about it, the less it seems like a good idea, especially given the current code structure. But it might have certain advantages that I haven't thought of.
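For comparison, a very rough sketch of the class-based alternative. The `Biome` class, its members, and the single `beta` parameter are all illustrative assumptions; a real version would need to carry every biome-specific parameter and pool.

```cpp
#include <string>

// Hypothetical class: one object owns a biome's parameters and pools.
class Biome {
public:
    // Pools start at zero, matching the new_biome behaviour described above.
    Biome(const std::string& name, double beta)
        : name_(name), beta_(beta), veg_c_(0.0), soil_c_(0.0) {}

    const std::string& name() const { return name_; }
    double beta() const { return beta_; }
    double total_c() const { return veg_c_ + soil_c_; }

    void add_veg_c(double c) { veg_c_ += c; }
    void add_soil_c(double c) { soil_c_ += c; }

    // Move all carbon out of `other` into this biome, e.g. just before
    // `other` is destroyed.
    void absorb(Biome& other) {
        if (&other == this) return;  // nothing to move
        veg_c_ += other.veg_c_;
        soil_c_ += other.soil_c_;
        other.veg_c_ = 0.0;
        other.soil_c_ = 0.0;
    }

private:
    std::string name_;
    double beta_;    // example parameter
    double veg_c_;   // example pools
    double soil_c_;
};
```

The model would then hold a collection of `Biome` objects instead of one named map per variable, which is the inversion of the current structure that the paragraph above describes.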

bpbond commented 5 years ago

Thanks for opening this conversation, @ashiklom. I agree that, if we were designing from scratch, the second, more complex option could be attractive.

But we're not, obviously, and there would be a real cost (in time sunk at least) to inverting the whole approach, and the benefit isn't clear to me. Option 1 is simpler and would give us everything we need to address this issue and (as you say) #20.

Have a canonical biome list defined in INI file #286