NGEET / fates

repository for the Functionally Assembled Terrestrial Ecosystem Simulator (FATES)

Seed dispersal in FATES #471

Closed by evalieungh 11 months ago

evalieungh commented 5 years ago

Is it possible to add seed dispersal in FATES?

At the CTSM 2019 tutorial, @rosiealice @jkshuman @rgknox @ekluzek, others, and I started discussing options for integrating dispersal mechanisms in FATES. What is the best way to represent dispersal in FATES in accordance with ecological theory? What options are there for implementing this in FATES and/or CTSM/CESM?

Part of my PhD project is to look into modelling dispersal of ecosystems, and I'm hoping to use FATES to represent ecosystems somehow and maybe try to implement dispersal. @huitang-earth

As it stands, PFT dispersal is assumed to be perfect, i.e. wherever the environmental conditions are suitable and an opening exists, the PFT will grow.

rosiealice commented 5 years ago

@ekluzek sent me a great assessment of the software considerations of this over the weekend.

"At yesterday's tutorial, you asked me about MPI and a dispersal model for CESM/CTSM that would disperse various things like fire or contagions from one grid cell to the next. Because of how CTSM is setup this isn't an easy thing to do. But, it also sounds like this is a scientific direction that's important for a variety of things. So having a general solution for this sounds like something important to think about and start planning for. So I'm going to list some of the most important things about this.

So here are my bullet points...

Here's more details on those points:

The goals with MPI are to reduce the total communication and to divide up the work.

CTSM runs fast with MPI because we don't have to communicate between grid cells. So the communication it requires from that perspective is zero. And there is plenty of work to divide up because you can divide it up down to individual gridcells. The main problem we have with CTSM is that the actual work isn't perfectly divided up. Anyway, the point is that CTSM is great for MPI because communication is zero and you can divide the work quite well.

Technically you could add extra infrastructure to CTSM to do MPI communication from one cell to the surrounding grid cells. But the CTSM decomposition is set up to randomize grid cells on processors. We intentionally try to put grid cells on the opposite side of the globe, for example. This means that almost every grid cell is going to be communicating with other processors regarding its neighbor cells. So this solution maximizes required communication in order to divide the work.

CAM, for example, can advect tracers of various sorts through the atmosphere. And it does so with 3D fluid flow, so it can do the job right. So if the dispersal needs to take wind into account, we need to consider whether this is really an atmosphere process. Now, to run with CTSM standalone, that might mean we'd need to modify DATM to handle simpler surface dispersal.

I think the important point here is to decide what the requirements are for this. This could also be coupled directly into CESM as another "component", but obviously that's a bigger conversation.

If we decide it's part of the land model, I think the only thing that makes sense is to do MPI communication from the CTSM decomposition to a simpler 2D grid where the decomposition is made up of squares. Now, a part of this is that the communication cost is only worth it if you get a gain from dividing the work. If the work is small, you might not want to divide it among the same number of processors as CTSM uses, but among a smaller subset. But adding infrastructure to do this allows you to try different things.

So, for example, you could gather all CTSM grid cells to a single processor and run the dispersal code on it. If the work is small but communication is high, it might be faster to do that than to spread it across more than one processor. And actually we do have the infrastructure in CTSM to do a gather and scatter to one processor already. So this could be implemented sooner. A problem with it is that memory then won't scale with processor count as it currently does. But if it's only done for a few fields, that's not too big of a problem."
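To make the gather-and-scatter idea concrete, here is a minimal sketch in plain Python. It does not use MPI and is not CTSM code: the "processors" are just dictionaries, the 1D ring of six cells and the fixed export fraction are assumptions chosen for illustration, and all function names are hypothetical.

```python
# Plain-Python stand-in for gather -> disperse on one processor -> scatter.
# No real MPI is used; every name here is hypothetical, not a CTSM routine.

def gather(local_fields, n_cells):
    """Assemble one global list from per-processor {cell_index: seed_mass} dicts."""
    global_field = [0.0] * n_cells
    for local in local_fields:
        for idx, value in local.items():
            global_field[idx] = value
    return global_field


def disperse_ring(seeds, export_fraction):
    """Send export_fraction of each cell's seeds to its two neighbours on a 1D ring."""
    n = len(seeds)
    out = [s * (1.0 - export_fraction) for s in seeds]
    for i, s in enumerate(seeds):
        share = 0.5 * export_fraction * s
        out[(i - 1) % n] += share
        out[(i + 1) % n] += share
    return out


def scatter(global_field, ownership):
    """Hand each processor the updated values for the cells it owns."""
    return [{idx: global_field[idx] for idx in owned} for owned in ownership]


# Six cells dealt out to three processors so that no processor owns adjacent cells.
ownership = [[0, 3], [1, 4], [2, 5]]
local_fields = [{0: 10.0, 3: 0.0}, {1: 0.0, 4: 0.0}, {2: 0.0, 5: 0.0}]

global_seeds = gather(local_fields, n_cells=6)
updated = scatter(disperse_ring(global_seeds, export_fraction=0.2), ownership)
```

The dispersal step only ever sees the global list, so it does not care how cells were distributed. The cost is that the global list lives on one processor, which is the memory-scaling concern raised above.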

rosiealice commented 5 years ago

...and then @billsacks replied

"I worked with @slevisconsulting a couple of years ago on something similar: beetle dispersal. I sketched out a general algorithm that I think amounted to one of your last points in this email – basically, doing a global gather to the master proc – and Sam implemented this. Probably not ideal – particularly in terms of performance – but it got the job done, and something like that could be reused at least for initial prototyping and scientific development."

rgknox commented 5 years ago

Thanks for creating this thread @evaleriksen . And @ekluzek, you put a lot of good thought into this, thanks!

To reiterate @ekluzek's point 2: the current decomposition seems fairly random, so if we wanted to benefit from node-to-node communications, we would gain more if we changed the grid decomposition to something with spatial structure.
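A rough way to see the size of this effect is to count how many neighbour links cross a processor boundary under two decompositions. The sketch below is an illustration only: the 12 x 12 grid, the nine processors, and the round-robin rule (used here as a stand-in for a spatially scattered decomposition) are assumptions, not CTSM's actual algorithm.

```python
# Compare a scattered (round-robin) decomposition with a tiled one.
# Grid size, processor count and both owner functions are illustrative assumptions.

NX, NY, TILES_PER_SIDE = 12, 12, 3
NPROCS = TILES_PER_SIDE * TILES_PER_SIDE  # 9 processors


def off_proc_fraction(nx, ny, owner):
    """Fraction of 4-neighbour links whose two cells sit on different processors."""
    total = 0
    crossing = 0
    for j in range(ny):
        for i in range(nx):
            # Look only right and down so each link is counted once.
            for di, dj in ((1, 0), (0, 1)):
                ni, nj = i + di, j + dj
                if ni < nx and nj < ny:
                    total += 1
                    if owner(i, j) != owner(ni, nj):
                        crossing += 1
    return crossing / total


def round_robin(i, j):
    """Deal cells out to processors in index order, ignoring spatial position."""
    return (j * NX + i) % NPROCS


def tiled(i, j):
    """Give each processor one square 4 x 4 tile of the grid."""
    tile = NX // TILES_PER_SIDE
    return (j // tile) * TILES_PER_SIDE + (i // tile)


scattered_fraction = off_proc_fraction(NX, NY, round_robin)
tiled_fraction = off_proc_fraction(NX, NY, tiled)
```

In this toy setup every neighbour link crosses processors under round-robin, while only the links along tile edges do under the tiled layout (48 of 264).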

In response, a point about FATES and ED-like models. Since FATES uses dynamic allocation of cohorts, regions with lots of biodiversity and multilayered canopies (tropical forests) would potentially have many more (orders of magnitude) cohorts than places like deserts. In ED2 we made our domain decomposition scheme balance according to these expected cohort loads, which I'm guessing, given the frequency of communication needed for seed dispersal, may be more important for efficient runs.
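As an illustration of balancing by expected cohort load, the sketch below uses a simple greedy rule: sort cells by load and always give the next cell to the least-loaded processor. This is a generic heuristic chosen for the example, not the ED2 scheme, and the cell names and cohort counts are made up.

```python
import heapq


def balance_by_load(loads, n_procs):
    """Greedy assignment: heaviest cell first, always to the least-loaded processor."""
    heap = [(0.0, proc) for proc in range(n_procs)]
    heapq.heapify(heap)
    assignment = {}
    for cell in sorted(loads, key=loads.get, reverse=True):
        total, proc = heapq.heappop(heap)
        assignment[cell] = proc
        heapq.heappush(heap, (total + loads[cell], proc))
    totals = [0.0] * n_procs
    for cell, proc in assignment.items():
        totals[proc] += loads[cell]
    return assignment, totals


# Hypothetical expected cohort counts per gridcell.
expected_cohorts = {
    "tropical_a": 900,
    "tropical_b": 800,
    "temperate": 300,
    "boreal": 200,
    "desert_a": 10,
    "desert_b": 5,
}

assignment, totals = balance_by_load(expected_cohorts, n_procs=2)
```

Splitting these six cells by count alone (three per processor) could put both tropical cells on one processor; weighting by cohort load keeps the two totals within a few cohorts of each other.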

However, in a coupled simulation, wouldn't the land-grid decomposition be tiled, to communicate more efficiently with the atmosphere (which is tiled, right)?

rosiealice commented 5 years ago

Eunjee Lee did implement something along these lines, I think during her PhD. I have a recollection that she did all the simulations on a single processor. There might be some useful stuff to build off scientifically here... https://dspace.mit.edu/handle/1721.1/69469

ekluzek commented 5 years ago

In the LMWG meeting today, Marje Prank talked about "Modeling the impacts of climate and land use change on the emission and transport of rust spores". From her plots, she obviously hooked up the transport of the rust spores to the atmosphere model. Asking her what she did could be useful. She also noted that within a few days the spores could be transported across oceans. So the smaller the particle, the more important being properly hooked into the atmospheric flow will be. If transport across oceans is important, doing that may be a requirement.

slevis-lmwg commented 5 years ago

I don't think my beetle work made it into a branch, so you would need to talk to Jeff Hicke (U of Idaho) who owns the beetle model if you decided to go that route.

evalieungh commented 5 years ago

Thank you for all the useful input to this discussion. I have thought about this a bit more, and although I'm not sure I understand the technical bits, I have some thoughts on what we need to think about when choosing a solution, and I have sketched a rough first idea. Please take the suggestions with a grain of salt, but I think the principle of mechanistic solutions is important. Here goes, a list of things to think about before going forward:

What I'm imagining right now as a first concept is that, based on a cohort's growth, it allocates a certain share of its growth to seed production. Seed dispersal distances are drawn (non-randomly, somehow, to avoid stochasticity) from a PFT-specific dispersal kernel*. If some seed travel distances are larger than a threshold**, they are sent outside FATES and land in another gridcell. The seeds that remain inside the gridcell limit how many new cohorts can appear in the next relevant time step.

*E.g. a Gaussian curve for wind-dispersed seeds. This could also be a multi-peaked distribution, because a lot of plants have multiple modes of dispersal. Finding realistic dispersal kernels for PFTs will probably be difficult, but I think there should be enough literature to make some rough approximations.

**An idea is to take the average distance of all points within the gridcell to its nearest edge.
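One way to avoid drawing random distances is to compute the share of seeds beyond the threshold directly from the kernel. The sketch below assumes a 2D isotropic Gaussian kernel (so the travel distance follows a Rayleigh distribution) and a square gridcell, for which the mean distance from a uniformly placed point to the nearest edge is one sixth of the side length. The function names, units and numbers are hypothetical.

```python
import math


def mean_distance_to_edge(cell_side):
    """Mean distance from a uniformly placed point in a square cell to its nearest edge."""
    return cell_side / 6.0


def export_fraction(sigma, threshold):
    """Share of seeds travelling farther than `threshold`.

    Assumes a 2D isotropic Gaussian kernel with per-axis standard deviation
    `sigma`; the radial distance is then Rayleigh distributed, with tail
    probability exp(-d^2 / (2 sigma^2)).
    """
    return math.exp(-threshold**2 / (2.0 * sigma**2))


def split_seeds(seed_mass, sigma, cell_side):
    """Deterministically split seed mass into (staying, leaving) the gridcell."""
    leaving = seed_mass * export_fraction(sigma, mean_distance_to_edge(cell_side))
    return seed_mass - leaving, leaving


# Hypothetical numbers: 100 units of seed, sigma = 500 m, 6 km gridcell.
staying, leaving = split_seeds(100.0, sigma=500.0, cell_side=6000.0)
```

Because the split is a closed-form fraction of the seed pool, repeated runs give identical results, and a multi-peaked kernel could be handled as a weighted sum of such terms.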

Does this sound plausible to you at all? I'm sure it will take a lot of effort to get there, but do you think something like this is doable, sensible and at some point worth the computing power? Also, I'm trying to think of how this will not introduce stochasticity but I'm not sure if I've thought it through well enough.

TL;DR / summary: I think we should separate between-gridcell and within-gridcell dispersal of seeds, where the latter might be useful to enable coexistence of PFTs within gridcells and the former could be implemented outside of FATES somehow. We need to consider different modeling purposes and keep in line with ecological theory.

rgknox commented 5 years ago

@evaleriksen, I'm currently refactoring patch-level mass fluxes in the model to accommodate nitrogen and phosphorus. As part of the refactor I've added some terms (variables) to track external seed inputs (i.e. from other gridcells). (Now that I think of it, we need to have an outflux term too.) It is only a placeholder, but it should help provide a starting point for whoever adds the grid-level dispersal algorithm.

rosiealice commented 5 years ago

Hey @evaleriksen, I came across this and thought it might be of some interest...

https://academic.oup.com/aobpla/article/11/5/plz042/5559435

evalieungh commented 5 years ago

Thanks for the tip, Rosie. That paper is definitely worth reading. I've put FATES a bit on the shelf for a while but I'm still interested in looking at the dispersal code once I get a bit further in my PhD.

ekluzek commented 3 years ago

We had some more discussion of this in the context of the spring CESM LMWG meeting. There was a talk that included lateral flow of water between gridcells every time-step. That's actually a much higher bar than anything that FATES would want to do. But, I do want to encapsulate some ideas that we had in some of our discussions. This includes ideas from Bill Sacks...

adrifoster commented 3 years ago

Hey all,

Just to put my two cents in from my work with seed dispersal in a much less complex model. I also use dispersal curves (as in https://esajournals.onlinelibrary.wiley.com/doi/abs/10.2307/2265633) but with a "fat tail" to model wind dispersal and to take into account low-probability long-distance events, so currently not taking into account eddies, etc.

Though I think it might be easy to modify the curve in different directions depending on the average (on whatever time step you are working on) prevailing wind direction/speed. Here's a paper that might be useful for incorporating wind https://www.nature.com/articles/s41558-020-0848-3.

The dispersal kernels in my model are species-specific (really just genus-specific right now due to lack of available information). I also determine a minimum dispersal density to actually consider, which would map to the number of surrounding gridcells you communicate with, based on gridcell size and your dispersal equation. This way each gridcell only communicates with gridcells it could potentially seed or receive seed from.
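The mapping from a minimum density to a number of neighbouring gridcells can be sketched as follows. The kernel form `(1 + r/scale)**(-shape)`, the idea of expressing the cutoff relative to the kernel's peak, and all parameter values are assumptions for illustration, not the kernel used in my model.

```python
import math


def cutoff_radius(scale, shape, min_relative_density):
    """Distance at which the hypothetical kernel (1 + r/scale)**(-shape)
    drops to `min_relative_density` of its value at r = 0."""
    return scale * (min_relative_density ** (-1.0 / shape) - 1.0)


def halo_width(radius, cell_size):
    """Number of gridcell rings needed to cover `radius`."""
    return math.ceil(radius / cell_size)


def neighbour_count(width):
    """Cells in a square halo of `width` rings, excluding the centre cell."""
    return (2 * width + 1) ** 2 - 1


# Hypothetical numbers: 100 m scale, shape 2, cutoff at 1e-4 of peak, 1 km cells.
radius = cutoff_radius(scale=100.0, shape=2.0, min_relative_density=1e-4)
width = halo_width(radius, cell_size=1000.0)
n_neighbours = neighbour_count(width)
```

With these numbers the cutoff is about 9.9 km, so each cell would exchange seed with a halo of 10 rings (440 cells). A coarser grid or a stricter cutoff shrinks that halo quickly.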

slevis-lmwg commented 3 years ago

...and then @billsacks replied

"I worked with @slevisconsulting a couple of years ago on something similar: beetle dispersal. I sketched out a general algorithm that I think amounted to one of your last points in this email – basically, doing a global gather to the master proc – and Sam implemented this. Probably not ideal – particularly in terms of performance – but it got the job done, and something like that could be reused at least for initial prototyping and scientific development."

An update about this...

The work that I did is publicly available on GitHub. You can see the code modifications with this link.

For the dispersal across grid cells, search for mpi_allreduce in subroutine dynProgBB in /biogeochem/dynHarvestMod.F90. All the relevant code modifications are in dynHarvestMod, decompMod, and decompInitMod.

I performed very rudimentary testing of the code at the time, so no guarantees that it works correctly. I checked in with Jeff Hicke a few moments ago, and he felt it should also be clear that this version of the beetle model is out of date.