natcap / invest

InVEST®: models that map and value the goods and services from nature that sustain and fulfill human life.
Apache License 2.0

Allow layers of biophysical parameters to be used instead of LULC/table? #1388

Open phargogh opened 10 months ago

phargogh commented 10 months ago

This request has come up many times over the years, where someone (generally a researcher) wants to use an InVEST model but wants to provide their own layers of a specific biophysical parameter rather than generalizing that parameter to an LULC class. A few examples that come to mind:

While this could of course be done on a per-model basis, it'd be helpful to think about an InVEST-wide solution since we have at least the above examples.

A few related questions:

dcdenu4 commented 10 months ago

Urban Flood, SDR, and SWY cases documented in #883.

lmandle commented 10 months ago

I would expect the demand for raster inputs to increase with increasing uptake of natural capital accounting/GEP, where users would like to take existing information (sometimes derived from satellite imagery) to assess baseline conditions and the focus is less on scenarios.

Too bad we don't have an Olympics coming up -- this would be a great topic to cover in person in a venue like that!

newtpatrol commented 10 months ago

Cool that this is being considered, and agreed that it would be best to have an InVEST-wide solution for consistency, as well as flexibility for other parameters that people haven't explicitly asked for raster versions of yet.

I'd support raster-only to make it easier on y'all, assuming that's the format that's used by most/all of the models for the purpose of mapping parameters to LULC anyway. If users have a vector layer (which I think would be a small minority of the time), they can convert to raster easily enough.

One idea could be for each model to require an input table that lists all of its parameters, with a flag for whether to use the LULC-based biophysical table for that parameter or a separate raster, and the filename for the biophysical table or separate raster. (Or a setup something like that). One thing I'd want to think about more is if/how that would affect people scripting for batch-processing, since those parameters are often being adjusted hundreds of times during sensitivity testing. Maybe this wouldn't change anything significant about that process, but in this moment I'm unsure.
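A minimal sketch of what such a parameter-source table might look like and how a model could read it. The column names (`parameter`, `source`, `path`) and the file names are hypothetical and not part of InVEST; the point is only to show the flag-plus-filename idea.

```python
import csv
import io

# Hypothetical layout: one row per model parameter. None of these column
# names or file names come from InVEST; they only illustrate the idea.
PARAMETER_SOURCES_CSV = """parameter,source,path
usle_c,raster,usle_c.tif
usle_p,table,biophysical_table.csv
"""

VALID_SOURCES = {"table", "raster"}


def read_parameter_sources(csv_text):
    """Map each parameter name to a (source, path) pair."""
    sources = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = row["source"].strip().lower()
        if source not in VALID_SOURCES:
            raise ValueError(
                f"{row['parameter']}: source must be one of "
                f"{sorted(VALID_SOURCES)}")
        sources[row["parameter"].strip()] = (source, row["path"].strip())
    return sources


parameter_sources = read_parameter_sources(PARAMETER_SOURCES_CSV)
```

Under this layout, a batch script doing sensitivity testing would probably only need to rewrite the `path` column (or the files it points to) between runs, but that would need checking against real workflows.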

dcdenu4 commented 8 months ago

Flagged as a high-priority issue to focus on, but not a drop-everything priority.

emlys commented 5 months ago

As the most general form of this question, you could imagine that virtually all numeric inputs to InVEST models could be provided in any of these formats:

Theoretically you could design a model interface where these types are interchangeable, so you could provide any numeric input in any of these formats. The demand is mostly to make (b) and (c) interchangeable, so we'll stick to that, but I think it's helpful to keep this context in mind.

emlys commented 5 months ago

Assessing the scope of work...

16 models use pattern (b), a biophysical table with parameters mapped to LULC classes. ~60 model parameters are defined this way.

I count at least 6 requests to use (c) where we only provide (b). I'm not aware of any requests to use (b) where we only provide (c).

Questions to answer before implementing:

emlys commented 5 months ago

1. separate raster and table inputs for each parameter

MODEL_SPEC = {
    "args": {
        "usle_c_table": {
            "name": "USLE C table",
            "type": "csv",
            "required": "not usle_c_raster",
            "index_col": "lucode",
            "columns": {
                "lucode": spec_utils.LULC_TABLE_COLUMN,
                "usle_c": {"type": "ratio"}
            }
        },
        "usle_c_raster": {
            "name": "USLE C raster",
            "type": "raster",
            "required": "not usle_c_table",
            "bands": {1: {"type": "ratio"}},
        },
        ...
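With option 1, the model would need to confirm at validation time that a usable source was given for each parameter. A rough sketch of that check follows. `check_exactly_one` is a hypothetical helper, not an InVEST function, and rejecting the case where both inputs are supplied is an assumption; the `required` expressions in the spec above only cover the case where neither is supplied.

```python
def check_exactly_one(args, table_key, raster_key):
    """Return a list of error messages for a table/raster argument pair.

    Hypothetical helper: treats empty strings and missing keys as
    "not provided", and (as an assumption) rejects providing both.
    """
    has_table = bool(args.get(table_key))
    has_raster = bool(args.get(raster_key))
    if has_table and has_raster:
        return [f"Provide either {table_key} or {raster_key}, not both"]
    if not has_table and not has_raster:
        return [f"One of {table_key} or {raster_key} is required"]
    return []


errors = check_exactly_one(
    {"usle_c_raster": "usle_c.tif"}, "usle_c_table", "usle_c_raster")
```

An alternative would be to accept both and let the raster take precedence, which is a design decision the thread leaves open.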

2. one input that can be either a raster or a table

MODEL_SPEC = {
    "args": {
        "usle_c": {
            "type": {"csv", "raster"},
            "index_col": "lucode",
            "columns": {
                "lucode": spec_utils.LULC_TABLE_COLUMN,
                "usle_c": {"type": "ratio"}
            },
            "bands": {1: {"type": "ratio"}},
            "name": gettext("USLE Cover-Management Factor")
        },
        ...
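With option 2, the model has to work out which kind of file it was handed before it can validate it. A minimal sketch, assuming the decision is made from the file extension alone; the suffix lists and the `input_kind` name are illustrative, and a real implementation would more likely try to open the file with the raster and table readers in turn.

```python
from pathlib import Path

# Assumed suffix lists, for illustration only.
TABLE_SUFFIXES = {".csv"}
RASTER_SUFFIXES = {".tif", ".tiff"}


def input_kind(path):
    """Return "csv" or "raster" for a parameter input path."""
    suffix = Path(path).suffix.lower()
    if suffix in TABLE_SUFFIXES:
        return "csv"
    if suffix in RASTER_SUFFIXES:
        return "raster"
    raise ValueError(f"cannot tell whether {path!r} is a table or a raster")


kind = input_kind("usle_c.tif")
```

Once the kind is known, the `columns` spec would apply to a table and the `bands` spec to a raster.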

3. raster input, or column in biophysical table

MODEL_SPEC = {
    "args": {
        "biophysical_table_path": {
            "type": "csv",
            "index_col": "lucode",
            "columns": {
                "lucode": spec_utils.LULC_TABLE_COLUMN,
                "usle_c": {
                    "type": "ratio",
                    "required": "not usle_c_raster"
                },
                "usle_p": {"type": "ratio"}
            },
            "name": gettext("biophysical table")
        },
        "usle_c_raster": {
            "name": "USLE C raster",
            "type": "raster",
            "bands": {1: {"type": "ratio"}},
        },
        ...
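Whichever spec layout is chosen, the runtime behavior is roughly the same: use the parameter raster if one was given, otherwise reclassify the LULC raster through the table column. A dependency-free sketch of that fallback, with nested lists standing in for raster arrays; `parameter_grid` is a hypothetical name, and the raster is assumed to be already aligned to the LULC grid.

```python
def parameter_grid(lulc_grid, table_values=None, raster_grid=None):
    """Return per-pixel parameter values from a raster or an LULC lookup."""
    if raster_grid is not None:
        # Assumes the raster is already aligned to the LULC grid.
        return [list(row) for row in raster_grid]
    if table_values is None:
        raise ValueError("need either a parameter raster or table values")
    # Reclassify: replace each LULC code with its table value.
    return [[table_values[code] for code in row] for row in lulc_grid]


lulc = [[1, 1], [2, 3]]
usle_c_by_lucode = {1: 0.1, 2: 0.3, 3: 0.0}  # made-up values

from_table = parameter_grid(lulc, table_values=usle_c_by_lucode)
from_raster = parameter_grid(lulc, raster_grid=[[0.12, 0.08], [0.31, 0.0]])
```

Downstream model code would then receive a per-pixel grid either way and would not need to know which source it came from.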

emlys commented 5 months ago

Time estimates...

To implement this for a single parameter as a test case: ~1 day

To implement this for the 6-ish parameters for which it has been specifically requested: ~3 days

To implement a general solution for all InVEST parameters where this applies: ~5-7 days

lmandle commented 5 months ago

Adding a request for this as an option for the carbon pools in the carbon model -- especially the soil carbon pool! Becky Chaplin-Kramer should be sending a list of requests, which I'll add when I get them.

davemfish commented 5 months ago

> Adding a request for this as an option for the carbon pools in the carbon model -- especially the soil carbon pool!

Wait, if the carbon stock is already mapped...what's left for the model to do? 😄

lmandle commented 5 months ago

Well, some pools might be mapped and more-or-less fixed (like soil carbon) and some pools might be linked to LULC (like aboveground biomass) and vary by scenario, so this would save the time of having to create a biophysical table and LULC map reflecting all LULC x soil type combinations. (Though in some cases you might do that anyway to get at soil by LULC interactions). But yes, the carbon model is still just doing some basic raster math!
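The mixed case described above can be sketched as follows: a mapped soil carbon layer is added per pixel to the pools that are still looked up by LULC code. Nested lists stand in for rasters, all numbers are made up, and the function name and pool names are used for illustration only.

```python
def total_carbon(lulc_grid, pools_by_lucode, soil_grid):
    """Sum LULC-based pools and a mapped soil pool for every pixel."""
    total = []
    for lulc_row, soil_row in zip(lulc_grid, soil_grid):
        total.append([
            sum(pools_by_lucode[code].values()) + soil
            for code, soil in zip(lulc_row, soil_row)
        ])
    return total


lulc = [[1, 2], [2, 2]]
# LULC-linked pools (hypothetical values, same units as the soil layer).
pools = {
    1: {"c_above": 100.0, "c_below": 20.0, "c_dead": 5.0},
    2: {"c_above": 10.0, "c_below": 2.0, "c_dead": 1.0},
}
# Mapped soil carbon, assumed to be aligned to the LULC grid.
soil = [[50.0, 40.0], [45.0, 60.0]]

carbon = total_carbon(lulc, pools, soil)
# carbon == [[175.0, 53.0], [58.0, 73.0]]
```

Under a scenario, only `lulc` would change while `soil` stays fixed, which is the time saving described above.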

newtpatrol commented 5 months ago

FWIW, I just leave soil out of the carbon model, and add it manually in post processing, especially since ISRIC provides a relatively easy to use soil carbon layer.

And I just need to chime in here that I'm going to scream if one more person posts to the forum asking about the units for the carbon outputs (which someone did again today). If nothing else, providing per-hectare values will save future us a LOT of time answering that question on the forum.

davemfish commented 5 months ago

> Well, some pools might be mapped and more-or-less fixed (like soil carbon) and some pools might be linked to LULC (like aboveground biomass) and vary by scenario, so this would save the time of having to create a biophysical table and LULC map reflecting all LULC x soil type combinations. (Though in some cases you might do that anyway to get at soil by LULC interactions). But yes, the carbon model is still just doing some basic raster math!

Thanks @lmandle , this makes sense. Please forgive my snarky comment. I do think it's important to describe use-cases like this along with the requests, before adopting new features. If nothing else, it lets information bubble up that may be important to keep in mind during design & development. And it can trigger follow-up questions, for example,

lmandle commented 5 months ago

@davemfish No worries at all, your questions were totally fair! And you are so right on the importance of providing specific use cases. I was being overly general in my request. For carbon the only biophysical table input I've actually wanted to put in as a raster has been soil carbon. So maybe it would make sense to start there? Though Stacie's postprocessing is a great solution that I wish I'd done at the time!

To your question:

> Is there a case where all the carbon pools are already mapped and someone would want to run the model without a table or LULC raster at all? Or is that truly beyond useful to the point where it's not a requirement to handle that case.

I suppose I could hypothetically imagine such a case, but it seems out there enough I don't think we should need to handle it.