forsys-sp / forsysr

An R implementation of the ForSys program
GNU General Public License v3.0

Improve how SPM & PCP are handled #73

Closed michelledayusfs closed 1 year ago

michelledayusfs commented 1 year ago

The parameter scenario_output_fields summarizes treatment at the project or planning area scale. We have written previously that:

"A list of the desired fields summarized by project or planning area. Project ID, priority weights and treatment rank are added automatically."

In the past we had to add the PCP field manually. Is this now automatic, or if I want timber volume and timber volume PCP do I still need to write c("TimberVol_STND", "TimberVol_PCP")?

codyevers commented 1 year ago

Technically, PCP and SPM values are calculated for all fields specified in the stand_pcp_spm parameter in forsys::run.

By default, stand_pcp_spm is set to NULL, in which case forsys::run calculates the PCP and SPM values for any fields set in scenario_priorities. Basically, stand_pcp_spm overrides the default behavior if specified (typically it is not).

PCP and SPM are always calculated for any fields set in scenario_output_fields.

The priority fields plus the SPM suffix are used to determine the number of priorities and therefore the weight matrix used for running different weighting scenarios.

codyevers commented 1 year ago

FROM MICHELLE ON JAN 6th: I think we need to elevate some issues in ForSysR.  Specifically, operations that happen “behind the scenes” without the user knowing.  It makes troubleshooting close to impossible.

First of all, we need to turn off the automated SPM/PCP calculations that happen behind the scenes.  I know this is an existing thing we have talked about but it didn’t have a specific issue assigned to it.  Because this happens without the user knowing, they cannot see it, and forsys is not reporting these values.  For example, it never spits out the SPM values and that is what it is basing the weights on, so it would be helpful to see those things.  I know this affects forsys.app but right now that is an issue we can kick down the road.

Secondly, we need a “check box” or parameter that identifies whether something is area adjusted or not for Patchmax just like ForSysX.  I know I pointed out that when prioritizing revenue, patchmax was going to the wrong stands, but having those values area adjusted in the background isn’t the solution.  Right now I can’t figure out what it is doing to a variable like probability.  For my dataset if I have a large stand with a probability of flame length >8ft, patchmax is probably going to normalize it “behind the scenes” which I assume has no effect since it is already on a 0 to 1 scale (not sure if that is true, but let’s assume), then it is going to area adjust it: say I have a value of 0.8 in a 100 acre plot, now my value is 0.008.  In another 1 acre stand I have the same probability, 0.8, so now my area adjusted value is 0.8.  In this case, the larger stand is preferable, but forsys can no longer see it. And the user has no idea that Patchmax/forsys is doing this.
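
The arithmetic in the example above can be checked with a short base R sketch. The data frame, its column names, and the "divide by acres" form of the adjustment are assumptions made for illustration; they are not taken from the Patchmax or forsys source.

```r
# Two hypothetical stands with the same probability of flame length > 8 ft
stands <- data.frame(
  stand_id = c("large", "small"),
  acres    = c(100, 1),
  prob_8p  = c(0.8, 0.8)
)

# Area adjustment as described in the comment: divide the value by stand area
stands$prob_8p_area_adj <- stands$prob_8p / stands$acres

# The raw values tie; the adjusted values put the 1-acre stand far ahead
best_stand <- stands$stand_id[which.max(stands$prob_8p_area_adj)]
print(stands)
print(best_stand)
```

Under these assumptions the two stands are indistinguishable on the raw field, while the adjusted field favors the small stand, which is the behavior the comment objects to.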

codyevers commented 1 year ago

FROM ALAN ON JAN 7TH: Both SPM and area adjustments should be left to a data pre-processing option that the user can do in a separate script that we provide, or with the field calculator in Arc. These two adjustments are not always needed, depending on the data, and they should not be embedded in the code. It’s almost better if users don’t do these adjustments, so they can understand the outputs in terms of the raw data.

For background, the origins of the area adjustments and SPM calculations are from the first BM study, before any of you, or Pedro, were working with forsys. We found that when the Wallowa Whitman mapped polygons, they were generally larger in the north part of the forest, and since larger polygons generally produced more volume when treated, forsys chose those stands. So I used volume per acre as the objective and then reported attainment as the total volume. Reporting attainment on a per area basis is meaningless, so the objective and attainment needed to be 2 different metrics, which at first seemed weird.

As for SPM, when we started using weighted objective functions we found certain combinations of variables could not be blended because one was scaled from 0-1 and the other was scaled from 0-100,000. All of the solutions were weighted to the variable with the larger maximum values. So I calculated the standardized proportion of the mean to equalize the scale of the variables.
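
A small base R sketch of the scaling problem described here. The field names and values are made up, and the rescaling shown (each field as a percentage of its own maximum) is an assumption chosen to match the later remark in this thread that the top stand has an SPM of 100; the package may compute it differently.

```r
# Two hypothetical objectives on very different scales
stands <- data.frame(
  stand_id = 1:4,
  risk     = c(0.9, 0.1, 0.5, 0.2),          # scaled 0-1
  volume   = c(30000, 100000, 40000, 60000)  # scaled 0-100,000
)

# Equal weights on the raw values: volume swamps risk
raw_score <- 1 * stands$risk + 1 * stands$volume

# Assumed SPM form: rescale each field to a percentage of its maximum
spm <- function(x) 100 * x / max(x, na.rm = TRUE)
spm_score <- 1 * spm(stands$risk) + 1 * spm(stands$volume)

# Stand order under each objective, best first
raw_order <- order(-raw_score)
spm_order <- order(-spm_score)
print(raw_order)
print(spm_order)
```

With the raw values, the ranking is identical to ranking on volume alone. After rescaling, the high-risk stand can move up because both fields contribute on a comparable scale.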

The important point is that these adjustments are not always required. For instance, raster data have equal area per cell, so area adjustment is not needed. Single objective runs do not need SPM, nor do multiobjective runs when the scaling of the variables is more or less the same. This is the case with the accell data Google is using, where the data are raster and the objectives are standardized pillars, so they might not need either of these adjustments, and having them in the code just confuses their programmers.

Hopefully this can all get resolved so we can get the Google people a clean version of forsys and then we can explain how to pre-process data when needed.

codyevers commented 1 year ago

My plan:

  1. Remove SPM & PCP calculation from inside ForSys.
  2. Make these external functions that can be called as needed and then fed to forsys::run. External functions will append _PCP and _SPM to variable names.
  3. Delete global_threshold and/or make external to forsys::run
  4. Delete stand_pcp_spm parameter from forsys::run
  5. User manually points to newly created SPM fields if they wish to use them to prioritize.
  6. User will need to manually include PCP fields in scenario_output_fields if they want these output.

For my reference, the current process involving SPM & PCP is:
    • Assign scenario_priorities to stand_pcp_spm if latter is NULL
    • Calculate PCP & SPM for stand_pcp_spm fields & scenario_output_fields
    • Set weighted objective value based on scenario_priorities with SPM suffix
    • Tally all PCP fields in addition to scenario outputs
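
The first three steps of that process can be sketched in base R roughly as follows. This is not the package source: legacy_pcp_spm is a hypothetical helper, and the formulas (PCP as percent of the field total, SPM as percent of the field maximum) are assumptions. The final tally of PCP fields per project is left out.

```r
# Hypothetical helper sketching the legacy in-run steps; not the package's code
legacy_pcp_spm <- function(stands, scenario_priorities,
                           scenario_output_fields, stand_pcp_spm = NULL) {
  # 1. Fall back to the priority fields when stand_pcp_spm is NULL
  if (is.null(stand_pcp_spm)) stand_pcp_spm <- scenario_priorities

  # 2. Calculate PCP and SPM for those fields plus the output fields
  #    (assumed formulas: percent of total and percent of maximum)
  for (f in union(stand_pcp_spm, scenario_output_fields)) {
    stands[[paste0(f, "_PCP")]] <- 100 * stands[[f]] / sum(stands[[f]], na.rm = TRUE)
    stands[[paste0(f, "_SPM")]] <- 100 * stands[[f]] / max(stands[[f]], na.rm = TRUE)
  }

  # 3. The weighted objective draws on the priorities with the _SPM suffix
  priority_spm_fields <- paste0(scenario_priorities, "_SPM")

  list(stands = stands, priority_spm_fields = priority_spm_fields)
}

# Made-up stand data for illustration
demo <- data.frame(priority1 = c(1, 3), TimberVol = c(10, 30))
out <- legacy_pcp_spm(demo, "priority1", "TimberVol")
print(names(out$stands))
```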

codyevers commented 1 year ago

I've removed SPM and PCP calculations from inside forsys::run. These are now external functions (calculate_pcp and calculate_spm) that can be run on the stand data prior to running forsys::run. I've also removed the global_threshold parameter, as this seems to be part of data preparation, not the actual model run. We can return these to forsys::run if desired.

michelledayusfs commented 1 year ago

I'm not sure we want to delete the global_threshold. Remember this is trying to match the binary availability in ForSysX. I get that we could include this in the thresholds (we had a long conversation with Robb about this), but the way that the PCP/SPM is calculated in ForSys is AFTER you have set your threshold (another issue with running SPM/PCP outside of the run function) so that it does not include, for example, wilderness in your treatment totals. For example, if the highest value timber volume is in wilderness, I don't want that included when SPM is calculated. The SPM value of 100 should be in an available stand.

michelledayusfs commented 1 year ago

Previously, the forsys run function would filter out stands based on global_threshold, THEN calculate SPM/PCP. We may need both options: 1) apply the global threshold when calculating SPM/PCP, and 2) also allow a global threshold in the run function.

1. Apply the threshold when calculating SPM/PCP. This has the outcome of forcing users to think about this and specify the landbase explicitly.

Build calculate_spm with a parameter for the availability field, like calculate_spm('timber_vol', 'available')

2. Keep the global threshold in forsys run to help keep the scenario planning language consistent.

Do not filter the unavailable stands out of the data, so patchmax still has adjacency info.

codyevers commented 1 year ago

I've added an availability_txt parameter to both calculate_spm and calculate_pcp. If provided, values in stands that fail the condition are zeroed out and therefore don't contribute to how values are normalized using either PCP or SPM. Stands outside of the threshold are retained. Here's an example using the test_forest data:

# calculate and append SPM and PCP values
library(forsys)  # so calculate_spm() and calculate_pcp() resolve without a prefix
library(dplyr)   # provides the %>% pipe
test_forest <- forsys::test_forest %>% 
  calculate_spm(fields = c("priority1"), availability_txt = 'mosaic1 == 3') %>%
  calculate_pcp(fields = c("priority1","priority2"), availability_txt = 'mosaic1 == 3')

or alternatively...

# calculate and append SPM and PCP values
threshold_txt <- 'mosaic1 == 3'
test_forest <- forsys::test_forest %>% 
  calculate_spm(c("priority1"), threshold_txt) %>%
  calculate_pcp(c("priority1","priority2"), threshold_txt)
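
For readers who want to see what the availability filter amounts to, here is a minimal base R sketch of the described behavior. spm_sketch is a hypothetical stand-in, not the package function, and the percent-of-maximum formula is an assumption; only the idea of evaluating a text condition, zeroing out stands that fail it, and keeping every row comes from the description above.

```r
# Hypothetical stand-in for the behaviour described above; not the package source
spm_sketch <- function(stands, fields, availability_txt = NULL) {
  # Evaluate the availability expression against the stand columns
  available <- if (is.null(availability_txt)) {
    rep(TRUE, nrow(stands))
  } else {
    eval(parse(text = availability_txt), envir = stands)
  }
  for (f in fields) {
    # Zero out unavailable stands so they cannot set the maximum
    x <- ifelse(available, stands[[f]], 0)
    stands[[paste0(f, "_SPM")]] <- 100 * x / max(x, na.rm = TRUE)
  }
  stands  # all rows retained, including unavailable stands
}

# Made-up data: the highest priority1 value sits in an unavailable stand
demo <- data.frame(priority1 = c(8, 4, 2), mosaic1 = c(1, 3, 3))
result <- spm_sketch(demo, "priority1", availability_txt = 'mosaic1 == 3')
print(result)
```

In this sketch the unavailable stand gets an SPM of 0 and the best available stand gets 100, while the row count is unchanged so adjacency information is not lost.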

michelledayusfs commented 1 year ago

Does this only work for integers, or why else would this be failing?

shp <- forsys::shp %>% calculate_spm(fields = c("Am4RevBio", "prob_8p"), availability_txt = 'OwnerClass == USDA FOREST SERVICE') %>% calculate_pcp(fields = c("Am4RevBio", "prob_8p"), availability_txt = 'OwnerClass == USDA FOREST SERVICE')

ERROR: Error in paste0("stands %>% mutate(out = ifelse(", filter_txt, ", TRUE, FALSE)) %>% pull(out)") : object 'filter_txt' not found

Also fails if I try defining the "threshold_txt" as in option 2 above.

michelledayusfs commented 1 year ago

Hmm, I created a new integer field and it still didn't work:

Error in paste0("stands %>% mutate(out = ifelse(", filter_txt, ", TRUE, FALSE)) %>% pull(out)") : object 'filter_txt' not found

codyevers commented 1 year ago

Oops, the variable was mislabeled. This is now fixed.

michelledayusfs commented 1 year ago

Okay, this is working with one modification: the forsys:: prefix has to be attached to each function call and not to the shapefile.

shp <- shp %>% forsys::calculate_spm(fields = c("Am4RevBio", "prob_8p"), availability_txt = 'OwnerClass == "USDA FOREST SERVICE"') %>% forsys::calculate_pcp(fields = c("Am4RevBio", "prob_8p"), availability_txt = 'OwnerClass == "USDA FOREST SERVICE"')

Note also that values for text fields need quotes within the quoted string (unlike integers).
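
The quoting rule follows from the condition being parsed as R code. A short base R sketch, assuming the condition string is evaluated against the stand columns (eval_filter is a hypothetical helper, and the data are made up):

```r
stands <- data.frame(
  OwnerClass = c("USDA FOREST SERVICE", "PRIVATE"),
  mosaic1    = c(3L, 1L)
)

# Hypothetical helper: evaluate a filter string against the stand columns
eval_filter <- function(stands, txt) eval(parse(text = txt), envir = stands)

# Numeric comparison: no inner quotes needed
numeric_ok <- eval_filter(stands, 'mosaic1 == 3')

# Text comparison: the value must itself be quoted inside the string
text_ok <- eval_filter(stands, 'OwnerClass == "USDA FOREST SERVICE"')

# Without inner quotes the string is not valid R and fails to parse
unquoted <- try(eval_filter(stands, 'OwnerClass == USDA FOREST SERVICE'),
                silent = TRUE)
print(inherits(unquoted, "try-error"))
```

Using single quotes for the outer string and double quotes for the value, as in the working call above, avoids having to escape anything.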

michelledayusfs commented 1 year ago

I'll test one more thing before closing. I am going to run ForSys and manually call these SPMs/PCPs, and maybe play the neophyte user to see what happens if I don't understand the different parameter calls like output and priorities.

michelledayusfs commented 1 year ago

This functions as expected now. Variables need to be added manually to both scenario_priorities and scenario_output_fields.