ropensci / allodb

An R package for biomass estimation at extratropical forest plots.
https://docs.ropensci.org/allodb/
GNU General Public License v3.0

Create function to plot predicted biomass against DBH, by species #73

Closed teixeirak closed 3 years ago

teixeirak commented 5 years ago

@maurolepore,

To check the current calculations, and so that future users can visualize the biomass that allodb predicts, we'll want to make the following type of plot:

- x-axis: DBH (cm)
- y-axis: biomass (kg)
- one colored line or series of points for each species at a site, spanning the range of sizes observed at the site
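A minimal sketch of the plot described above, in base R so it runs without extra packages. The two equation strings are copied from the `equations` table that appears later in this thread; the species labels, the DBH ranges, and the helper `predict_biomass()` are hypothetical and not part of allodb. The axis units assume both equations take DBH in cm and return biomass in kg, which is not checked here.

```r
# Toy allometries: equation strings copied from the `equations` table in this
# thread; species labels and DBH ranges are made up for illustration.
toy <- data.frame(
  species = c("Species A", "Species B"),
  equation = c("exp(-2.48+2.4835*log(dbh))", "10^(-1.326+2.762*(log10(dbh)))"),
  dbh_min = c(1, 5),   # smallest DBH observed at the site (hypothetical)
  dbh_max = c(40, 60), # largest DBH observed at the site (hypothetical)
  stringsAsFactors = FALSE
)

# Hypothetical helper: evaluate one equation string over a vector of DBH values.
predict_biomass <- function(equation, dbh) {
  eval(parse(text = equation), envir = list(dbh = dbh))
}

# One row per species and DBH value, spanning each species' observed range.
curves <- do.call(rbind, lapply(seq_len(nrow(toy)), function(i) {
  dbh <- seq(toy$dbh_min[i], toy$dbh_max[i], length.out = 50)
  data.frame(
    species = toy$species[i],
    dbh = dbh,
    biomass = predict_biomass(toy$equation[i], dbh),
    stringsAsFactors = FALSE
  )
}))

# Empty frame first, then one colored line per species.
plot(
  curves$dbh, curves$biomass, type = "n",
  xlab = "DBH (cm, assumed)", ylab = "Predicted biomass (kg, assumed)"
)
species <- unique(curves$species)
for (i in seq_along(species)) {
  one <- curves[curves$species == species[i], ]
  lines(one$dbh, one$biomass, col = i)
}
legend("topleft", legend = species, col = seq_along(species), lty = 1)
```

Drawing each species only over its own observed DBH range keeps the curves from being extrapolated beyond the sizes actually present at the site.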

maurolepore commented 5 years ago

We still don't support generic equations. On one branch of the project I temporarily removed generic equations completely (discussion at https://github.com/forestgeo/allodb/issues/72).

Removing generic equations should produce more accurate biomass estimates for the species we have equations for, but it comes at the cost of losing some species.

Compare (@teixeirak):

teixeirak commented 5 years ago

There are a lot of species that rely on generic equations. What do we need to be able to handle them? Please let me know if there are any barriers you face in order to incorporate them (other than just time to work on this).

maurolepore commented 5 years ago

allo_find() now automatically prefers expert equations but falls back to generic equations that match the exact same site and species. This change isn't yet in the master branch but will be merged soon. For now you can see it in action on this branch.

Notice that trees whose species is given as Genus sp. can't be matched to a species in allodb. We need to support that feature (https://github.com/forestgeo/fgeo.biomass/issues/30).

allo_find(census_species)
#> Assuming `dbh` in [mm] (required to find dbh-specific equations).
#> * Matching equations by site and species.
#> * Refining equations according to dbh.
#> * Using generic equations where expert equations can't be found.
#> Warning:   Can't find equations matching these species:
#>   acer sp, carya sp, crataegus sp, fraxinus sp, juniperus virginiana, quercus prinus, quercus sp, ulmus sp, unidentified unk
#> Warning: Can't find equations for 17132 rows (inserting `NA`).
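
The fallback rule described in this comment can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code behind `allo_find()`: the tables `lookup` and `census`, the equation ids, and the helper `prefer_expert()` are all hypothetical.

```r
# Hypothetical lookup table: one row per site/species/equation.
lookup <- data.frame(
  site = c("scbi", "scbi", "scbi", "serc"),
  species = c("acer rubrum", "acer rubrum", "carya sp", "quercus alba"),
  equation_group = c("Expert", "Generic", "Generic", "Expert"),
  equation_id = c("eq_a", "eq_b", "eq_c", "eq_d"), # made-up ids
  stringsAsFactors = FALSE
)

# Hypothetical census: the last species has no equation at this site.
census <- data.frame(
  site = "scbi",
  species = c("acer rubrum", "carya sp", "ulmus sp"),
  stringsAsFactors = FALSE
)

# Keep expert rows; keep generic rows only for site/species pairs
# that have no expert row at all.
prefer_expert <- function(lookup) {
  key <- paste(lookup$site, lookup$species, sep = "|")
  is_expert <- lookup$equation_group == "Expert"
  has_expert <- key %in% key[is_expert]
  lookup[is_expert | !has_expert, ]
}

# Left join, so unmatched census rows are kept with `NA`.
matched <- merge(
  census, prefer_expert(lookup),
  by = c("site", "species"), all.x = TRUE
)

# Report species that found no equation, similar in spirit to the warning above.
unmatched <- matched$species[is.na(matched$equation_id)]
if (length(unmatched) > 0) {
  message("Can't find equations matching these species: ", toString(unmatched))
}
```

Here "acer rubrum" keeps only its expert equation, "carya sp" falls back to the generic one for the same site, and "ulmus sp" is left as `NA`.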
teixeirak commented 5 years ago

Great!

There may be cases where “Genus sp.” would have a site-specific equation. That would be something for Erika to comment on / deal with. For now it will be fine for them to fall back on generic equations for any site (which of course remain to be entered, and I’d be happy to enter some examples if needed).
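
As a rough illustration of that fallback, the sketch below matches genus-only names to genus-level generic equations regardless of site. The table `genus_equations`, its equation ids, and the matching rule are assumptions made for this example; they do not describe how allodb or fgeo.biomass implement it.

```r
# Hypothetical genus-level generic equations that apply at any site.
genus_equations <- data.frame(
  genus = c("carya", "quercus"),
  equation_id = c("eq_carya", "eq_quercus"), # made-up ids
  stringsAsFactors = FALSE
)

# Census names in the lowercase style of the warning shown earlier.
census_species <- tolower(c("Carya sp.", "quercus sp", "acer rubrum", "unidentified unk"))

# A name is genus-only when it ends in " sp" or " sp."
# (the dot is escaped so it is matched literally).
is_genus_only <- grepl(" sp\\.?$", census_species)

# The genus is the first word of the name.
genus <- sub(" .*$", "", census_species)

# Look up a genus-level equation only for genus-only names;
# full species names are left as `NA` for the usual site/species matching.
equation_id <- ifelse(
  is_genus_only,
  genus_equations$equation_id[match(genus, genus_equations$genus)],
  NA_character_
)
```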

maurolepore commented 5 years ago

> There may be cases where “Genus sp.” would have a site-specific equation.

I see some examples of this already.

library(tidyverse)
library(allodb)

master() %>% 
  select(site, species, equation_group) %>% 
  filter(equation_group == "Generic") %>% 
  filter(str_detect(species, " sp\\.$")) %>% 
  distinct()
#> Joining `equations` and `sitespecies` by 'equation_id'; then `sites_info` by 'site'.
#> # A tibble: 15 x 3
#>    site           species       equation_group
#>    <chr>          <chr>         <chr>         
#>  1 serc           Ligustrum sp. Generic       
#>  2 lilly dicky    Crataegus sp. Generic       
#>  3 serc           Vaccinium sp. Generic       
#>  4 tyson          Crataegus sp. Generic       
#>  5 umbc           Crataegus sp. Generic       
#>  6 harvard forest Crataegus sp. Generic       
#>  7 serc           Quercus sp.   Generic       
#>  8 yosemite       Salix sp.     Generic       
#>  9 wind river     Abies sp.     Generic       
#> 10 yosemite       Abies sp.     Generic       
#> 11 lilly dicky    Carya sp.     Generic       
#> 12 scbi           Carya sp.     Generic       
#> 13 serc           Carya sp.     Generic       
#> 14 tyson          Carya sp.     Generic       
#> 15 umbc           Carya sp.     Generic

> ... fall back on generic equations for any site ... I’d be happy to enter some examples if needed

Okay, with two examples I should have enough to build the logic.

teixeirak commented 5 years ago

I will provide a couple examples soon.

teixeirak commented 5 years ago

Should I put the examples in this table? That is, do edits go to the .csv files in raw data, as opposed to the R tables? (Note that I just fixed a typo in that file, assuming that will make it into R tables.)

maurolepore commented 5 years ago

If by "example" we mean a mock, then you may simply give me the names of a couple of allometries and tell me which sites and taxa they should match. I can create a toy dataset to build some code around.

If instead by "example" we mean something production-ready (i.e. not a toy but something that can really be used), then the files you want to edit live in the directory data-raw/csv_database/ (https://github.com/forestgeo/allodb/blob/master/data-raw/csv_database/). In this case you'll need to enter new rows in the equations table, each with an equation_id from this file (please read this short explanation), the taxa and corresponding allometry, and all other information needed to complete the row. You will also need to enter the exact same equation_id in the sitespecies table (because equation_id is the key linking the sitespecies and equations tables), along with the site and other information needed to complete the row. Below is a short view of the most important columns in each of those two tables. We can tag your commits to make sure that Erika can review them later.

RE:

> That is, do edits go to the .csv files in raw data, as opposed to the R tables?

That's right. The census tables are provided by users and, after some wrangling, matched with the allodb tables by the values in the columns "species" and "site".

library(allodb)
library(tidyverse)

list(equations = equations, sitespecies = sitespecies) %>% 
  map(~ select(.x, matches("^site|^species|^equation|allometry$")))
#> $equations
#> # A tibble: 175 x 3
#>    equation_id equation_allometry                equation_form          
#>    <chr>       <chr>                             <chr>                  
#>  1 2060ea      10^(1.1891+1.419*(log10(dbh^2)))  10^(a+b*(log10(dbh^c)))
#>  2 a4d879      10^(1.2315+1.6376*(log10(dbh^2))) 10^(a+b*(log10(dbh^c)))
#>  3 c59e03      exp(7.217+1.514*log(dbh))         exp(a+b*log(dbh))      
#>  4 96c0af      10^(2.5368+1.3197*(log10(dbh)))   10^(a+b*(log10(dbh)))  
#>  5 529234      10^(2.0865+0.9449*(log10(dbh)))   10^(a+b*(log10(dbh)))  
#>  6 ae65ed      exp(-2.48+2.4835*log(dbh))        exp(a+b*log(dbh))      
#>  7 9c4cc9      10^(-1.326+2.762*(log10(dbh)))    10^(a+b*(log10(dbh)))  
#>  8 7f7777      exp(-2.5095+2.5437*log(dbh))      exp(a+b*log(dbh))      
#>  9 cf733d      exp(5.67+1.97*log(dbh))           exp(a+b*log(dbh))      
#> 10 f08fff      exp(-2.2118+2.4133*log(dbh))      exp(a+b*log(dbh))      
#> # ... with 165 more rows
#> 
#> $sitespecies
#> # A tibble: 772 x 6
#>    site    species    species_code equation_group equation_id equation_taxa
#>    <chr>   <chr>      <chr>        <chr>          <chr>       <chr>        
#>  1 Lilly ~ Acer rubr~ 316          Expert         7c72ed      Acer rubrum  
#>  2 Lilly ~ Acer rubr~ 316          Expert         2060ea      Acer rubrum  
#>  3 Lilly ~ Acer sacc~ 318          Expert         a4d879      Acer sacchar~
#>  4 Lilly ~ Amelanchi~ 356          Expert         c59e03      Amelanchier  
#>  5 Lilly ~ Amelanchi~ 356          Expert         96c0af      Amelanchier  
#>  6 Lilly ~ Amelanchi~ 356          Expert         529234      Amelanchier  
#>  7 Lilly ~ Asimina t~ 367          Generic        ae65ed      Mixed hardwo~
#>  8 Lilly ~ Carpinus ~ 391          Generic        ae65ed      Mixed hardwo~
#>  9 Lilly ~ Carya alba 409          Expert         9c4cc9      Carya        
#> 10 Lilly ~ Carya cor~ 402          Expert         9c4cc9      Carya        
#> # ... with 762 more rows

Created on 2019-03-26 by the reprex package (v0.2.1)
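
Because `equation_id` is the key linking the two tables, a quick consistency check before committing new rows can catch a mistyped id. The sketch below uses small toy tables: the two equation ids and allometries are copied from the output above, while the site name, the second and third species labels, and the deliberately mistyped id are hypothetical.

```r
# Toy `equations` table; ids and allometries copied from the output above.
toy_equations <- data.frame(
  equation_id = c("ae65ed", "9c4cc9"),
  equation_allometry = c(
    "exp(-2.48+2.4835*log(dbh))",
    "10^(-1.326+2.762*(log10(dbh)))"
  ),
  stringsAsFactors = FALSE
)

# Toy `sitespecies` table; the last row has a mistyped id on purpose.
toy_sitespecies <- data.frame(
  site = "site_x", # hypothetical site name
  species = c("Carya alba", "Species B", "Species C"),
  equation_id = c("9c4cc9", "ae65ed", "ae65e"),
  stringsAsFactors = FALSE
)

# Ids used in `sitespecies` that have no matching row in `equations`.
orphans <- setdiff(toy_sitespecies$equation_id, toy_equations$equation_id)

# Inner join on the key: rows with an orphan id silently drop out,
# which is why checking `orphans` first is useful.
joined <- merge(toy_sitespecies, toy_equations, by = "equation_id")
```

An empty `orphans` vector means every `sitespecies` row can find its allometry; here it contains the mistyped id, and `joined` has only the two valid rows.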

teixeirak commented 5 years ago

I added some (real) generic equation entries to the sitespecies table (this commit). These were equations that were already in the database, not totally new equations. For reasons relating to issue #85, I haven't checked whether they would represent the range of scenarios you may encounter, or even ever be called upon with the current set of ForestGEO sites. You can start with that and let me know if you need others.

maurolepore commented 5 years ago

Your commit is now tagged, which makes it easy to find and revert.


I'll follow up on this at https://github.com/forestgeo/allodb/issues/72

maurolepore commented 5 years ago

We now support shrubs. You can see the updated biomass analysis at http://bit.ly/demo-dbh-vs-biomass

After your comments I should soon be able to close https://github.com/forestgeo/allodb/issues/41. I'm only unsure about how to interpret dbh_min_cm when using shrub equations whose independent variable is dba. My question is at https://github.com/forestgeo/allodb/issues/41#issuecomment-480003361

maurolepore commented 5 years ago

@teixeirak, and @gonzalezeb, FYI:

gonzalezeb commented 3 years ago

This issue is handled by the functions get_biomass and illustrate_allodb.