Closed (teixeirak closed this issue 3 years ago)
We still don't support generic equations. On one branch of the project I temporarily removed generic equations completely (discussion at https://github.com/forestgeo/allodb/issues/72).
Removing generic equations should produce more accurate biomass estimates for the species we have equations for, but it comes at the cost of losing some species.
Compare (@teixeirak):
There are a lot of species that rely on generic equations. What do we need to be able to handle them? Please let me know if there are any barriers you face in order to incorporate them (other than just time to work on this).
allo_find() now automatically prefers expert equations but falls back to generic equations that match the exact same site and species. This change isn't yet in the master branch but will be merged soon. For now you can see it in action on this branch.
Note that trees whose species is given as Genus sp. can't find a matching species in allodb. We need to support that feature (https://github.com/forestgeo/fgeo.biomass/issues/30).
allo_find(census_species)
#> Assuming `dbh` in [mm] (required to find dbh-specific equations).
#> * Matching equations by site and species.
#> * Refining equations according to dbh.
#> * Using generic equations where expert equations can't be found.
#> Warning: Can't find equations matching these species:
#> acer sp, carya sp, crataegus sp, fraxinus sp, juniperus virginiana, quercus prinus, quercus sp, ulmus sp, unidentified unk
#> Warning: Can't find equations for 17132 rows (inserting `NA`).
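The "prefer expert, fall back to generic" rule described above could be sketched in base R roughly like this. It is only an illustration on a toy table, not the actual allo_find() code: the helper pick_equations(), the site and species names, and the equation ids are all hypothetical.

```r
# Toy stand-in for the joined allodb tables (all rows are hypothetical).
toy_eqn <- data.frame(
  site = c("site_a", "site_a", "site_a"),
  species = c("species one", "species one", "species two"),
  equation_group = c("Expert", "Generic", "Generic"),
  equation_id = c("expert_01", "generic_01", "generic_01"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each site-species pair keep the expert rows
# if there are any, otherwise keep the generic rows.
pick_equations <- function(eqn) {
  key <- paste(eqn$site, eqn$species, sep = "|")
  is_expert <- eqn$equation_group == "Expert"
  has_expert <- ave(is_expert, key, FUN = any)
  keep <- is_expert | (!has_expert & eqn$equation_group == "Generic")
  eqn[keep, , drop = FALSE]
}

picked <- pick_equations(toy_eqn)
```

Here "species one" keeps only its expert row, while "species two", which has no expert equation at that site, keeps its generic row.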
Great!
There may be cases where “Genus sp.” would have a site-specific equation. That would be something for Erika to comment on / deal with. For now it will be fine for them to fall back on generic equations for any site (which of course remain to be entered, and I’d be happy to enter some examples if needed).
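As a rough sketch of that any-site fallback, the snippet below flags names of the form "Genus sp" or "Genus sp." and looks up a generic equation by genus alone. The lookup table generic_by_genus and its equation ids are made up for illustration; the real entries remain to be added to allodb.

```r
# Hypothetical generic equations keyed by genus only (valid at any site).
generic_by_genus <- data.frame(
  genus = c("carya", "quercus"),
  equation_id = c("generic_carya", "generic_quercus"),
  stringsAsFactors = FALSE
)

census_species <- c("carya sp", "quercus sp.", "acer rubrum", "Ulmus sp.")

# A name counts as "Genus sp." if it is one word followed by "sp" or "sp.".
is_genus_only <- grepl("^[[:alpha:]]+ sp\\.?$", census_species)

# Extract the lower-case genus and look up a generic equation for it;
# names that are not genus-only, or whose genus is missing, get NA.
genus <- tolower(sub(" .*$", "", census_species))
matched_id <- ifelse(
  is_genus_only,
  generic_by_genus$equation_id[match(genus, generic_by_genus$genus)],
  NA_character_
)
```

A site-specific "Genus sp." equation, where one exists, would need to take priority over this lookup.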
There may be cases where “Genus sp.” would have a site-specific equation.
I see some examples of this already.
library(tidyverse)
library(allodb)
master() %>%
  select(site, species, equation_group) %>%
  filter(equation_group == "Generic") %>%
  # Escape the dot so it matches a literal "." rather than any character
  filter(str_detect(species, " sp\\.$")) %>%
  distinct()
#> Joining `equations` and `sitespecies` by 'equation_id'; then `sites_info` by 'site'.
#> # A tibble: 15 x 3
#>    site           species       equation_group
#>    <chr>          <chr>         <chr>
#>  1 serc           Ligustrum sp. Generic
#>  2 lilly dicky    Crataegus sp. Generic
#>  3 serc           Vaccinium sp. Generic
#>  4 tyson          Crataegus sp. Generic
#>  5 umbc           Crataegus sp. Generic
#>  6 harvard forest Crataegus sp. Generic
#>  7 serc           Quercus sp.   Generic
#>  8 yosemite       Salix sp.     Generic
#>  9 wind river     Abies sp.     Generic
#> 10 yosemite       Abies sp.     Generic
#> 11 lilly dicky    Carya sp.     Generic
#> 12 scbi           Carya sp.     Generic
#> 13 serc           Carya sp.     Generic
#> 14 tyson          Carya sp.     Generic
#> 15 umbc           Carya sp.     Generic
... fall back on generic equations for any site ... I’d be happy to enter some examples if needed
Okay, with two examples I should have enough to build the logic.
I will provide a couple examples soon.
Should I put the examples in this table? That is, do edits go to the .csv files in raw data, as opposed to the R tables? (Note that I just fixed a typo in that file, assuming that will make it into R tables.)
If by "example" we mean a mock, then you may simply give me the names of a couple of allometries and tell me which sites and taxa they should match. I can create a toy dataset to build some code around.
If instead by "example" we mean something that is production ready (i.e. not a toy but something that can really be used), then the files you want to edit live in the directory data-raw/csv_database/ (https://github.com/forestgeo/allodb/blob/master/data-raw/csv_database/). In this case you'll need to enter new rows in the equations table, each row with an equation_id from this file (please read this short explanation), the taxa and corresponding allometry, and all other relevant information to complete the row. You will also need to enter the exact same equation_id in the sitespecies table (because equation_id is the key linking the sitespecies table with the equations table), along with the site and other relevant information to complete the row. Below is a short view of the most important columns in each of those two tables. In this case we can tag your commits to make sure that Erika can later review them.
RE: That is, do edits go to the .csv files in raw data, as opposed to the R tables?
That's right. The census tables are given by the users and, after some wrangling, matched with the allodb tables by the values in the columns "species" and "site".
library(allodb)
library(tidyverse)
list(equations = equations, sitespecies = sitespecies) %>%
map(~ select(.x, matches("^site|^species|^equation|allometry$")))
#> $equations
#> # A tibble: 175 x 3
#> equation_id equation_allometry equation_form
#> <chr> <chr> <chr>
#> 1 2060ea 10^(1.1891+1.419*(log10(dbh^2))) 10^(a+b*(log10(dbh^c)))
#> 2 a4d879 10^(1.2315+1.6376*(log10(dbh^2))) 10^(a+b*(log10(dbh^c)))
#> 3 c59e03 exp(7.217+1.514*log(dbh)) exp(a+b*log(dbh))
#> 4 96c0af 10^(2.5368+1.3197*(log10(dbh))) 10^(a+b*(log10(dbh)))
#> 5 529234 10^(2.0865+0.9449*(log10(dbh))) 10^(a+b*(log10(dbh)))
#> 6 ae65ed exp(-2.48+2.4835*log(dbh)) exp(a+b*log(dbh))
#> 7 9c4cc9 10^(-1.326+2.762*(log10(dbh))) 10^(a+b*(log10(dbh)))
#> 8 7f7777 exp(-2.5095+2.5437*log(dbh)) exp(a+b*log(dbh))
#> 9 cf733d exp(5.67+1.97*log(dbh)) exp(a+b*log(dbh))
#> 10 f08fff exp(-2.2118+2.4133*log(dbh)) exp(a+b*log(dbh))
#> # ... with 165 more rows
#>
#> $sitespecies
#> # A tibble: 772 x 6
#> site species species_code equation_group equation_id equation_taxa
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Lilly ~ Acer rubr~ 316 Expert 7c72ed Acer rubrum
#> 2 Lilly ~ Acer rubr~ 316 Expert 2060ea Acer rubrum
#> 3 Lilly ~ Acer sacc~ 318 Expert a4d879 Acer sacchar~
#> 4 Lilly ~ Amelanchi~ 356 Expert c59e03 Amelanchier
#> 5 Lilly ~ Amelanchi~ 356 Expert 96c0af Amelanchier
#> 6 Lilly ~ Amelanchi~ 356 Expert 529234 Amelanchier
#> 7 Lilly ~ Asimina t~ 367 Generic ae65ed Mixed hardwo~
#> 8 Lilly ~ Carpinus ~ 391 Generic ae65ed Mixed hardwo~
#> 9 Lilly ~ Carya alba 409 Expert 9c4cc9 Carya
#> 10 Lilly ~ Carya cor~ 402 Expert 9c4cc9 Carya
#> # ... with 762 more rows
Created on 2019-03-26 by the reprex package (v0.2.1)
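Because equation_id is the key that links the two tables, one quick sanity check after entering new rows might look like the sketch below. It uses toy tables rather than the real .csv files; the two equation ids and allometries are copied from the output above, while the site names, the second species, and the id "zzzzzz" are hypothetical.

```r
# Toy equations table: ids and allometries taken from the output above.
equations_toy <- data.frame(
  equation_id = c("ae65ed", "9c4cc9"),
  equation_allometry = c(
    "exp(-2.48+2.4835*log(dbh))",
    "10^(-1.326+2.762*(log10(dbh)))"
  ),
  stringsAsFactors = FALSE
)

# Toy sitespecies table: the last row carries an id that was never
# entered in the equations table (hypothetical data-entry slip).
sitespecies_toy <- data.frame(
  site = c("site_a", "site_a", "site_b"),
  species = c("Carya alba", "hypothetical sp.", "hypothetical sp."),
  equation_id = c("9c4cc9", "ae65ed", "zzzzzz"),
  stringsAsFactors = FALSE
)

# Ids used in sitespecies that have no counterpart in equations.
missing_ids <- setdiff(sitespecies_toy$equation_id, equations_toy$equation_id)

# Left join: rows with an unknown id get NA in equation_allometry.
joined <- merge(sitespecies_toy, equations_toy, by = "equation_id", all.x = TRUE)
```

An empty missing_ids would suggest that every new sitespecies row points at an existing equation.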
I added some (real) generic equation entries to the sitespecies table (this commit). These were equations that were already in the database, not totally new equations. For reasons relating to issue #85, I haven't checked whether they would represent the range of scenarios you may encounter, or even ever be called upon with the current set of ForestGEO sites. You can start with that and let me know if you need others.
Your commit is now tagged, meaning that it's easy to find and revert.
I'll follow up on this at https://github.com/forestgeo/allodb/issues/72
We now support shrubs. You can see the updated biomass analysis at http://bit.ly/demo-dbh-vs-biomass
After your comments I should soon be able to close https://github.com/forestgeo/allodb/issues/41
I'm only unsure about how to interpret dbh_min_cm when using shrub equations whose independent variable is dba. My question is at https://github.com/forestgeo/allodb/issues/41#issuecomment-480003361
@teixeirak and @gonzalezeb, FYI: dbh versus biomass is now on the README.
This issue is handled by the functions get_biomass and illustrate_allodb.
@maurolepore,
In order to check current calculations, and for future users to be able to visualize what allodb is giving in terms of predicted biomass, we'll want to make the following plot type: x-axis, DBH (cm); y-axis, biomass (kg); one colored line or series of points for each species at a site, spanning the range of sizes observed at the site.
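A minimal base R sketch of such a plot, for illustration only: it evaluates two allometry strings copied from the equations table earlier in this thread over a made-up dbh range. It assumes these equations take dbh in cm and return biomass in kg, which is not confirmed here, and it labels lines by equation_id rather than by species.

```r
# Two allometries copied from the equations table shown earlier.
allometries <- c(
  ae65ed = "exp(-2.48+2.4835*log(dbh))",
  `9c4cc9` = "10^(-1.326+2.762*(log10(dbh)))"
)

# Assumed size range; a real plot would use the sizes observed at the site.
dbh <- seq(1, 50, by = 1)

# Evaluate each allometry string with `dbh` in scope; the result is a
# matrix with one row per dbh value and one column per equation.
biomass <- sapply(
  allometries,
  function(eqn) eval(parse(text = eqn), list(dbh = dbh))
)

# One colored line per equation.
matplot(
  dbh, biomass,
  type = "l", lty = 1, col = seq_along(allometries),
  xlab = "DBH (cm)", ylab = "biomass (kg)"
)
legend(
  "topleft",
  legend = names(allometries), lty = 1, col = seq_along(allometries)
)
```

Mapping each line to a species would require the site-specific join between sitespecies and equations.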