PecanProject / pecan

The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.
www.pecanproject.org
Other
202 stars 230 forks source link

Add script that extracts Soilgrids SOC based on location of sites #3040

Closed Qianyuxuan closed 1 year ago

Qianyuxuan commented 1 year ago

Description

Add a script that extracts total soil organic carbon and its standard deviation based on user-defined site location from SoilGrids250m version 2.0: https://soilgrids.org

Motivation and Context

To provide SOC observations for SDA

Review Time Estimate

Types of changes

Checklist:

Qianyuxuan commented 1 year ago

Thanks Chris @infotroph ! Great suggestion! I did test and found out "dplyr::pivot_longer" works well to replace "reshape2::melt". I will update my PR soon.

On Sep 29, 2022, at 9:10 PM, Chris Black @.***> wrote:

@infotroph commented on this pull request.

In modules/data.land/R/soilgrids_soc_extraction.R https://github.com/PecanProject/pecan/pull/3040#discussion_r984191186:

  • rownames(ocdquant) <- depths
  • colnames(ocdquant) <- c(rep("Mean", length(internal_site_info$site_info.lon)),
  • rep("0.05", length(internal_site_info$site_info.lon)),
  • rep("0.5", length(internal_site_info$site_info.lon)),
  • rep("0.95", length(internal_site_info$site_info.lon)))
  • rownames(siteid) <- depths
  • colnames(siteid) <- c(rep("Mean", length(internal_site_info$site_info.lon)),
  • rep("0.05", length(internal_site_info$site_info.lon)),
  • rep("0.5", length(internal_site_info$site_info.lon)),
  • rep("0.95", length(internal_site_info$site_info.lon)))
  • if (verbose) {
  • close(pb)
  • }
  • parse extracted data and prepare for output

  • ocd_fit <- reshape2::melt(ocdquant, id.vars = c("Mean")) Low priority: How difficult would it be to rewrite these manipulations to use something like dplyr::pivot_longer instead of adding reshape2 to the dependency list? data.land has a beastly number of dependencies already, so when there's a choice I recommend preferring solutions that use existing imports rather than add new ones.

— Reply to this email directly, view it on GitHub https://github.com/PecanProject/pecan/pull/3040#pullrequestreview-1126224337, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFCCZ73TGII3OLAXHKGWLI3WAZR4DANCNFSM6AAAAAAQNCFL34. You are receiving this because you were mentioned.

serbinsh commented 1 year ago

Looks like reshape2 is indeed no longer a depends

[sserbin@modex data.land]$ grep -ri "reshape" .
./inst/FIA_allometry/README.Rmd:# Note: the "reshape" package is currently required, although this may
./inst/FIA_allometry/README.Rmd:library(reshape)
./inst/FIA_allometry/allometry.r:# Note: the "reshape" package is currently required, although this may
./inst/FIA_allometry/allometry.r:  #install.packages("reshape")
./inst/FIA_allometry/allometry.r:library(reshape)
./inst/FIA_allometry/biomass.r:# Note: the "reshape" package is currently required, although this may
./inst/FIA_allometry/biomass.r:library(reshape)
./DESCRIPTION:    reshape2,

but we still have some reshape requirements that could be updated at some point

mdietze commented 1 year ago

@Qianyuxuan here are the most recent build errors.

 checking package dependencies ... NOTE: Imports includes 33 non-default packages.
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract: no visible binding for global variable 'name'
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract : cgamma: no visible global function definition for 'qgamma'
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract : fitQ: no visible global function definition for 'optim'
serbinsh commented 1 year ago

@Qianyuxuan here are the most recent build errors.

 checking package dependencies ... NOTE: Imports includes 33 non-default packages.
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract: no visible binding for global variable 'name'
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract : cgamma: no visible global function definition for 'qgamma'
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract : fitQ: no visible global function definition for 'optim'

@Qianyuxuan here are the most recent build errors.

 checking package dependencies ... NOTE: Imports includes 33 non-default packages.
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract: no visible binding for global variable 'name'
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract : cgamma: no visible global function definition for 'qgamma'
  checking R code for possible problems ... NOTE: soilgrids_soilC_extract : fitQ: no visible global function definition for 'optim'

So "qgamma" is an R base function yes, same with optim yes (well its stats::optim). Not sure what "name" is referring too

serbinsh commented 1 year ago

@mdietze in the current errors, I dont understand what check is referring to with

NOTE: soilgrids_soilC_extract: no visible binding for global variable 'name'

All I could find was in the docs

' @name soilgrids_soilC_extract

Is that what its referring too?

infotroph commented 1 year ago

Thanks! When I made the suggestion I may possibly have been forgetting that pivot_longer lived in tidyr instead of dplyr and therefore still meant adding a new dependency, but I like this change anyway :)