Closed brownag closed 8 months ago
Added subrule names, along with reason class and rating. The format is `{SUBRULE} "{REASON}" ({RATING}); {SUBRULE} "{REASON}" ({RATING});`. The format itself is consistent, but it is a pain to parse if you really need those values. Also, the order of the subrules within the string can differ from row to row.
Could potentially add an optional argument to `get_SDA_interpretation()` that post-processes all of the reason fields and "widens" the data.frame result accordingly, adding one column per subrule rating. Similar to the example in #303.
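For anyone who needs to parse these strings by hand, a minimal base R sketch follows. `parse_reason()` is a hypothetical helper, not part of soilDB, and the regular expression assumes that each entry has exactly the `{SUBRULE} "{REASON}" ({RATING})` shape and that reasons never contain double quotes. Matching whole entries avoids splitting on `;`, which can also occur inside a subrule name.

```r
# Hypothetical helper (not part of soilDB): split one "reason" string
# into one row per subrule.
parse_reason <- function(x) {
  empty <- data.frame(subrule = character(0), reason = character(0),
                      rating = numeric(0), stringsAsFactors = FALSE)
  if (length(x) != 1 || is.na(x)) return(empty)

  # one entry: lazily matched subrule name, quoted reason, rating in parentheses
  pattern <- '\\s*(.+?) "([^"]*)" \\(([^()]*)\\)(?:;|$)'
  entries <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  parts <- regmatches(entries, regexec(pattern, entries, perl = TRUE))

  data.frame(
    subrule = vapply(parts, `[`, character(1), 2),
    reason  = vapply(parts, `[`, character(1), 3),
    # non-numeric ratings would become NA (with a warning)
    rating  = as.numeric(vapply(parts, `[`, character(1), 4)),
    stringsAsFactors = FALSE
  )
}

x <- paste0(
  'Ponded > 4 hours "Ponding" (1); ',
  'Strength (AASHTO Group Index Weighted Average (25-100cm)) "Low strength" (0.955)'
)
parse_reason(x)
```

Because the subrules are returned by name, the result does not depend on the order in which they appear in the string.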
Added argument `wide_reason` (default `FALSE`) to `get_SDA_interpretation()`. If `TRUE`, the function does some post-processing: it parses the string contents of the `reason_*` fields from the result and adds a new column for each subrule rating within each main rule.
So, now you can quickly obtain ready-to-use subrule ratings for arbitrary interps, which should adequately cover the needs from #303.
library(soilDB)
x <- get_SDA_interpretation(
  rulename = c("NCCPI - National Commodity Crop Productivity Index (Ver 3.0)",
               "AGR - Pesticide Loss Potential-Leaching",
               "ENG - Local Roads and Streets"),
  method = "Dominant Component", not_rated_value = "Not rated",
  mukeys = c("242963", "242964", "242965"), wide_reason = TRUE
)
x
#> mukey cokey areasymbol musym
#> 1 242963 23671045 IL019 152A
#> 2 242964 23670915 IL019 134A
#> 3 242965 23671016 IL019 154A
#> muname compname compkind comppct_r
#> 1 Drummer silty clay loam, 0 to 2 percent slopes Drummer Series 94
#> 2 Camden silt loam, 0 to 2 percent slopes Camden Series 92
#> 3 Flanagan silt loam, 0 to 2 percent slopes Flanagan Series 95
#> majcompflag rating_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 Yes 0.826
#> 2 Yes 0.917
#> 3 Yes 0.899
#> class_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 High inherent productivity
#> 2 High inherent productivity
#> 3 High inherent productivity
#> reason_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 Impacted soil "No limitation" (0); NCCPI - NCCPI Cotton Submodel (II) "Cotton" (0.001); NCCPI - NCCPI Small Grains Submodel (II) "Small grains" (0.687); NCCPI - NCCPI Soybeans Submodel (I) "Soybeans" (0.752); NCCPI - NCCPI Corn Submodel (I) "Corn" (0.826)
#> 2 NCCPI - NCCPI Cotton Submodel (II) "Cotton" (0.001); NCCPI - NCCPI Soybeans Submodel (I) "Soybeans" (0.777); NCCPI - NCCPI Small Grains Submodel (II) "Small grains" (0.791); NCCPI - NCCPI Corn Submodel (I) "Corn" (0.917); Impacted soil "No limitation" (0)
#> 3 Impacted soil "No limitation" (0); NCCPI - NCCPI Cotton Submodel (II) "Cotton" (0.001); NCCPI - NCCPI Small Grains Submodel (II) "Small grains" (0.734); NCCPI - NCCPI Soybeans Submodel (I) "Soybeans" (0.76); NCCPI - NCCPI Corn Submodel (I) "Corn" (0.899)
#> rating_AGRPesticideLossPotentialLeaching
#> 1 Not rated
#> 2 Not rated
#> 3 Not rated
#> class_AGRPesticideLossPotentialLeaching
#> 1 <NA>
#> 2 <NA>
#> 3 <NA>
#> reason_AGRPesticideLossPotentialLeaching rating_ENGLocalRoadsandStreets
#> 1 <NA> 1
#> 2 <NA> 1
#> 3 <NA> 1
#> class_ENGLocalRoadsandStreets
#> 1 Very limited
#> 2 Very limited
#> 3 Very limited
#> reason_ENGLocalRoadsandStreets
#> 1 Ponded > 4 hours "Ponding" (1); Wet, Ground Water Near the Surface (30 - 75cm) "Depth to saturated zone" (1); Potential Frost Action > Low "Frost action" (1); Strength (AASHTO Group Index Weighted Average (25-100cm)) "Low strength" (1); Shrink-Swell (LEP WTD_AVG 25-100cm or Bedrock) "Shrink-swell" (0.37)
#> 2 Potential Frost Action > Low "Frost action" (1); Strength (AASHTO Group Index Weighted Average (25-100cm)) "Low strength" (0.955); Shrink-Swell (LEP WTD_AVG 25-100cm or Bedrock) "Shrink-swell" (0.375)
#> 3 Strength (AASHTO Group Index Weighted Average (25-100cm)) "Low strength" (1); Shrink-Swell (LEP WTD_AVG 25-100cm or Bedrock) "Shrink-swell" (0.894); Wet, Ground Water Near the Surface (30 - 75cm) "Depth to saturated zone" (0.746); Potential Frost Action > Low "Frost action" (0.5)
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_Impactedsoil
#> 1 0
#> 2 0
#> 3 0
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPICottonSubmodelII
#> 1 0.001
#> 2 0.001
#> 3 0.001
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPISmallGrainsSubmodelII
#> 1 0.687
#> 2 0.791
#> 3 0.734
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPISoybeansSubmodelI
#> 1 0.752
#> 2 0.777
#> 3 0.76
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPICornSubmodelI
#> 1 0.826
#> 2 0.917
#> 3 0.899
#> rating_reason_AGRPesticideLossPotentialLeaching_Notrated
#> 1 NA
#> 2 NA
#> 3 NA
#> rating_reason_ENGLocalRoadsandStreets_Ponded4hours
#> 1 1
#> 2 <NA>
#> 3 <NA>
#> rating_reason_ENGLocalRoadsandStreets_WetGroundWaterNeartheSurface3075cm
#> 1 1
#> 2 <NA>
#> 3 0.746
#> rating_reason_ENGLocalRoadsandStreets_PotentialFrostActionLow
#> 1 1
#> 2 1
#> 3 0.5
#> rating_reason_ENGLocalRoadsandStreets_StrengthAASHTOGroupIndexWeightedAverage25100cm
#> 1 1
#> 2 0.955
#> 3 1
#> rating_reason_ENGLocalRoadsandStreets_ShrinkSwellLEPWTDAVG25100cmorBedrock
#> 1 0.37
#> 2 0.375
#> 3 0.894
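The widening step shown in the output above can be approximated offline. The following is a rough sketch, not soilDB's implementation: `ratings_by_subrule()` is a hypothetical helper, the input strings are shortened from the output above, and column names are left uncleaned for readability. A subrule that is absent from a row becomes `NA`, as in the `Ponded4hours` column.

```r
# Shortened reason strings, one per map unit (taken from the example output)
reasons <- c(
  'Ponded > 4 hours "Ponding" (1); Potential Frost Action > Low "Frost action" (1)',
  'Potential Frost Action > Low "Frost action" (1)',
  'Potential Frost Action > Low "Frost action" (0.5)'
)

# one entry: subrule name, quoted reason, rating in parentheses
pattern <- '\\s*(.+?) "([^"]*)" \\(([^()]*)\\)(?:;|$)'

# Hypothetical helper: named numeric vector of ratings for one reason string
ratings_by_subrule <- function(x) {
  entries <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  parts <- regmatches(entries, regexec(pattern, entries, perl = TRUE))
  stats::setNames(
    as.numeric(vapply(parts, `[`, character(1), 4)),
    vapply(parts, `[`, character(1), 2)
  )
}

parsed <- lapply(reasons, ratings_by_subrule)
subrules <- unique(unlist(lapply(parsed, names)))

# one column per subrule; rows that lack a subrule get NA
cols <- lapply(stats::setNames(subrules, subrules), function(s) {
  vapply(parsed, function(p) unname(p[s]), numeric(1))
})
wide <- do.call(data.frame, c(cols, list(check.names = FALSE)))
wide
```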
A final consideration: `soilDB:::.cleanRuleColumnName()` strips non-alphanumeric characters to make a "legal" R column name. It is possible this could lead to some collisions with certain subrule names...
For instance, inequalities are lost. "Ponded > 4 hours" and "Ponded < 4 hours" simplify to the same name "Ponded4hours". Could add a few things like replacing ">" "<" "=" with "GT" "LT" "EQ"...
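The collision and the proposed fix can be illustrated with a small stand-in. `clean_name()` below only approximates the behavior described for `.cleanRuleColumnName()` (drop every non-alphanumeric character) and is an assumption, not the package's code; `clean_name_keep_ops()` is likewise a hypothetical name for the suggested variant.

```r
# Stand-in for the described cleaning step (an assumption, not soilDB code)
clean_name <- function(x) gsub("[^A-Za-z0-9]", "", x)

# Suggested variant: spell out comparison operators before stripping
clean_name_keep_ops <- function(x) {
  x <- gsub(">", "GT", x, fixed = TRUE)
  x <- gsub("<", "LT", x, fixed = TRUE)
  x <- gsub("=", "EQ", x, fixed = TRUE)
  clean_name(x)
}

subrules <- c("Ponded > 4 hours", "Ponded < 4 hours")

clean_name(subrules)
#> [1] "Ponded4hours" "Ponded4hours"

clean_name_keep_ops(subrules)
#> [1] "PondedGT4hours" "PondedLT4hours"
```

Names that differ only in punctuation other than these operators would still collide under this approach.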
It appears that collisions will be rare, and only for Texas subrules in FY24 SSURGO, but not impossible:
library(soilDB)
x <- SDA_query("SELECT DISTINCT rulename FROM cointerp")[[1]]
#> single result set, returning a data.frame
y <- soilDB:::.cleanRuleColumnName(x)
length(x)
#> [1] 3237
length(unique(y))
#> [1] 3231
xx <- c(x[duplicated(y)], x[duplicated(y, fromLast = TRUE)])
sort(xx)
#> [1] "AGR - Rutting Hazard =< 10,000 Pounds per Wheel (TX)"
#> [2] "AGR - Rutting Hazard > 10,000 Pounds per Wheel (TX)"
#> [3] "CaCO3 < 40% by Wght. Av. 0-40\" (TX)"
#> [4] "CaCO3 > 40% by Wght. Av. 0-40\" (TX)"
#> [5] "Excess Humus (FB, Peat, HM/MPT Surface Layer) (TX)"
#> [6] "Excess Humus (FB/Peat/HM/MPT Surface Layer) (TX)"
#> [7] "Flooding Occasional or greater; Duration Long,Very Long (TX)"
#> [8] "Flooding Occasional or greater; Duration Long/Very Long (TX)"
#> [9] "Ponding => Frequent (TX)"
#> [10] "Ponding Frequent (TX)"
#> [11] "Soil Strength (Rutting Vehicle =< 10,000 Pounds) (TX)"
#> [12] "Soil Strength (Rutting Vehicle > 10,000 Pounds) (TX)"
There was only one existing collision in `mrulename` (as opposed to the few listed above for `rulename`). However, the modification to add inequalities back in will add a few characters to several existing `mrulename` values, which could be a small breaking change.
This is the list of affected interpretations that folks will need to update column name references for:
Adds subrule ratings to "reason" fields calculated for each `mrulename` in a `get_SDA_interpretation()` query. This helps get key information about subrules that are exported in cointerp with `ruledepth > 0`.
TODO:
- `.interpretation_weighted_average()` needs a SQLite-compatible `STRING_AGG()` switch
- order subrule "reasons" alphabetically? or at least consistently (wontfix; can't use ORDER BY in the T-SQL subquery)

Will close #303
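For the SQLite item in the TODO above, one possible shape for the switch is sketched here. `string_agg_sql()` and its `sqlite` argument are hypothetical and not part of soilDB; the sketch only relies on T-SQL's `STRING_AGG(expression, separator)` and SQLite's long-standing `GROUP_CONCAT(X, Y)` aggregate, and assumes the separator contains no single quotes.

```r
# Hypothetical helper: emit the string-aggregation call for the target dialect
string_agg_sql <- function(expr, sep = "; ", sqlite = FALSE) {
  fun <- if (sqlite) "GROUP_CONCAT" else "STRING_AGG"
  sprintf("%s(%s, '%s')", fun, expr, sep)
}

string_agg_sql("rulename")                 # T-SQL (Soil Data Access)
string_agg_sql("rulename", sqlite = TRUE)  # local SQLite snapshot
```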
Note the "reason" field now includes the `interphr` as well as `interphrc` values for rules with `ruledepth != 0`.