ncss-tech / aqp

Algorithms for Quantitative Pedology
http://ncss-tech.github.io/aqp/
54 stars 14 forks source link

Deprecate 'guessing' functions for horizon-level column names #309

Closed brownag closed 4 months ago

brownag commented 8 months ago

RE: https://github.com/ncss-tech/aqp/pull/306#discussion_r1501714651

This PR deprecates guessHzDesgnName() guessHzTexClName() and removes guessHzAttrName() from usage in aqp functions. I think the latter function is useful for certain applications if users "opt in" by using it in their own code but we can certainly avoid "internal" use of it.

The mechanism to require metadata be set, and generic methods for getting/setting site or horizon level metadata have existed for some time (https://github.com/ncss-tech/aqp/blob/fcad3e8e434b71e6ec61d59e032944a8dc09c24c/R/SoilProfileCollection-metadata.R#L3-L59). These are now used in all cases that used to use the guessXX functions. There are still several cases that could make use of required metadata routines.

To ease the handling of generic metadata column names, hzmetaname() and hzmetaname<- have been added. It allows the specification/storage of standard, but custom, metadata that relates a name for horizon data of a particular type. This replaces the use of guessHzAttrName(p, attr = 'clay', c("total", "_r")) in argillic/PSCS function definitions.

For example:

library(aqp)
#> This is aqp 2.0.3

data(sp1)

# promote to SPC
depths(sp1) <- id ~ top + bottom

# set important metadata columns
hzdesgnname(sp1) <- "name"
hztexclname(sp1) <- "texture"

# set custom horizon property (clay content) column
hzmetaname(sp1, "clay") <- "prop"

# inspect metadata list
metadata(sp1)
#> $aqp_df_class
#> [1] "data.frame"
#> 
#> $aqp_group_by
#> [1] ""
#> 
#> $aqp_hzdesgn
#> [1] "name"
#> 
#> $aqp_hztexcl
#> [1] "texture"
#> 
#> $depth_units
#> [1] "cm"
#> 
#> $stringsAsFactors
#> [1] FALSE
#> 
#> $original.order
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
#> [51] 51 52 53 54 55 56 57 58 59 60
#> 
#> $aqp_hzclay
#> [1] "prop"

# get horizon clay content column
hzmetaname(sp1, "clay")
#> [1] "prop"

# uses hzdesgname(), hztexclname(), hzmetaname(attr="clay") in function definition
estimatePSCS(sp1)
#>     id pscs_top pscs_bottom
#> 1 P001       49          89
#> 2 P002       30          59
#> 3 P003        2          52
#> 4 P004       32          62
#> 5 P005        5          55
#> 6 P006       31         106
#> 7 P007       25         100
#> 8 P008       27         102
#> 9 P009       28         103

One area where guessing is still used is guessGenHzLevels(). I will not address that in this PR, but out of all of these functions this is actually the only one I am aware of having caused confusion. I have had folks ask me about why the levels returned by generalize.hz() were ordered the way they were, and perhaps we should move this logic outside of the generalization function. Ordered factors make sense when the SPC is carefully curated, but can wind up being a bit illogical when there is, for example, a wide range of depth classes with different types of vertical subdivision present.

dylanbeaudette commented 8 months ago

Thanks. I did not mean to trigger an all at once change, but I do think that magical function action is best avoided. I'll think about how to best address guessGenHzLevels()--thanks for the reminder.

brownag commented 8 months ago

magical function action is best avoided.

IMO it is not that magical if we have exported, documented functions as default arguments to do the operation vs. having it buried inside a function, or having custom handling within the function like plotSPC() did. I agree there were a few variants of approaches, which was confusing and inconsistent. We still have some variation for functions that can take data.frame or SPC as input, but I think this PR is a step in the right direction for the future.

It is better overall to guide the users towards using the metadata functions / setting the metadata, and ensuring that soilDB functions also set relevant metadata so most users don't have to worry about it.

brownag commented 7 months ago

I would suggest merging this after CRAN release of 2.0.3

brownag commented 4 months ago

I think this is ready to merge

brownag commented 4 months ago

Thanks for review, just synced with master branch, assuming everything passes (aside pending fixes from #317) will proceed with merge