Closed schneiderpy closed 5 months ago
Thanks, @schneiderpy.
Are items a-c issues with theprice_relative()
and elemental_index()
functions, or just how they're shown in the example? Most functions in R are sensitive to the presence and position of the arguments; e.g.,
x = c(1:4, NA)
y = 5:1
all.equal(weighted.mean(x, y, na.rm = TRUE), weighted.mean(y, x, na.rm = TRUE))
#> [1] "'is.NA' value mismatch: 1 in current 0 in target"
y = mtcars
merge(y)
#> Error in merge.data.frame(y) : argument "y" is missing, with no default
Perhaps it would be clearer if I write the argument names in the function call. For example, in your example c it would then be clear that the business labels are taken as time periods and the time periods as elemental aggregates
elementals <- with(
ms_prices,
elemental_index(
relatives,
period = business,
ea = period,
na.rm = TRUE
)
)
or that there are no elemental aggregates supplied (so the default applies, grouping all price relatives into a single elemental aggregate as documented)
elementals <- with(
ms_prices,
elemental_index(
relatives,
period = period,
na.rm = TRUE
)
)
As for more defensive code, the price_relative()
and elemental_index()
functions each coerce their inputs to the requires types (as documented) and check the length of the vector inputs early in the body of the function.
price_relative(1:5, period = c(1, 1, 2, 2, 3), product = mtcars)
#> Error in xtfrm.data.frame(x) : cannot xtfrm data frames
elemental_index(1:5, period = c(1, 1, 2, 2))
#> Error in elemental_index.numeric(1:5, period = c(1, 1, 2, 2)) : input vectors must be the same length
(This is why your example a works. If ms_prices$period
were formatted like 2020-01
instead of 202001
it wouldn't work.) I'm happy to consider any suggestions for a different approach or if there are cases where the argument checking is insufficient.
Your suggestion to include head(relatives)
is good. I think it would be best to add it to the data frame to make it easier to see how it transforms prices.
ms_prices$relative <- with(ms_prices, price_relative(price, period, product))
head(subset(ms_prices, business == "B1"))
#> period business product price relative
#> 1 202001 B1 1 1.14 1.0000000
#> 2 202001 B1 2 NA NA
#> 3 202001 B1 3 6.09 1.0000000
#> 11 202002 B1 2 6.94 NA
#> 12 202002 B1 3 5.45 0.8949097
#> 20 202003 B1 2 2.32 0.3342939
# ... make elemental indexes
It's a bit trickier for head(pias)
. Although you're right that this would be helpful for larger aggregation structures, both head(x)
and tail(x)
should return an object of the same type as x
. Right now I have no subscripting method for aggregation structures as I'm not really sure what such a method should do (if anything). I usually do
print(pias, max = 12)
#> Aggregation structure for 5 elemental aggregates with 2 levels above the elemental aggregates
#> level1 level2 ea weight
#> 1 1 11 B1 553
#> 2 1 11 B2 646
#> 3 1 11 B3 312
[ reached 'max' / getOption("max.print") -- omitted 2 rows ]
to peek into an aggregation structure (just as for a data frame or matrix ). One could also do head(as.data.frame(pias))
to get back a data frame.
@marberts The unexpected results just concern the price_relative() and elemental_index() functions using your example in the paper
relative <- with(ms_prices, price_relative(price, period, product))
head(relative)
1 2 3 4 5 6
1 NA 1 1 1 1
relative2 <- with(ms_prices, price_relative(period, product, price))
head(relative2)
1.14 <NA> 6.09 6.23 8.61 6.4
1 NA NA NA NA NA
expect_true(expect_equal(relative, relative2))
Error: `relative` not equal to `relative2`.
Names: 'is.NA' value mismatch: 6 in current 0 in target
Both objects, "relative" and "relative2", are the computational results of price_relative(). Both objects might be subject of further computation in another function which leads to erronous results if the user does not respect the order/number of the attributes. My point is, that there is no error message popping up to alert the user of the missuse of the function.
relatives <- with(ms_prices, price_relative(price, product))
Error in price_relative(price, product) :
argument "product" is missing, with no default
.. argument "product" is not missing in this example (perhaps an error message saying " ... one argument is missing" would be more informative)
The same arguments (for defensive coding) hold for elemental_index()
One could also do head(as.data.frame(pias)) to get back a data frame.
Is it possible to change the class(pias) to data.frame or as.data.frame? That would probably include a gain in flexibility since the user can apply futher packages e.g. ggplot, knitr or flextable.
As part of the JOSS review
Thanks for the clarification @schneiderpy.
I'm not sure how I could validate the inputs to price_relative()
for your first point (but please suggest any ideas). I would need to know which columns in a table contain the correct variables for each of price_relative()
's arguments. The function makes sure that the first argument can be represented as a numeric and that the second two arguments can be represented as factors, but I'm not sure how I would check the content of these variables to determine if the inputs are valid or not.
As for with(ms_prices, price_relative(price, product))
, that error message comes from R, not me. Indeed, we can expand the call to see that the product
argument has not been supplied and that instead the values in the product
column of ms_prices
are used to denote time periods.
print_call = function(f) {
b = as.list(body(f))
body(f) = as.call(c(b[1], quote(print(match.call())), b[-1]))
f
}
price_relative = print_call(price_relative)
with(ms_prices, price_relative(price, product))
#> price_relative(x = price, period = product)
#> Error in price_relative(price, product): argument "product" is missing, with no default
It feels like you're expecting with()
to match column names to argument names like do.call()
. This brings me back to my suggestion to name the arguments to price_relative()
and elemental_index()
in the paper to make the calls to these functions explicit and not dependent on the order of the arguments. Another idea is to use a formula interface for non-standard evaluation instead of wrapping things in with()
. Elemental indexes could then be made like
elementals = elemental_index(
ms_prices,
price_relative(price, period, product) ~ period + business,
na.rm = TRUE
)
# Or, with pipes.
elementals = ms_prices |>
mutate(relative = price_relative(price, period = period, product = product)) |>
elemental_index(relative ~ period + product, na.rm = TRUE)
It would be easy enough to add this functionality, and I'm happy to do so if you think it would make the presentation easier.
(A third option would be to dispense with NSE and write out price_relative(ms_prices$price, period = ms_prices$period, product = ms_prices$product)
, but I find that a bit too verbose.)
As for your second point, changing the class of pias
would be tricky. It's not really a tabular object. There are methods to coerce to/from a data frame, but the current data structure I think makes the most sense.
One option is to keep the aggregation data as a table rather than explicitly construct the aggregation structure object. Aggregating with a data frame that represents an aggregation structure works just as well (as documented).
ms_weights[c("level1", "level2")] = expand_classification(ms_weights$classification)
pias = ms_weights[c("level1", "level2", "business", "weight")]
pias
#> level1 level2 business weight
#> 1 1 11 B1 553
#> 2 1 11 B2 646
#> 3 1 11 B3 312
#> 4 1 12 B4 622
#> 5 1 12 B5 330
ms_index = aggregate(elementals, pias, na.rm = TRUE)
Again, let me know if you think this would make the example in the paper clearer.
@marberts You can try and implement something simple to test for function arguments, e.g.
attributes(price_relative())
Error in price_relative() : argument "x" is missing, with no default
> attributes(price_relative(ms_prices$price))
Error in price_relative(ms_prices$price) :
argument "period" is missing, with no default
> attributes(price_relative(ms_prices$price, ms_prices$period))
Error in price_relative(ms_prices$price, ms_prices$period) :
argument "product" is missing, with no default
or, as you have already mentioned, something like this:
test <- function(a,b,c) {
defined <- ls()
passed <- names(as.list(match.call())[-1])
if (any(!defined %in% passed)) {
stop(paste("missing values for", paste(setdiff(defined, passed), collapse=", ")))
}
a+b+c
}
As part of the JOSS review
Thanks for the suggestion. I'm going to mark this issue as needing further thought. I've never contemplated changing the default behavior when function arguments are missing, so I need to look into it more.
I'm going to record a couple thoughts here so I don't forget.
tidyverse
seem to rely on the default (obviously the base functions do as well).dplyr::nth() # Style guide uses nth() as an example for error messages.
#> Error in dplyr::nth() : argument "x" is missing, with no default
purrr::accumulate()
#> Error in purrr::accumulate() : argument ".f" is missing, with no default
tidyr::replace_na()
#> Error in replace_na.default() : argument "data" is missing, with no default
# Default for S3 dispatch with no default.
dplyr::mutate()
#> Error in UseMethod("mutate") : no applicable method for 'mutate' applied to an object of class "NULL"
piar
).fixest::feols()
#> Error in fixest::feols() : You must provide the argument 'data' (currently it is missing).
test <- function(a, b, c = 1) {
defined <- ls()
passed <- names(as.list(match.call())[-1])
if (any(!defined %in% passed)) {
stop(paste("missing values for",paste(setdiff(defined, passed), collapse = ", ")))
}
a + b + c
}
test(1, 2)
#> Error in test(1, 2) : missing values for c
I decided to merge in some work I had adding NSE to elemental_index()
in 96012b2. It doesn't change the example in the paper, but I feel it can make the function a bit easier to use.
@marberts since argument order matters in price_relative() you should probably name your arguments, e.g.
price_relative(price = price, period = period, product = product)
However, I don't know if this works, since you are calling gpindex::back_period(period, product)
for the final step.
I do agree that ms_prices$relative
is probably uncessary.
head(pias) still doesn't work as expected
As part of the review
Thanks for the suggestions. I ended up re-writing the code snippets in the example to try and incorporate your comments and make them easier to read (the results are unchanged).
The period
and product
arguments are now named for price_relative()
. The first argument to price_relative()
is x
, not price
, so I leave it unnamed as two out of three named arguments means the call is unambiguous. (I use x
instead of price
so that I have the option to easily make this function generic in the future.) gpindex::back_period()
won't cause any issues here.
Instead of explicitly making an aggregation structure object, pias
is now a data frame. This means head(pias)
works as expected, as does any other function expecting a data frame. aggregate()
converts this data frame to an aggregation structure object in the background.
Lastly, I've removed the use of with()
in the example. Instead I use the data frame method for elemental_index()
and specify the inputs with a formula. This is similar to using tapply()
or lm()
. I think this is a bit easier to read, especially if there's no need to show ms_prices$relative
.
Hopefully this addresses your comments in this issue, but please let me know if there's still something outstanding.
Trying to replicate your examples in the re-written paper (not in the documentation which are still the same, however, in my opinion, should work with named function arguments)
The first example gives the following error:
elementals <- ms_prices |>
transform(
relative = price_relative(price, period = period, product = product)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
Error in elemental_index(as.numeric(x), ...) :
'list' object cannot be coerced to type 'double'
Now, I'll use part of the chunk to "test" for price_relative()
elementals <- ms_prices |>
transform(
relative = price_relative(price, period = period, product = product)
)
head(elementals)
period business product price relative
1 202001 B1 1 1.14 1
2 202001 B1 2 NA NA
3 202001 B1 3 6.09 1
4 202001 B2 4 6.23 1
5 202001 B2 5 8.61 1
6 202001 B2 6 6.40 1
.. and with different order of the function arguments
elementals2 <- ms_prices |>
transform(
relative = price_relative(price, product = product, period = period)
)
head(elementals2)
period business product price relative
1 202001 B1 1 1.14 1
2 202001 B1 2 NA NA
3 202001 B1 3 6.09 1
4 202001 B2 4 6.23 1
5 202001 B2 5 8.61 1
6 202001 B2 6 6.40 1
worked fine
Now, I'll use the example in the package documentation (and previous example in the paper which led us to this issue) (with correct order)
elementals3 <- with(
ms_prices,
elemental_index(
price_relative(price, period, product),
period, business, na.rm = TRUE
)
)
head(elementals3)
Period-over-period price index for 4 levels over 4 time periods
202001 202002 202003 202004
B1 1 0.8949097 0.3342939 NaN
B2 1 NaN NaN 2.770456
B3 1 2.0200036 1.6353355 0.537996
B4 NaN NaN NaN 4.576286
with changed order
elementals4 <- with(
ms_prices,
elemental_index(
price_relative(price, product, period),
period, business, na.rm = TRUE
)
)
head(elementals4)
Period-over-period price index for 4 levels over 4 time periods
202001 202002 202003 202004
B1 1.0000000 0.7853026 NaN NaN
B2 1.0166877 NaN 0.3633362 1.3831875
B3 0.9466638 0.6067904 2.1869572 0.8454208
B4 NaN NaN 0.2242169 1.7651714
It might be the case that you didn't change the function itself (in the script)
head(pias) now returns a data frame which might be useful if one might plot historical price index evolutions
Additional question: what do the first three rows indicate in chain(ms_index)
?
part of the review
...not in the documentation which are still the same, however, in my opinion, should work with named function arguments
You're right, that should change. (It seems I didn't resolve the conflicts when I merged that code into the joss
branch :disappointed: .) I've made these changes in 57addc7.
Which version of piar
are you using to get this error?
elementals <- ms_prices |>
transform(
relative = price_relative(price, period = period, product = product)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
#> Error in elemental_index(as.numeric(x), ...) :
#> 'list' object cannot be coerced to type 'double'
Version 0.8.0 doesn't have the functionality I added the other day. You'll need to use version 0.8.0.9005 which is in the joss
branch. You can install it with
# You'll need to have {pak} installed.
# install.packages("pak")
pak::pak("marberts/piar@joss")
I've not made any changes to the price_relative()
function. I simply updated the examples to name the arguments to this function and remove the use of with()
to make the function call clearer. Indeed,
elementals3 <- with(
ms_prices,
elemental_index(
price_relative(price, period = period, product = product),
period, business, na.rm = TRUE
)
)
elementals4 <- with(
ms_prices,
elemental_index(
price_relative(price, product = product, period = period),
period, business, na.rm = TRUE
)
)
all.equal(elementals3, elementals4)
#> [1] TRUE
#> Fixed-base price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.3827662 3.7815355
#> 11 1 1.3007239 1.3827662 2.1771866
#> 12 1 1.3007239 1.3827662 6.3279338
#> B1 1 0.8949097 0.2991629 0.4710366
#> B2 1 1.3007239 1.3827662 3.8308934
#> B3 1 2.0200036 3.3033836 1.7772072
#> B4 1 1.3007239 1.3827662 6.3279338
#> B5 1 1.3007239 1.3827662 6.3279338
The first three rows are the aggregated indexes and this is what aggregate(elementals, pias)
is computing. In this example businesses B1, B2, and B3 belong to industry 11, and B4 and B5 belong to industry 12. All five of these businesses belong to industry 1. So the index for industry 1 gives the change in price in this industry, which is constructed as the arithmetic index of the change in price for all five businesses. In practice this is usually a Lowe or Young index, but that depends on how the weights in pias
are made.
Thank you for the explication of the additional question. It might be a good idea to document this in the package documentation and paper.
Well, the unexpected function results still persist. Now tested with the new version
packageVersion("piar")
[1] ‘0.8.0.9005’
elementals <- ms_prices |>
transform(
relative = price_relative(price, period = period, product = product)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
head(elementals)
Period-over-period price index for 4 levels over 4 time periods
202001 202002 202003 202004
B1 1 0.8949097 0.3342939 NaN
B2 1 NaN NaN 2.770456
B3 1 2.0200036 1.6353355 0.537996
B4 NaN NaN NaN 4.576286
elementals2 <- ms_prices |>
transform(
relative = price_relative(price, product, period)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
head(elementals2)
Period-over-period price index for 4 levels over 4 time periods
202001 202002 202003 202004
B1 1.0000000 0.7853026 NaN NaN
B2 1.0166877 NaN 0.3633362 1.3831875
B3 0.9466638 0.6067904 2.1869572 0.8454208
B4 NaN NaN 0.2242169 1.7651714
all.equal(elementals, elementals2)
[1] "Component “index”: Component 1: Mean relative difference: 0.03501192"
[2] "Component “index”: Component 2: Mean relative difference: 0.5224239"
[3] "Component “index”: Component 3: 'is.NA' value mismatch: 1 in current 2 in target"
[4] "Component “index”: Component 4: Mean relative difference: 0.5714595"
And the "standalone" function price_relative()
relative1 <- price_relative(ms_prices$price, ms_prices$period, ms_prices$product)
relative2 <- price_relative(ms_prices$price, ms_prices$product, ms_prices$period)
all.equal(relative1, relative2)
[1] "Names: 40 string mismatches" "'is.NA' value mismatch: 12 in current 12 in target"
I still think that you should name the arguments in the function(s), since we have already discovered that argument order is important and avoids erroneous results if no error message is present.
as part of the review
It might be a good idea to document this in the package documentation and paper.
Good suggestion; I've made it clear in the paper and readme (bd5bc5b86829d773f9765b89f63e7773a1a6a202).
I must admit that I'm confused about the unexpected result with price_relative()
as I thought this was an issue about how the function is presented in the paper and documentation.
There's no ambiguity when calling price_relative()
if the period
and product
arguments are named, as you demonstrated in your comment yesterday, and I thought your comment was about naming these arguments so that it's clear what the function expects as an input. (Version 0.8.0.9005 should have much more consistent use of named arguments through the documentation to make the function calls clearer.)
Changing the order of the period
and product
argument when they're unnamed shouldn't give the same result because of how R matches arguments in a function call: named arguments are matched first and then the remaining arguments are matched in the order they appear. I'm not really sure what's unexpected here.
I still think that you should name the arguments in the function(s), since we have already discovered that argument order is important and avoids erroneous results if no error message is present.
Can you elaborate on this? Do you mean changing the signature of price_relative()
to be
price_relative = function(x, ..., period, product) {
...
}
so that period
and product
have to be named?
Well, maybe my English is worse than I thought ...
However, I'm surprised that you don't see the difference of the computational results of theprice_relative()
elementals2 <- ms_prices |>
transform(
relative = price_relative(price, period, product)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
> head(elementals2)
Period-over-period price index for 4 levels over 4 time periods
202001 202002 202003 202004
B1 1 0.8949097 0.3342939 NaN
B2 1 NaN NaN 2.770456
B3 1 2.0200036 1.6353355 0.537996
B4 NaN NaN NaN 4.576286
elementals3 <- ms_prices |>
transform(
relative = price_relative(price, product, period)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
> head(elementals3)
Period-over-period price index for 4 levels over 4 time periods
202001 202002 202003 202004
B1 1.0000000 0.7853026 NaN NaN
B2 1.0166877 NaN 0.3633362 1.3831875
B3 0.9466638 0.6067904 2.1869572 0.8454208
B4 NaN NaN 0.2242169 1.7651714
Changing the order of the period and product argument when they're unnamed shouldn't give the same result because of how R matches arguments in a function call: named arguments are matched first and then the remaining arguments are matched in the order they appear. I'm not really sure what's unexpected here.
Of course that is true and this is what an R programmer would expect.
However, I call elementals3 an unexpected result, since you would expect elementals2 . The point still is, that you don't have defensive coding built in (error/warning message) to alert the user to the misuse of your function. Moreover, the erroneous result of elementals3 is further used as input in other computations, which leads to erroneous results and conclusions. You should either a) include defensive coding with an error message b) change price_relative() to
price_relative = function(x, ..., period = period, product = product) {
...
}
which hopefully will solve the issue.
part of the JOSS review
Thanks for taking the time to explain the issue again, @schneiderpy. I think I understand now.
The issue isn't that price_relative(price, period, product)
should equal price_relative(price, product, period)
, but rather that it is easy to inadvertently swap the period
and product
arguments in price_relative()
(for which the function gives no warning) and this can ruin all subsequent calculations.
I'm not sure there is a way to correctly check the period
and product
arguments. Anything that can be coerced to a factor is acceptable for either argument, so I don't think it's possible to raise a warning if they've been reversed. (But please submit a PR if you have a method.) I think the way forward is to add the dot-dot-dots so that period
and product
need to be named. This is also a good opportunity to make price_relative()
generic.
While we've been discussing price_relative()
, there are a few more functions that would benefit from this approach.
Please let me know if I understand the issue correctly, and I'll get started making the necessary changes.
The issue isn't that price_relative(price, period, product) should equal price_relative(price, product, period), but rather that it is easy to inadvertently swap the period and product arguments in price_relative() (for which the function gives no warning) and this can ruin all subsequent calculations.
That is exactly what is was trying to say, however, swaped function arguments should give an error not warning, since everything subsequent (result) is not worthwile ... However, it is up to you, if you would like to give the user the possibility to swap the function arguments and you solve this internally in you function (or in the function that will do the final computation) or if you will restrict the input to the correct order of the arguments and error if otherwise.
Of course you can try something like price_relative(x, ...)
, however, I'm not sure that ellipses argument will change anything. They are usually used to give the possibility to add more arguments.
Question: Have you tryied to implement the named argument as mentioned in the previous commit?
While we've been discussing price_relative(), there are a few more functions that would benefit from this approach.
That is true, but I don't wanted to scare you ;)
As part of the JOSS review
I've updated price_relative()
in 9abb448 so that period
and product
need to be named and cannot be passed to the function by position. (This is on the nse
branch, not the joss
branch yet,)
elementals2 <- ms_prices |>
transform(
relative = price_relative(price, period = period, product = product)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
elementals3 <- ms_prices |>
transform(
relative = price_relative(price, product = product, period = period)
) |>
elemental_index(relative ~ period + business, na.rm = TRUE)
all.equal(elementals2, elementals3)
#> [1] TRUE
Trying to use the arguments unnamed is an error.
ms_prices |>
transform(
relative = price_relative(price, product, period)
)
#> Error in price_relative.default(price, product, period) : argument "period" is missing, with no default
I don't believe there is anything else I can do on this issue. There is no way to further validate the period
and product
arguments to give an error if they're swapped. (Is price_relative(1:4, c(1, 2, 1, 2), c(3, 3, 4, 4))
making price relatives for products 1 and 2 in periods 3 and 4, or products 3 and 4 in periods 1 and 2? Either interpretation could be right.) Adding the dot-dot-dots is the only way I know of to force arguments to be named.
I think the carry_forward()
, carry_backwards()
, shadow_price()
, and elemental_index()
are the only other functions that need to be updated in the same way as price_relative()
.
Please let me know if these changes address your comment and are sufficient to close this issue, as it will be a bit of work to implement. They also introduce non-backwards compatible changes that will break at least some code that I know is in use.
Perhaps you can try something like this:
price_relative <- function(x, ...)
UseMethod("price_relative")
# you have the above part already
price_relative.default <- function(x, ...) {
# extract arguments
a <- list(...)
# define correct order for parameters
params <- c("price", "product") # or as.factor
# check that all mandatory parameters are specified
stopifnot(all(params %in% names(a)))
# check order of mandatory parameters
stopifnot(identical(names(a)[names(a) %in% params], params))
# do calculations
res <- x / x[gpindex::back_period(period, product)]
names(res) <- as.character(product)
res
}
Perhaps you want to change stopifnot with assertthat_ for more informative error messages
As part of the JOSS review
I tried your suggestion but it doesn't pass R CMD check
because the signature of the function doesn't have arguments period
and product
but the documentation does. These arguments clearly need to be documented because the function still expects them.
❯ checking Rd \usage sections ... WARNING
Documented arguments not in \usage in Rd file 'price_relative.Rd':
‘period’ ‘product’
Functions with \usage entries need to have the appropriate \alias
entries, and all their arguments documented.
The \usage entries must correspond to syntactically valid R code.
See chapter ‘Writing R documentation files’ in the ‘Writing R
Extensions’ manual.
Adding these arguments to price_relative()
makes the extra checks redundant as omitting them will signal an error (e.g., argument "period" is missing, with no default) and there is no need to check the order because they must be passed by name, not position.
The check also raises a note about no visible bindings, but this can be fixed.
❯ checking R code for possible problems ... NOTE
price_relative.default: no visible binding for global variable ‘period’
price_relative.default: no visible binding for global variable
‘product’
Undefined global functions or variables:
period product
Well, you are right, documenting ellipses is difficult ..
Have you tried something like
price_relative <- function(x, price = price, product = product, ...)
?
The problem is to include the mandatory arguments price and product and to control for the order of both.
Since price_relative()
is "just" evaluating function input, while gpindex::back_period()
is computing the final result, what about spliting the task, say, price_relative()
evaluates the input, while gpindex::back_period()
(or gpindex::offset_period()
) would control for the right order of the arguments? Would that be feasable?
As part of the JOSS review
Adding the default arguments like you suggest makes a circular reference and gives a weird error if an argument is missing.
# Add default arguments.
price_relative <- function(x, period = period, product = product, ...) {
x <- as.numeric(x)
period <- as.factor(period)
product <- as.factor(product)
if (different_length(x, period, product)) {
stop("'x', 'period', and 'product' must be the same length")
}
res <- x / x[gpindex::back_period(period, product)]
names(res) <- as.character(product)
res
}
price_relative(ms_prices$price, ms_prices$period)
#> Error in is.factor(x) :
#> promise already under evaluation: recursive default argument reference or earlier problems?
These could be renamed to something else, but then the function can behave in weird ways a use variables in the global environment as defaults. If both period
and product
are supplied then the function behaves the same as if there were no defaults. Your original point about inadvertently swapping these arguments in the function call still applies.
For your second question, there's no way to make gpindex::back_period()
enforce how price_relative()
matches arguments. The arguments passed to this function are evaluated in the local scope of the price_relative()
function, after they are matched in the function call.
Let's go back to the first idea, trying to implente
price_relative <- function(x, ...)
UseMethod("price_relative")
# you have the above part already
price_relative.default <- function(x, ...) {
# extract arguments
a <- list(...)
# define correct order for parameters
params <- c("price", "product") # or as.factor
# check that all mandatory parameters are specified
stopifnot(all(params %in% names(a)))
# check order of mandatory parameters
stopifnot(identical(names(a)[names(a) %in% params], params))
# do calculations
res <- x / x[gpindex::back_period(period, product)]
names(res) <- as.character(product)
res
}
To pass CRAN check you could document the mandatory arguments within @param ...
or in a new section called @section Additional parameters:
As part of the JOSS review
Thanks for the suggestion, but I'm not a fan of this idea for a couple reasons.
...
to accommodate two arguments or adding a custom section for parameters is not idiomatic and likely to be confusing. It feels like a hack to get around R CMD check
.price_relative(x, period = a, product = b)
and price_relative(x, product = b, period = a)
should both work. There is no scope to accidentally swap the period
and product
arguments when they're named. It's only a problem when passing arguments by position, i.e., the difference between price_relative(x, a, b)
and price_relative(x, b, a)
requires the user to remember that the second argument in period
and the third argument is product
.I understand your comment here: the period
and product
arguments are sufficiently similar that it is easy to swap them in a call to price_relative()
without raising an error. Unless you're looking at the documentation for the function, this may be an easy mistake to make. I think that's a fair comment, which is why I updated the examples in the paper/readme/help files to use named arguments to make it clear how the function works.
If a fix to the documentation is not sufficient, the only resolution I can see for this issues is to disable matching the period
and product
arguments by position in price_relative()
. The idiomatic way to disable matching arguments by position in R is to force these arguments to be named by having ...
appear before them in the function's list of formal arguments. I already do this with a number of generic functions, so it wouldn't be out of place (and would be a good opportunity to make price_relative()
generic). I realize there are well-known reasons why the ...
can be problematic in R, and I'd prefer to use something like Python's keyword only arguments, but that's just how R works. I'm happy to make this change if it will address your concern and close the issue.
Right now I feel that we're just going back and forth and not really agreeing with the way R matches arguments in a function call. I don't think trying to change the semantics of the language is a way forward; this will just make the function harder to use.
In any case, let me know if making period
and product
"keyword only" for price_relative()
addresses your comment. If so, I can make the change for this function and the others I listed above with similar arguments. If not, while I appreciate your comments and the discussion we've had on this topic, I don't think there's any further changes I can make based on this issue.
My suggested solution is just an idea. Any solution that resolves this isssue is welcome
However, I do not agree in the case where the functionality of a function results in erronous results, without adverting the user, and, furthermore, are used in subsequent computations.
I've updated price_relative()
in d4ba0a5 to require period
and product
to be named. This means that the order of these arguments doesn't matter, so they can't be easily mixed up, and the default error messages are clearer.
all.equal(
with(ms_prices, price_relative(price, period = period, product = product)),
with(ms_prices, price_relative(price, product = product, period = period))
)
#> [1] TRUE
with(ms_prices, price_relative(price, period = period))
#> Error in price_relative.default(price, period = period) : argument "product" is missing, with no default
with(ms_prices, price_relative(price, product = product))
#> Error in price_relative.default(price, product = product) : argument "period" is missing, with no default
I've done the same thing for all the other functions with similar arguments, namely elemental_index()
, carry_forward()
, carry_backward()
, and shadow_price()
. These functions are also all generic now, and having these argument be named will make it easier to extend these functions in the future.
@marberts Following the example in you JOSS paper I have encountered some unexpect results:
Example: a) relatives <- with(ms_prices, price_relative(price, period, product)) gives the result as shown in the paper. It would be nice to add head(relatives) in the paper, so that the user can see what he should expect. However, if I change the order of the attributes I get the following result:
b) uninformative error msg
c)
still gives correct computation, however, in a different form However:
I suggest more defensive code if input of attributes and order of attributes are important for correct computations.
d) head() and tail() functions for your object "pias" (see paper) doesn't result in a dataframe. Probably, because it is not of class dataframe.
This functionality would be nice if one has larger data sets.
As part of JOSS review