jeff-hughes / reghelper

R package with regression helper functions

simple_slopes raises error when variable names are contained in one another #16

Closed haranse closed 1 year ago

haranse commented 2 years ago

First of all, thanks for this super useful package!

When using variable names which are contained in one another in a formula, simple_slopes raises an error. Here is example code for this:

library(reghelper)

N = 10

data <- data.frame(var1=rnorm(N), var2=rnorm(N), var3=rnorm(N), var11=rnorm(N))

model <- lm(var3 ~ var1*var2 + var11, data=data)

simple_slopes(model)

And the error:

Error in str2lang(x) : <text>:1:56: unexpected numeric constant
1: var3 ~ I(var1 - -0.994525) * var2 + I(var1 - -0.994525)1

I'm pretty sure the problem is that simple_slopes uses gsub to replace the variable name with a call to the "I" function, so the pattern var1 also matches the first four characters of var11. I think that replacing the line

new_form <- gsub(vname, new_var, new_form)

in each method with

new_form <- gsub(paste0("((?<=[^a-zA-Z0-9._])|^)", vname, "(?=([^a-zA-Z0-9._]|$))"), new_var, new_form, perl=TRUE)

would solve it. The regexp makes sure that on each side of the variable name there is either the edge of the string or a character that can't be part of a variable name. This would only fail for extremely weird variable names, e.g. ones that use backticks (``) to include spaces in the variable name.
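To make the difference concrete, here is a minimal base R sketch that reproduces the malformed formula from the error message and compares it with a boundary-aware replacement. The helper name replace_var is hypothetical and is not part of reghelper. It uses a variant of the lookaround pattern proposed above (negative lookarounds, plus \Q...\E quoting so that a "." in the variable name is matched literally), and the centering value is copied from the error output.

```r
# Hypothetical helper (not part of reghelper): replace `vname` in a formula
# string only where it stands alone as a complete variable name.
replace_var <- function(form, vname, new_var) {
  pattern <- paste0(
    "(?<![A-Za-z0-9._])",    # not preceded by a character valid in a name
    "\\Q", vname, "\\E",     # the name itself, matched literally
    "(?![A-Za-z0-9._])"      # not followed by a character valid in a name
  )
  gsub(pattern, new_var, form, perl = TRUE)
}

form    <- "var3 ~ var1 * var2 + var11"
new_var <- "I(var1 - -0.994525)"

# Plain substring replacement also rewrites the "var1" inside "var11".
naive <- gsub("var1", new_var, form)
print(naive)
# "var3 ~ I(var1 - -0.994525) * var2 + I(var1 - -0.994525)1"

# Boundary-aware replacement leaves var11 untouched.
safe <- replace_var(form, "var1", new_var)
print(safe)
# "var3 ~ I(var1 - -0.994525) * var2 + var11"
```

The negative lookarounds match at the start and end of the string without needing explicit ^ and $ alternatives. Like the original proposal, this sketch does not handle backtick-quoted names that contain spaces.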

jeff-hughes commented 2 years ago

Thanks for noting this issue. That is indeed an issue with the regex -- I'm surprised I've never run into the issue before. Thanks for proposing a solution! I'll do some testing with it and hopefully push a fix soon.