The data.table internals of alpaca don't seem to handle formula terms that expand to more than one column. For example, bs() from the splines package and poly() from the stats package both provide clean notation for a flexible spline or polynomial function of one variable (or more, in the case of poly()). However, feglm() seems to rely on data.table to pick up these column expansions correctly, and that does not currently work. For example, in a dataset with an Acres variable for which I'd like a second-order orthogonal polynomial expansion, this fails:
> m <- feglm(DBOEPerAcre ~ Auction + poly(Acres, degree = 2) + Term + RoyaltyRate | Grid20Yr + YearQtr | Grid20, reg_data, family = poisson())
Error in `[.data.table`(data, , `:=`((tmp.var), mean(get(lhs))), by = eval(i)) :
Column 3 ['poly(Acres, degree = 2)'] is length 2218 but column 1 is length 1109; malformed data.table.
As you can see, the dataset has 1109 rows, and poly(Acres, degree = 2) generates two columns of 1109 rows each, but somehow feglm() or data.table interprets the result as a single column with 2218 rows.
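To make the 2218 figure concrete, here is a minimal sketch using only base R and synthetic data. The `toy` data frame and the `Acres_p1`/`Acres_p2` column names are made up for illustration and are not my real data. My guess, which I have not confirmed against the alpaca source, is that the matrix returned by poly() ends up stored as one column whose length is rows times columns:

```r
# Synthetic stand-in for reg_data (hypothetical values, same row count)
set.seed(1)
n <- 1109
toy <- data.frame(Acres = runif(n, min = 10, max = 640))

# poly() returns a single n x 2 matrix, not two separate vectors
p <- poly(toy$Acres, degree = 2)
dim(p)

# In a model frame, that whole matrix is stored as ONE column
mf <- model.frame(~ poly(Acres, degree = 2), data = toy)
ncol(mf)         # number of columns in the model frame
dim(mf[[1]])     # the single column is itself a matrix
length(mf[[1]])  # rows * columns of that matrix

# The pre-computing workaround I'd like to avoid:
# split the matrix into ordinary columns by hand
toy$Acres_p1 <- p[, 1]
toy$Acres_p2 <- p[, 2]
```

With the split columns in place, the formula would use Acres_p1 + Acres_p2 instead of the poly() term, at the cost of losing the compact notation.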
Is there an easy fix for this, aside from pre-computing these columns in my main dataset?
Thanks.