Closed LamAdr closed 7 months ago
Translated from https://github.com/vincentarelbundock/marginaleffects/blob/main/inst/tinytest/test-analytic.R
Fantastic!
There is a MIT license, an acknowledgment that the code is copied from
margins
, and a reference to Matt Golder's website in the R version. Should we carry those here?
Yes please
Also, the log test fails. This is because
self.model.model.exog_names
is['Intercept', 'np.log(x)']
andself.modeldata.columns
is['y', 'x']
. So this line empties the variable names. I commented out the test, but maybe you'd prefer to remove it altogether.
Are the numerical results changed, or this is just a labelling issueÉ
Are the numerical results changed, or this is just a labelling issueÉ
Not sure what you mean by numerical results since slopes()
is not returning. Since in this example, np.log(x)
is not accompanied by any variable from the columns, we get ValueError: There is no valid column name in 'variables'.
When such a variable is added into the formula, predictions()
is allowed to return and seems to take np.log()
into account (the results match those of R).
mod = smf.ols("mpg ~ am + np.log(hp)", data = mtcars.to_pandas()).fit()
nd = datagrid(newdata = mtcars, hp = [100, 200])
p = predictions(mod, newdata=nd)
slopes()
seems like it does not take the log into account:
f = "y ~ a + np.log(x)"
N = 10000
dat = pl.DataFrame({'x' : np.random.uniform(size=N), 'a' : np.random.uniform(size=N)})
dat = dat.with_columns(
y = np.log(pl.col('x')) + np.random.normal(size=N)
)
mod = smf.ols(f, dat.to_pandas()).fit()
nd = datagrid(newdata = dat, x = range(1,5))
res = slopes(mod, newdata = nd, by="a")
print(res)
outputs
┌─────┬──────┬──────────┬──────────┬───┬─────────┬───────┬─────────┬────────┐
│ x ┆ Term ┆ Contrast ┆ Estimate ┆ … ┆ P(>|z|) ┆ S ┆ 2.5% ┆ 97.5% │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ ┆ str ┆ str ┆ str ┆ str │
╞═════╪══════╪══════════╪══════════╪═══╪═════════╪═══════╪═════════╪════════╡
│ 1 ┆ a ┆ dY/dX ┆ 0.005 ┆ … ┆ 0.885 ┆ 0.176 ┆ -0.0626 ┆ 0.0726 │
│ 2 ┆ a ┆ dY/dX ┆ 0.005 ┆ … ┆ 0.885 ┆ 0.177 ┆ -0.0626 ┆ 0.0726 │
│ 3 ┆ a ┆ dY/dX ┆ 0.005 ┆ … ┆ 0.885 ┆ 0.177 ┆ -0.0626 ┆ 0.0726 │
│ 4 ┆ a ┆ dY/dX ┆ 0.005 ┆ … ┆ 0.885 ┆ 0.177 ┆ -0.0626 ┆ 0.0726 │
└─────┴──────┴──────────┴──────────┴───┴─────────┴───────┴─────────┴────────┘
Where 0.005 is the slope with respect to a
.
I'd be happy to investigate that further and write a nicer issue if thats valuable.
@vincentarelbundock ready for review
Looks fantastic, thanks!
Yes, please open a separate issue for the problem you identified with a detailed example (maybe mostly copying over what you ahve here already).
Feel free to suggest a solution if you find one.
Translated from https://github.com/vincentarelbundock/marginaleffects/blob/main/inst/tinytest/test-analytic.R
There is a MIT license, an acknowledgment that the code is copied from
margins
, and a reference to Matt Golder's website in the R version. Should we carry those here?Also, the log test fails. This is because
self.model.model.exog_names
is['Intercept', 'np.log(x)']
andself.modeldata.columns
is['y', 'x']
. So this line empties the variable names. I commented out the test, but maybe you'd prefer to remove it altogether.Please let me know if you want anything changed!