Question: Programmatically creating splines and applying knots to new data

pydata / patsy

Describing statistical models in Python using symbolic formulas

apply

xx1 = build_design_matrices([x1.design_info], {"x": TEST_DATA.VARIABLE.values})

This works, but of course it requires manually creating variables or programmatically building formula strings.

Is there any way to do something like patsy.cr(x, df=5) and grab the knots, so that they can be applied to new data using the same function cr()?

I'm not really an expert, so there's likely an oversight here.

First, do you need to know the knots for some reason? If not, I think the canonical way would be to do something like...

import numpy as np
import patsy

# Build the design matrix
x = np.arange(100)
dm = patsy.dmatrix('cr(x, df=5)', {'x': x})

# Apply the same design (same knots) to new data
new_data = np.arange(25, 75)
patsy.dmatrix(dm.design_info, {'x': new_data})
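For the "programmatic" part of the question, one option is to keep the formula string fixed and vary only the data dictionary, storing one design_info per variable. This is only a sketch, not something from the thread: the column names (age, income), the synthetic data, and the infos dictionary are made up for illustration.

```python
import numpy as np
import patsy

# Hypothetical training and test data; the column names are illustrative only.
rng = np.random.default_rng(0)
train = {"age": rng.uniform(20, 60, 200), "income": rng.uniform(10, 90, 200)}
test = {"age": rng.uniform(30, 50, 20), "income": rng.uniform(20, 80, 20)}

# One fixed formula; each column is passed in under the name "x".
FORMULA = "cr(x, df=5)"

# Fit the spline basis per column and keep only the design_info,
# which carries the memorized knots.
infos = {
    name: patsy.dmatrix(FORMULA, {"x": values}).design_info
    for name, values in train.items()
}

# Re-apply each stored design to the matching test column.
test_bases = {
    name: patsy.build_design_matrices([info], {"x": test[name]})[0]
    for name, info in infos.items()
}

for name, basis in test_bases.items():
    print(name, basis.shape)
```

Because the formula never changes, no strings have to be assembled and no per-variable Python names are needed; the dictionary keys do the bookkeeping.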

If you really want to know what the knots were, you could probably dig through the dm.design_info object and find them.
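As a rough sketch of what that digging might look like: the code below walks design_info.factor_infos and looks for memorized transform objects that carry an _all_knots attribute. It leans on patsy internals (the "transforms" entry of the factor state and the private _all_knots attribute), so treat it as an assumption about the current implementation that may not hold in other patsy versions.

```python
import numpy as np
import patsy

x = np.arange(100.0)
dm = patsy.dmatrix("cr(x, df=5)", {"x": x})

found_knots = {}
for factor, info in dm.design_info.factor_infos.items():
    # For formula factors, info.state is a dict; its "transforms" entry maps
    # internal names to the memorized stateful transform objects.
    # This layout is an implementation detail, not documented API.
    for transform in info.state.get("transforms", {}).values():
        if hasattr(transform, "_all_knots"):
            found_knots[factor.name()] = transform._all_knots

print(found_knots)
```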

However, it may be a little easier to pull the CR class out of the cr stateful transform function.

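# Instantiate the stateful transform class behind patsy.cr
cr = patsy.cr.__patsy_stateful_transform__()
cr.memorize_chunk(x, df=5)
cr.memorize_finish()

# Knots chosen from x (private attribute, may change between patsy versions)
cr._all_knots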

You could also apply it to the new data using...

cr.transform(new_data)
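To get closer to the original question (reusing the knots with cr() itself), a possible approach is to split the memorized knots into inner knots and boundaries and pass them back through cr's knots, lower_bound and upper_bound arguments. This is a sketch under the assumption that _all_knots holds the sorted boundary and inner knots; the variable names (state, inner, lower, upper) are illustrative.

```python
import numpy as np
import patsy

x = np.arange(100.0)
new_data = np.arange(25.0, 75.0)

# Memorize the knots from the original data, as in the snippet above.
state = patsy.cr.__patsy_stateful_transform__()
state.memorize_chunk(x, df=5)
state.memorize_finish()

# Private attribute: assumed to be the sorted array [lower, inner..., upper].
all_knots = state._all_knots
inner = all_knots[1:-1]
lower, upper = float(all_knots[0]), float(all_knots[-1])

# Call cr() directly on the new data with the knots fixed explicitly,
# so nothing is re-estimated from new_data.
basis_direct = patsy.cr(new_data, knots=inner, lower_bound=lower, upper_bound=upper)

# Compare with applying the memorized transform object.
basis_state = state.transform(new_data)
print(np.allclose(basis_direct, basis_state))
```

With the knots and bounds given explicitly, the basis no longer depends on which data cr() happens to see, which is what makes it safe to reuse on new data.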

Question: Programmatically creating splines and applying knots to new data #121
