trevorstephens / gplearn

Genetic Programming in Python, with a scikit-learn inspired API
http://gplearn.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Matrix shaped features issue #289

Closed aadrian92 closed 1 year ago

aadrian92 commented 1 year ago

Hello,

I am trying to apply gplearn functions to matrix-shaped features. For example, given the Close price of several assets at different points in time, the feature would be a matrix with the assets as columns and the timestamps as the index. To do this, I reshape the matrix features into vectors. Then, in each custom operation that needs the matrix form, I reshape back to a matrix, apply the operation, and reshape back to vector form, something like this:

    def func(x):
        mat = pd.DataFrame(x.reshape(col_len, -1).T)
        opmat = Feature(mat)
        opmat.ts_std()
        res = opmat.alpha.copy()
        return res.fillna(0).values.T.reshape(1, -1)[0]
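The reshape round trip described above can be sketched with NumPy alone. This is only an illustration of the layout: `to_vector`, `to_matrix`, and `demean_per_asset` are hypothetical names, and per-asset demeaning stands in for the real operation, since the `Feature` class is not shown here.

```python
import numpy as np

def to_vector(mat):
    """Flatten a (n_timestamps, n_assets) matrix asset by asset."""
    return mat.T.reshape(-1)

def to_matrix(x, col_len):
    """Invert to_vector: rebuild the (n_timestamps, n_assets) matrix."""
    return x.reshape(col_len, -1).T

def demean_per_asset(x, col_len):
    """Stand-in matrix operation: subtract each asset's mean over time."""
    mat = to_matrix(x, col_len)
    out = mat - mat.mean(axis=0)
    return to_vector(out)

prices = np.arange(12, dtype=float).reshape(4, 3)  # 4 timestamps, 3 assets
x = to_vector(prices)

# The round trip recovers the original matrix only when col_len is correct.
assert np.array_equal(to_matrix(x, col_len=3), prices)
```

The vector layout carries no record of the number of assets, so `col_len` has to be supplied from outside the function.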

You can see that I am using a global variable col_len, which gives the number of columns (assets). The issue is that I would like to pickle the model and apply it to the same features with a different number of assets (so a different col_len), but the functions were created with a fixed col_len. Is there anything you would suggest I do to achieve this? I've tried putting col_len at the beginning of the input vector x, and adding delimiters to x where each row would end. Unfortunately neither worked, because the first value or the delimiter was altered in the process (although I made sure they remain constant when my custom functions are applied).
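One plausible reason the embedded value gets altered, assuming the evolved program also contains ordinary elementwise functions such as addition, is that those functions treat the marker like any other data point. A minimal sketch with a hypothetical `[col_len, data...]` layout:

```python
import numpy as np

col_len = 3
data = np.array([10.0, 11.0, 12.0, 20.0, 21.0, 22.0])

# Hypothetical layout: the number of assets is stored as the first element.
x = np.concatenate(([col_len], data))

# An elementwise operation changes the marker along with the data.
doubled = np.add(x, x)
print(doubled[0])  # 6.0, no longer equal to col_len
```

Keeping the custom functions marker-safe is therefore not enough if any other function in the program touches the same vector.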

Thank you and I appreciate any help!

trevorstephens commented 1 year ago

The X's need to have the same features in both the training and testing sets, so it doesn't sound like what you want to do is going to be possible, sorry.