LIBOL / SOL

Library for Scalable Online Learning
http://sol.stevenhoi.org

getting the predict class ? #1

Closed anujgupta82 closed 7 years ago

anujgupta82 commented 8 years ago

I am using the Python wrapper for LIBSOL. Once I have trained a model (python python/libsol_train.py -m arow.model data/a1a arow2.model), how do I access or return the class predicted by the trained model for an unseen data point?

I only see test (https://github.com/LIBOL/LIBSOL/blob/master/python/libsol_core.py#L170), which internally calls _LIB.lsol_Test (https://github.com/LIBOL/LIBSOL/blob/master/python/libsol_core.py#L186).

yuewu001 commented 8 years ago

The third parameter, "output_path", is the path of the file where the predicted classes and scores are saved. We are also working on new wrappers so that you can get the predicted classes and scores directly for test data passed as NumPy arrays or SciPy sparse arrays.

anujgupta82 commented 8 years ago

The third parameter, "output_path", is not working. Please see #3.

anujgupta82 commented 8 years ago

What kind of timeline can I expect for prediction/scoring on NumPy arrays or SciPy sparse arrays?

yuewu001 commented 8 years ago

I'll release a new version in the dev branch at the end of this week, with sklearn-like usage.

anujgupta82 commented 8 years ago

Cool. A couple of suggestions (based on PassiveAggressive from sklearn) you may consider:

1) class weights: ability to pass 'class weights' when initializing the model, to handle class imbalance
2) fit(): do batch training on the data if need be; returns the model and the training loss incurred
3) partial_fit(): update the current model with a new data point
4) predict(): predict the class/label for the given data point(s)
5) score(): accuracy score on the test set
6) confidence(): confidence in the prediction (usually the distance from the decision boundary)
7) access to the model parameters
8) no mandatory file I/O: it should not be necessary to read data from files and write results back to files; the ability to accept and return NumPy arrays will be of great help

These will be of great help to anyone who has been using PA from sklearn
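To make the list above concrete, here is a rough pure-Python sketch of what such an interface could look like for a binary Passive-Aggressive (PA-I) learner. It is not LIBSOL's or sklearn's API: the class name PassiveAggressiveSketch, the C and epochs parameters, and the choice to apply class weights by scaling the step-size cap are all assumptions made for illustration.

```python
class PassiveAggressiveSketch:
    """Hypothetical binary PA-I classifier on dense feature lists; labels are +1 / -1."""

    def __init__(self, n_features, C=1.0, class_weight=None):
        self.w = [0.0] * n_features  # model parameters, directly accessible
        self.C = C                   # cap on the per-example step size
        # Assumption: class weights simply scale the cap C for each class.
        self.class_weight = class_weight or {1: 1.0, -1: 1.0}

    def confidence(self, x):
        # Raw score w . x; its magnitude grows with the distance from the boundary.
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def predict(self, X):
        return [1 if self.confidence(x) >= 0 else -1 for x in X]

    def partial_fit(self, x, y):
        # One online update; returns the hinge loss measured before the update.
        loss = max(0.0, 1.0 - y * self.confidence(x))
        sq_norm = sum(xi * xi for xi in x)
        if loss > 0.0 and sq_norm > 0.0:
            tau = min(self.C * self.class_weight[y], loss / sq_norm)
            self.w = [wi + tau * y * xi for wi, xi in zip(self.w, x)]
        return loss

    def fit(self, X, y, epochs=5):
        # Repeated passes over the data; returns the model and the last pass's total loss.
        total = 0.0
        for _ in range(epochs):
            total = sum(self.partial_fit(xi, yi) for xi, yi in zip(X, y))
        return self, total

    def score(self, X, y):
        # Plain accuracy on the given data.
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(y)


# Toy in-memory data: no files are read or written.
X = [[1.0, 0.0], [0.9, 0.2], [0.0, 1.0], [0.1, 0.8]]
y = [1, 1, -1, -1]
model, last_epoch_loss = PassiveAggressiveSketch(n_features=2).fit(X, y)
print(model.predict([[1.0, 0.1], [0.0, 0.9]]))
print(model.score(X, y))
```

The same shape would carry over to NumPy arrays or sparse inputs by swapping the list arithmetic for array operations.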

yuewu001 commented 8 years ago

Thank you so much for your kind suggestions. You can track the 'cython' branch, since the updates in that branch are mostly similar to what you have mentioned. As for the class imbalance problem, we are still thinking about an algorithmic way to handle it. Your suggestions are still of great value.

yuewu001 commented 8 years ago

@anujgupta82 I have mostly finished the new Python interfaces in the cython branch. I'm looking forward to your suggestions and bug reports if you have time to give it a try. Thanks.