Sandy4321 opened this issue 3 years ago
Hi @Sandy4321 thanks for filing this issue. Could you expand a bit on what you would like to see here? Is the question about how to enable "quantile" loss when using VW in Python, or is it something else?
I am asking for a description and more details on how to use quantile loss for support vector regression, or at least for ordinary linear regression.
Or at least a Python code example, please.
Sorry, I am still a bit confused about the specific question here. Switching the loss function to "quantile" (or others) in Python is done the same way as setting any command-line argument:
model = pyvw.vw(loss_function="quantile")
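For instance, a quick sketch along those lines (untested here, with made-up feature names just for illustration) that trains and predicts with quantile loss on VW-format example strings:

```python
from vowpalwabbit import pyvw

# construct a VW instance with quantile (pinball) loss
model = pyvw.vw(loss_function="quantile", quiet=True)

# train on a few examples in VW text format: "<label> | <features>"
for example in ["2.0 | price:1.5 sqft:0.7", "3.5 | price:2.1 sqft:1.2"]:
    model.learn(example)

# predict on an unlabeled example
print(model.predict("| price:1.8 sqft:1.0"))
```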
Is the question about better documentation for how to configure various vw options in Python? Or is it about how to think about Quantile Regression in general?
Yes, the question is about better documentation for how to configure various VW options in Python.
It would be great to have a full example from start to end for Python quantile regression. For example: given such a data file, the Python code to use is: ....
the predicted data is: ....
the mean absolute error is .... the confidence intervals are: ....
Something is always unclear in a general-form description.
Some efforts have been made in this direction, for example https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/vowpalwabbit.pyvw.html:
```python
from vowpalwabbit import pyvw

vw1 = pyvw.vw('--audit')
vw2 = pyvw.vw(audit=True, b=24, k=True, c=True, l2=0.001)
vw3 = pyvw.vw("--audit", b=26)
vw4 = pyvw.vw(q=["ab", "ac"])
```
But it would be really great to have a full Python code example.
Thanks a lot for taking care of this.
I was able to find only very limited examples in Python; this one, https://vowpalwabbit.org/tutorials/python_first_steps.html, is very concise.
At least something like the example below would help, but for quantile regression:
https://pypi.org/project/vowpalwabbit/
```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from vowpalwabbit.sklearn_vw import VWClassifier

# generate some data
X, y = datasets.make_hastie_10_2(n_samples=10000, random_state=1)
X = X.astype(np.float32)

# split train and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=256)

# build model
model = VWClassifier()
model.fit(X_train, y_train)

# predict model
y_pred = model.predict(X_test)

# evaluate model
model.score(X_train, y_train)
model.score(X_test, y_test)
```
By the way, in this link https://pypi.org/project/vowpalwabbit/ there is a line at the bottom: "python/examples: example python code and jupyter notebooks to demonstrate functionality".
Could you clarify how to find this folder?
All right, I found this folder.
It would be great to share an example for quantile regression in this style: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/python/examples/poisson_regression.ipynb
Yes, it is not entirely clear. It is referring to the directories in the same location as that text file in the repository, which makes it extra confusing in the pypi.org documentation. For a slightly better experience, you can see those docs here: https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python
The readme is at vowpal_wabbit/python/README.rst: https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python
The python/examples directory is at vowpal_wabbit/python/examples: https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python/examples
Tests: https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python/tests
We also have these autogen docs: https://vowpalwabbit.org/docs/ https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/
In the scikit-learn case, the configuration options are passed in the same way as for pyvw.
So, if you want to have VWClassifier run with quantile loss, you would specify:
```python
classifier_model = VWClassifier(loss_function='quantile')
# or
regressor_model = VWRegressor(loss_function='quantile')
```
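For a full start-to-end flavor, here is a rough sketch (not an official tutorial, untested, and using a toy sklearn dataset rather than your data file) of quantile regression with VWRegressor:

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from vowpalwabbit.sklearn_vw import VWRegressor

# generate a toy regression dataset
X, y = datasets.make_regression(n_samples=10000, n_features=10, noise=10.0, random_state=1)
X = X.astype(np.float32)

# split train and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=256)

# build a model with quantile (pinball) loss; VW's default quantile tau of 0.5 targets the median
model = VWRegressor(loss_function='quantile')
model.fit(X_train, y_train)

# predict and evaluate with mean absolute error
y_pred = model.predict(X_test)
print("MAE:", mean_absolute_error(y_test, y_pred))
```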
Here is a deep link to the class documentation for VWClassifier, and one for VWRegressor.
I suspect we probably will not make a specific tutorial for just quantile loss, because there would be a lot of tutorials that only differ from one another by the specific combination of options they use. Would a general tutorial about how to pass options to VW when using it in Python in pyvw / scikit modes make sense here @Sandy4321, or alternatively a tutorial that explores the various things you can do in the context of regression specifically?
In this link https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/vowpalwabbit.sklearn.html#vowpalwabbit.sklearn_vw.VWRegressor I see no example, though there is an example for the classifier.
> Would a general tutorial about how to pass options to VW when using it in Python in pyvw / scikit modes make sense here @Sandy4321, or alternatively a tutorial that explores the various things you can do in the context of regression specifically?
Yes, it would be great to have one.
For example, in https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/python/tests/test_sklearn_vw.py:
```python
def test_lrq(self):
    X = ['1 |user A |movie 1',
         '2 |user B |movie 2',
         '3 |user C |movie 3',
         '4 |user D |movie 4',
         '5 |user E |movie 1']
    model = VW(convert_to_vw=False, lrq='um4', lrqdropout=True, loss_function='quantile')
    assert getattr(model, 'lrq') == 'um4'
    assert getattr(model, 'lrqdropout')
    model.fit(X)
    prediction = model.predict([' |user C |movie 1'])
    assert np.allclose(prediction, [3.], atol=1)
```
It is not at all clear why lrq='um4': what is 'um4', and how can people who are not familiar with VW and are only starting to learn it find the answer to this kind of question?
It is difficult to do a Google search for the meaning of lrq, since it is only 3 letters.
Even Stack Overflow cannot help: https://stackoverflow.com/questions/44298795/one-time-vs-iteration-model-in-vowpal-wabbit-with-lrq-option
The command-line options documentation is fairly sparse here, but here are some links to get you started with LRQ:
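In short (my summary, please double-check it against the linked docs): --lrq um4 adds low-rank quadratic interactions, in the style of matrix factorization, between every namespace starting with 'u' (user) and every namespace starting with 'm' (movie), using rank 4. A rough pyvw sketch of the same options used in the test above:

```python
from vowpalwabbit import pyvw

# 'u' and 'm' are the first letters of the |user and |movie namespaces;
# 4 is the rank of the low-rank quadratic interaction
model = pyvw.vw(lrq='um4', lrqdropout=True, loss_function='quantile', quiet=True)

# examples in VW text format, as in the test file
for example in ['1 |user A |movie 1',
                '2 |user B |movie 2',
                '3 |user C |movie 3']:
    model.learn(example)

print(model.predict(' |user C |movie 1'))
```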
Putting together a more coherent list of issues that can be addressed from this:
Great, thanks for the help.
So then lrq='um4' is not related to loss_function='quantile' in model = VW(convert_to_vw=False, lrq='um4', lrqdropout=True, loss_function='quantile')?
My guess is also that regularization, L1 or L2, may be added to this line model = VW(convert_to_vw=False, lrq='um4', lrqdropout=True, loss_function='quantile'), similar to the --l2 use in:
```make
@${VW} --loss_function quantile -l 0.1 -b 24 --passes 100 \
    -k --cache_file $@.cache -d $(word 2,$+) --holdout_off \
    --power_t 0.333 --l2 1.25e-7 --lrq um7 --adaptive --invariant -f $@.model
```
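My guess at what roughly the same options would look like in pyvw (not tested, dropping the cache / data-file / driver flags that belong to the command-line run):

```python
from vowpalwabbit import pyvw

# rough guess at passing the same flags as the Makefile command above,
# showing --l2 combined with --lrq and quantile loss
model = pyvw.vw("--loss_function quantile -l 0.1 -b 24 --power_t 0.333 "
                "--l2 1.25e-7 --lrq um7 --adaptive --invariant --quiet")
```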
In general, VW is a really great package!!! But for Python users it would be crucial to have examples from start to end coded in Python, starting from reading data from a file and ending with a demonstration of prediction quality.
For a Python coder, it is impossible to do the task by trying to understand a makefile like https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/demo/movielens/Makefile.