VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

Loss for CATS algorithm #4293

Closed idriss445 closed 1 year ago

idriss445 commented 1 year ago

How are the losses (the average and the last) computed for the CATS algorithm?

lokitoth commented 1 year ago

Hi @idriss445:

The CATS algorithm computes the loss that is reported using the get_loss() function, here: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/core/src/reductions/cats.cc#L58-L82.

What it does can be broken down into a few steps:

  1. Normalize and discretize the set of actions (first by their interval, then into buckets when we floor the "ac"). This turns the chosen action into an "action index", much like that of the standard CB algorithm.
  2. This index is used to compute the "center" position of the action, in other words the action we would get if we always chose the center of the bucket rather than some other point within the bandwidth of the discretization.
  3. Then we compare the logged action with this center. If the logged action falls within the bandwidth around the center, we compute a loss; otherwise the loss is zero (this functions equivalently to the indicator function).
  4. If we are computing the loss, we need to properly account for actions whose bandwidth window extends past the allowed min/max. We then use this clipped width, along with the logged probability of choosing the action, to perform an IPS-like computation over the cost of choosing the action.
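
The four steps above can be sketched in Python roughly as follows. This is only an illustration of the description, not the VW source: the function name and all parameter names are hypothetical, and the exact form of the final division (cost over logged PDF value times the clipped width) is an assumption about what "IPS-like" means here. See the linked get_loss() for the authoritative version.

```python
import math


def cats_loss_sketch(predicted_action, logged_action, logged_cost,
                     logged_pdf_value, min_value, max_value,
                     num_actions, bandwidth):
    """Hypothetical sketch of the loss steps described above."""
    # Step 1: normalize by the interval, then floor into a bucket index.
    unit_range = (max_value - min_value) / num_actions
    ac = (predicted_action - min_value) / unit_range
    action_index = min(num_actions - 1, int(math.floor(ac)))

    # Step 2: center of the bucket the predicted action falls into.
    centre = min_value + action_index * unit_range + unit_range / 2.0

    # Step 3: indicator. No loss unless the logged action is within
    # the bandwidth around the center.
    if abs(logged_action - centre) > bandwidth:
        return 0.0

    # Step 4: clip the bandwidth window to [min_value, max_value], then
    # weight the logged cost by the inverse of (pdf value * clipped width).
    clipped_width = (min(max_value, centre + bandwidth)
                     - max(min_value, centre - bandwidth))
    return logged_cost / (logged_pdf_value * clipped_width)


# 20 buckets on [0, 20] with bandwidth 1: a prediction of 7.3 maps to the
# bucket centered at 7.5, and a logged action of 8.0 lies within the bandwidth.
example_loss = cats_loss_sketch(7.3, 8.0, 1.0, 0.05, 0.0, 20.0, 20, 1.0)
```

Under this sketch, a logged action outside the bandwidth contributes zero to the reported loss, which is the indicator behaviour from step 3.
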
idriss445 commented 1 year ago

Thank you for your answer. For cats_pdf, how is the loss computed? Does it use the same method? If so, how do you choose the action from the range returned by cats_pdf?

lokitoth commented 1 year ago

Happy to help!

The cats_pdf reduction does not do sampling, so there is no loss computed, as an action is not yet chosen when it is finished with predict/learn. If you look at https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/core/src/reductions/cats_pdf.cc#L64-L73, you can see that the learn method does not do anything, beyond passing through the data to its base learner.

The cats reduction relies on cats_pdf and sample_pdf for the PDF generation and sampling.

idriss445 commented 1 year ago

When I used cats_pdf, VW did show a loss. Is that loss the result of cats_pdf and sample_pdf?

lokitoth commented 1 year ago

Do you mean that in a stack where you have cats_pdf on top you are getting progressive loss reported? That could be from reductions beneath cats_pdf.

Would it be possible for you to share what command line you are using for this setup? It will make it easier for me to point to the specifics of your setup.

idriss445 commented 1 year ago

Yes, when I have cats_pdf the progressive loss is reported. This is the command line: vw --cats_pdf 20 --min_value 0 --max_value 20 -b 20 --chain_hash --bandwidth 1 --power_t 0 --progress 10000 --learning_rate 0.1 --passes 2 -c --decay_learning_rate 0.0 --final_regressor model.pkl --data data.vw

idriss445 commented 1 year ago

Why is the loss not computed if the action is outside the bandwidth?

olgavrou commented 1 year ago

Hi @idriss445

From the command line you shared vw will (this is a generalization):

When VW returns a prediction, if you are using cats_pdf it will return a PDF over the actions, and you have to sample from that PDF yourself to decide which action you will finally use. If you use cats, then the cats reduction does the sampling from the cats_pdf output for you.
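
As a rough illustration of what "sampling over that PDF" could look like on the caller's side, here is a minimal Python sketch. It assumes the PDF is available as a list of (left, right, pdf_value) segments that are piecewise constant; that representation, the function name, and the example numbers are assumptions made for illustration, not something this thread confirms about VW's output format.

```python
import random


def sample_from_pdf(segments, rng=None):
    """Draw an action from a piecewise-constant PDF (hypothetical helper).

    segments: list of (left, right, pdf_value) tuples covering the range.
    Returns (action, pdf_value_at_action).
    """
    if rng is None:
        rng = random.Random()
    # Total probability mass; should be ~1.0 for a proper PDF.
    total_mass = sum((right - left) * density
                     for left, right, density in segments)
    if total_mass <= 0:
        raise ValueError("PDF has no probability mass")

    # Inverse-CDF sampling: walk the segments until the draw is used up.
    draw = rng.random() * total_mass
    for left, right, density in segments:
        mass = (right - left) * density
        if mass > 0 and draw <= mass:
            return left + draw / density, density
        draw -= mass

    # Floating-point rounding fallback: end of the last non-empty segment.
    for left, right, density in reversed(segments):
        if density > 0 and right > left:
            return right, density
    raise ValueError("PDF has no probability mass")


# Made-up PDF on [0, 20] with most of its mass on [8, 10].
example_pdf = [(0.0, 8.0, 0.0125), (8.0, 10.0, 0.4), (10.0, 20.0, 0.01)]
action, pdf_value = sample_from_pdf(example_pdf, random.Random(0))
```

The density at the sampled action is returned alongside the action because it is the natural candidate for the logged probability that the IPS-like loss needs later.
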

There are some resources and tutorials for you to take a look at here and here

idriss445 commented 1 year ago

Thanks a lot