VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

Standardize contextual bandit action id index to be 0-based instead of 1-based #1841

Open jackgerrits opened 5 years ago

jackgerrits commented 5 years ago

Currently, the index for contextual bandit action scores is 1-based, while elsewhere in VW all indices are 0-based. This is confusing to users and maintainers, so it should be standardized to 0-based. It is a breaking change, though, so the major version will need to be bumped and the change communicated to users.
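To make the off-by-one concrete, here is a minimal sketch of the translation a caller has to do when an action id is 1-based but the surrounding code indexes from 0. The helper names `to_zero_based` and `to_one_based` and the `actions` list are hypothetical and exist only for illustration; they are not part of the VW API.

```python
def to_zero_based(action_id: int) -> int:
    """Convert a 1-based action id into a 0-based index."""
    if action_id < 1:
        raise ValueError(f"1-based action id must be >= 1, got {action_id}")
    return action_id - 1


def to_one_based(index: int) -> int:
    """Convert a 0-based index back into a 1-based action id."""
    if index < 0:
        raise ValueError(f"0-based index must be >= 0, got {index}")
    return index + 1


# Hypothetical action set held in an ordinary (0-based) Python list.
actions = ["article_a", "article_b", "article_c"]

# Assume the learner reported the chosen action with a 1-based id.
chosen_action_id = 2
chosen = actions[to_zero_based(chosen_action_id)]
```

Forgetting the conversion here would silently select `article_c` instead of `article_b`, which is the kind of confusion this issue is about.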

arielf commented 5 years ago

@jackgerrits what exactly do you mean when you say "elsewhere in VW all indices are 0-based"? In all of the multi-class algorithms (there are several), class ids have always been 1..k. Thanks.
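For reference, a small sketch of the 1..k convention for multi-class labels, written as a formatter for VW's text input format (`label | features`). The function name `format_multiclass_example` and the feature names are hypothetical, and the range check is an illustration of the convention rather than VW's own validation.

```python
from typing import Mapping


def format_multiclass_example(
    label: int, features: Mapping[str, float], num_classes: int
) -> str:
    """Build one multi-class example line, with the class id in 1..num_classes."""
    if not 1 <= label <= num_classes:
        raise ValueError(
            f"class id must be in 1..{num_classes}, got {label}"
        )
    # Features are written as name:value pairs separated by spaces.
    feats = " ".join(f"{name}:{value}" for name, value in features.items())
    return f"{label} | {feats}"


line = format_multiclass_example(1, {"height": 1.5, "width": 2.0}, num_classes=3)
```

Under this convention a label of 0 is rejected, so a 0-based scheme for contextual bandit actions would differ from what multi-class users already write.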

lokitoth commented 5 years ago

I think this has to do with the overloading of class-id and action index in the CB stack. Since class-id is, in some sense, an internal implementation detail of CB, exposing it is unintuitive. See #1482

It might be worthwhile combining these into a single issue, since fixing it to be consistent across the different interfaces is likely to be a breaking change.