VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

Multiclass labels do not error when given a multilabel #2264

Closed jackgerrits closed 4 years ago

jackgerrits commented 4 years ago

The core of the issue here is that the number parser used for multiclass labels (and, I'm sure, other places) doesn't check that the number it parsed was the entire token. The parser should use the endptr argument of strtol to check that the entire token was consumed as the number.

The label 1,2 | ..., which is a valid multilabel, is currently interpreted as a multiclass label of 1. Instead there should be an error, as this is a clear user pitfall.
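To make the endptr idea concrete, here is a minimal sketch of a strict integer parse. The helper name parse_whole_token_as_long and the choice of exceptions are assumptions made for illustration. This is not the parser that Vowpal Wabbit actually uses, only one way the check could look with the standard strtol call.

```cpp
#include <cerrno>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the Vowpal Wabbit code base.
// Parses `token` as a base-10 integer and throws unless strtol consumed
// the whole token. Note that strtol itself skips leading whitespace.
long parse_whole_token_as_long(const std::string& token)
{
  const char* begin = token.c_str();
  char* end = nullptr;
  errno = 0;
  const long value = std::strtol(begin, &end, 10);

  // end == begin means no digits were consumed at all.
  // end short of the token's end means trailing characters such as ",2".
  if (end == begin || end != begin + token.size())
  {
    throw std::invalid_argument("label '" + token + "' is not a single integer");
  }

  // strtol reports overflow and underflow through errno.
  if (errno == ERANGE)
  {
    throw std::out_of_range("label '" + token + "' does not fit in a long");
  }

  return value;
}
```

With this check, the token 1 parses to 1, while the token 1,2 raises an error instead of being silently truncated to 1.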

RituRajSingh878 commented 4 years ago

I want to work on this issue, but I don't know where I should start.

Also, if I change some C++ files (.cpp or .cc), how should I check or test the change? (For Python files we run python setup.py install --user and python3 -m pytest ./python/tests/.)

Shivanshmundra commented 4 years ago

@RituRajSingh878 once you change the files, you can build the solution the same way you install Vowpal Wabbit from source and see how it behaves. Here is the link to do that.

Shivanshmundra commented 4 years ago

@jackgerrits While looking into the algorithms, I also looked into this problem. I could not get the solution you suggested to work because: