VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org
Other
8.48k stars 1.93k forks source link

Interactions bug #711

Closed JohnLangford closed 9 years ago

JohnLangford commented 9 years ago

There seems to be a bug in the interactions code:

echo ' ' | ../vowpalwabbit/vw -q :: -f foo; echo ' ' | ../vowpalwabbit/vw -i foo

yields: vw: the required argument for option '--interactions' is missing

trufanov-nok commented 9 years ago

It looks like something got broken with one of last commits. Most probably this one. Most probably with interactions loading from saved model here

I'll try to debug, then will see.

trufanov-nok commented 9 years ago

The problem is in same commit but anther place

@eisber has added the code to not just restore interactions/pairs/triples from model to corresponding arrays, but also copy these values to vw.args. That let him to check here if interaction args are already initialized, so he can skip -q, --cubic, --interactions processing from command line.

The problem is that model keeps already "expanded and filtered" interactions. And we recently added " " namespace to wildcards. So one of interactions generated by -q :: is " ", many others is in form of " X". The " " blows out boost::program_options code as practically means no arg for parameter was passed to the app.

I don't like the idea of copying model interactions to vw::args at all as there may be thousands of them. Do we really want -q, --cubic, --interactions params ignored in case -i was specified without any warning? If so, may be we should add a warning and escape these args processing if -i is specified instead of touching vw.args?

JohnLangford commented 9 years ago

The problem with copying is due to the expansion of ':'. Maybe we should avoid expanding it? This would make --readable_model work better also.

-John

On 06/26/2015 04:46 PM, Alexander Trufanov wrote:

The problem is in same commit but anther place https://github.com/JohnLangford/vowpal_wabbit/commit/5d0a73d86493681e7809d0c7231bb5a99bc789e7#diff-6e06dc02b6816202f8189cbcb5c08434R221

@eisber https://github.com/eisber has added the code to not just restore interactions/pairs/triples to corresponding arrays, but also copy these values to vw.args. That let him to check here https://github.com/JohnLangford/vowpal_wabbit/commit/5d0a73d86493681e7809d0c7231bb5a99bc789e7#diff-37715d20272c8511f892a5777571886aR475 if interaction args are already initialized, so he can skip -q, --cubic, --interactions processing from command line.

The problem is that model keeps already "expanded and filtered" interactions. And we recently added " " namespace to wildcards. So one of interactions generated by |-q ::| is " ", many others is in form of " X". The " " blows out boost::program_options code as practically means no arg was passed to the app.

I don't like the idea of copying model interactions to vw::args at all as there may be thousands of them. Do we really want |-q|, |--cubic|, |--interactions| params ignored in case |-i| was specified without any warning? If so, may be we should add a warning and escape these args processing if |-i| is specified instead of touching vw.args?

— Reply to this email directly or view it on GitHub https://github.com/JohnLangford/vowpal_wabbit/issues/711#issuecomment-115875716.

trufanov-nok commented 9 years ago

that's doable. Perhaps we should just save "-q --cubic --interactions " args as string among other non-standard args? But this will break model binary compatibility again. Or I can just save vw.args content instead of vw.interactions values in model file. In any case - currently saved namespaces are not only expanded, but also filtered. This means if we restore wildcards we shall expand and filter them again. And probably print all warnings if something was filtered out or sorted. So user will get a warning if model was prepared with -q :: everytime he uses it unless we suppress such warning.

And this doesn't answer: should we ignore additional -q, --cubic, --interactions if -i is specified? It seems currently these values overwrites values restored from model file.

JohnLangford commented 9 years ago

On 06/26/2015 05:03 PM, Alexander Trufanov wrote:

that's doable. Perhaps we should just save "-q --cubic --interactions " args as string among other non-standard args? But this will break model binary compatibility again. Or I can just save vw.args content instead of vw.interactions values in model file.

I'm ok with just backward compatibility.

In any case - currently saved namespaces are not only expanded, but also filtered. This means if we estore wildcards we shall expand and filter them again. And probably print all warnings if something was filtered out. So user will get a warning if model was prepared with -q :: everytime he uses it unless we suppress such warning.

That seems ok?

And this doesn't answers: should we ignore additional -q, --cubic, --interactions if -i is specified? It seems currently these values overwrites values restored from model file.

A tough call. I'd say it should be ok with a warning because I could imagine people wanting to change a model artifact with different interactions.

-John

— Reply to this email directly or view it on GitHub https://github.com/JohnLangford/vowpal_wabbit/issues/711#issuecomment-115881964.

trufanov-nok commented 9 years ago

I'm ok with just backward compatibility.

To doublecheck, that means "we should just save "-q --cubic --interactions " args as string" or " save vw.args content instead of vw.interactions values"?

I'll make a PR tomorrow.

JohnLangford commented 9 years ago

Let me go a bit further---getting things to say :: rather than an argle-bargle long list seems desirable enough to even break backwards compatibility. But I'd like to settle this soon if possible because I want to drop a release and then maintain backwards compatibility indefinitely in the future.

-John

On 06/26/2015 05:03 PM, Alexander Trufanov wrote:

that's doable. Perhaps we should just save "-q --cubic --interactions " args as string among other non-standard args? But this will break model binary compatibility again. Or I can just save vw.args content instead of vw.interactions values in model file. In any case - currently saved namespaces are not only expanded, but also filtered. This means if we estore wildcards we shall expand and filter them again. And probably print all warnings if something was filtered out. So user will get a warning if model was prepared with -q :: everytime he uses it unless we suppress such warning.

And this doesn't answers: should we ignore additional -q, --cubic, --interactions if -i is specified? It seems currently these values overwrites values restored from model file.

— Reply to this email directly or view it on GitHub https://github.com/JohnLangford/vowpal_wabbit/issues/711#issuecomment-115881964.

JohnLangford commented 9 years ago

On 06/26/2015 05:35 PM, Alexander Trufanov wrote:

I'm ok with just backward compatibility.

To doublecheck, that means "we should just save "-q --cubic --interactions " args as string" or " save vw.args content instead of vw.interactions values"?

Let's save them as strings.

I'll make a PR tomorrow.

That's great. I'll aim to drop a release as soon as possible afterwards.

-John

— Reply to this email directly or view it on GitHub https://github.com/JohnLangford/vowpal_wabbit/issues/711#issuecomment-115894481.