VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

inconsistent results when using vw daemon #4067

Open mustaphabenm opened 2 years ago

mustaphabenm commented 2 years ago

Describe the bug

I'm trying to run inference on a vw model trained with cats_pdf, using a daemon. The problem is that I'm not getting the same result for a specific context: as you can see below, out of 100 trials I got 2 different results

1-28.591745:0.006451613,28.591745-29.11138:1.5459961,29.11138-32:0.006451613

1-29.07612:0.006451613,29.07612-29.595755:1.5459961,29.595755-32:0.006451613
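The two outputs are piecewise-constant PDFs over the action range [1, 32] in a left-right:value format. A small parsing sketch (assuming that format, with non-negative bounds) makes the difference easier to see: the high-density segment shifts from roughly [28.59, 29.11] in one output to [29.08, 29.60] in the other, while the density values themselves are identical.

```python
def parse_pdf(line: str):
    """Parse a cats_pdf prediction like 'left-right:value,...' into
    (left, right, value) segments."""
    segments = []
    for chunk in line.split(","):
        bounds, value = chunk.rsplit(":", 1)
        left, right = bounds.split("-", 1)  # bounds are non-negative here
        segments.append((float(left), float(right), float(value)))
    return segments

a = parse_pdf("1-28.591745:0.006451613,28.591745-29.11138:1.5459961,29.11138-32:0.006451613")
b = parse_pdf("1-29.07612:0.006451613,29.07612-29.595755:1.5459961,29.595755-32:0.006451613")
for seg_a, seg_b in zip(a, b):
    print(seg_a, "vs", seg_b)
```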

How to reproduce

1 - train a vw model:

vw --cats_pdf 64 --min_value 1 --max_value 32 -b 20 --chain_hash --nn 120 --dropout --progress 10000 --learning_rate 0.00003 --data dummy_data.vw --final_regressor model_new_params.pkl --bandwidth 0.25981749230238499

2 - create a daemon:

vw --daemon --initial_regressor model_new_params.pkl --epsilon 0.2 --chain_hash --quiet --testonly --port 8888 --num_children 10

3 - communicate with the daemon: python ping_daemon.py
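The actual ping_daemon.py is in files.zip; a minimal sketch of what such a script might do, assuming the daemon's line protocol (one newline-terminated example in, one newline-terminated prediction out) and a placeholder context line, could look like this:

```python
import socket

HOST, PORT = "localhost", 8888  # the daemon above listens on --port 8888

# Hypothetical context line -- replace with the actual features from files.zip.
EXAMPLE = "| feature_a:1.0 feature_b:0.5\n"

def query_daemon(example: str, host: str = HOST, port: int = PORT) -> str:
    """Send one newline-terminated example to the vw daemon and return
    the prediction line it writes back."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(example.encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            if data.endswith(b"\n"):
                break
    return b"".join(chunks).decode().strip()

if __name__ == "__main__":
    # Repeat the same context many times and count distinct predictions,
    # mirroring the 100-trial check described above.
    results = {query_daemon(EXAMPLE) for _ in range(100)}
    print(f"{len(results)} distinct prediction(s):")
    for r in results:
        print(r)
```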

Version

9.1.0 (git commit: bd4998c9e)

OS

Linux

Language

CLI

Additional context

To try it, download the files from files.zip (you can start from step 2).

martininstadeep commented 2 years ago

We found that the inconsistency comes from using dropout. Without dropout, the predictions are consistent.

However, shouldn't there be an option to turn off dropout during inference?

pmineiro commented 2 years ago

However, shouldn't there be an option to turn off dropout during inference?

Yes. I'm surprised this doesn't happen already, and would consider it a bug.

For --lrq dropout, activating --testonly produces mean-field results. However, for --nn, --testonly does nothing.
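For intuition (this is an illustration of the distinction, not VW's internal code): sampling a fresh dropout mask at prediction time gives run-to-run variation, while the mean-field approximation scales activations by the keep probability and is deterministic. A minimal numpy sketch with hypothetical weights:

```python
import numpy as np

rng = np.random.default_rng()

def hidden(x, W, drop=0.5, sample=True):
    """One hidden layer with dropout.

    sample=True  : draw a fresh dropout mask each call, so repeated
                   predictions on the same input differ.
    sample=False : mean field -- scale activations by the keep
                   probability instead of masking, so the output is
                   deterministic.
    """
    h = np.tanh(W @ x)
    if sample:
        mask = rng.random(h.shape) >= drop
        return h * mask
    return h * (1.0 - drop)

x = np.ones(4)
W = np.arange(12, dtype=float).reshape(3, 4) / 10.0

print(hidden(x, W, sample=True))   # changes from run to run
print(hidden(x, W, sample=True))
print(hidden(x, W, sample=False))  # stable
```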