r-angi closed this issue 3 years ago
Yes, it is caused by --save_resume.
Another way to reproduce:
vw --cb_adf -d train.dat --invert_hash model.readable -f model.bin
=> model.readable has feature names
vw --cb_adf -d train.dat --invert_hash model.readable -f model.bin --save_resume
=> model.readable is without feature names
This needs to be investigated.
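For anyone who wants to check the two runs mechanically, here is a minimal sketch in Python. It assumes `vw` is on the PATH, that `train.dat` is in the working directory, and that the readable model has a `:0` line separating the header from the weights, with named entries written as `name:index:weight` and unnamed ones starting with a numeric index. The function names (`weight_lines`, `has_feature_names`, `repro_commands`, `run_repro`) are hypothetical helpers, not part of VW, and the name check is only a heuristic: a feature whose name is purely numeric would be missed.

```python
import shutil
import subprocess


def weight_lines(readable_model_text):
    """Return the non-empty lines after the ':0' marker.

    Assumption: the weight section follows a line that is exactly ':0'.
    If no such line exists, an empty list is returned.
    """
    lines = readable_model_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == ":0":
            return [l for l in lines[i + 1:] if l.strip()]
    return []


def has_feature_names(readable_model_text):
    """Heuristic: True if any weight line starts with a non-numeric field."""
    for line in weight_lines(readable_model_text):
        first_field = line.split(":", 1)[0].strip()
        if not first_field.isdigit():
            return True
    return False


def repro_commands(data="train.dat", readable="model.readable", model="model.bin"):
    """Build the two command lines from the comment above."""
    base = ["vw", "--cb_adf", "-d", data, "--invert_hash", readable, "-f", model]
    return {
        "without_save_resume": base,
        "with_save_resume": base + ["--save_resume"],
    }


def run_repro(readable="model.readable"):
    """Run both commands and report whether each output has feature names."""
    if shutil.which("vw") is None:
        raise RuntimeError("vw executable not found on PATH")
    results = {}
    for label, cmd in repro_commands(readable=readable).items():
        subprocess.run(cmd, check=True)
        with open(readable) as f:
            results[label] = has_feature_names(f.read())
    return results
```

Calling `run_repro()` in a directory containing `train.dat` returns a dictionary with one boolean per run. If the behavior described above holds, the entry for the `--save_resume` run would be `False`.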
I'm not an expert on this code base, but I took a stab at finding a link between save_resume and invert_hash. My initial thought is that something here on L115 could be causing the size of this feature space to be 0. I could be completely off, though, if the readable model isn't actually created from this audit_regressor.cc file and the inversion of the hash values to strings isn't done in audit_regressor_interaction().
features& fs = ec.feature_space[(size_t)*i]
https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/audit_regressor.cc#L115-L126
Problem Description
I have trained an online contextual bandit using --save_resume. I was attempting to debug interaction coefficients, and unfortunately the --invert_hash output for cb_explore_adf with interactions looks the same as the --readable_model output and does not include the string names for features. I've looked through the source code, but could not determine whether this is intentional or a bug. If it is intentional, the documentation on --invert_hash might need to be updated.
It may be due to the --save_resume argument, but I have no evidence to back this up.
To Reproduce
train.dat (a 2-example training set)
Running this command provides me with a binary model output:
Expected behavior
Running the following command should produce a model output with feature names as shown in the --invert_hash documentation.
The same thing happens with -k present or not; vw also mentions "using no cache" in its output. The same thing happens with -t present or not.
Observed Behavior
Output in model.humanreadable is the same as if I replaced --invert_hash with --readable_model:
Additional Context
One thing I did notice is that if I use the -a audit parameter, I do get unhashed string names back. In most places in the code where I saw all.hash_inv, I saw it in conjunction with the all.audit boolean, which makes me wonder why audit would work correctly but invert hash does not.
Here is the terminal output using the -a parameter:
Environment
VW version 8.8.1 on OSX using cmd line interface.