John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm
Using the same cache file for training and predicting results in inconsistent average loss (results) #6
Open
viveksck opened 10 years ago
Brief description of the problem:
Step 1: Train a model using some data:
Result:
average loss = 0.0286955
best constant = 0
total feature number = 26144934
Step 2: Use the dumped model to predict on the exact same training data.
Result:
average loss = 0.144631
best constant = -4.91111e-06
total feature number = 65362336
Note the average loss.
Step 3: Use the dumped model to predict on 4 epochs of the training data.
Result:
average loss = 0.0342106
best constant = -1.22777e-06
total feature number = 261449344
The average loss reported across 4 epochs must be exactly the same as the one reported in Step 2, but it is not. This is the inconsistency.
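To spell out why the two numbers should match, here is a minimal sketch. It assumes the model weights are frozen while predicting, so every example incurs the same loss on every pass. The loss values and the `average_loss` helper are made up for illustration and do not come from the actual run.

```python
# Hypothetical per-example losses from one prediction pass with frozen weights.
single_pass_losses = [0.25, 0.0, 0.5, 0.125]

def average_loss(losses, passes=1):
    """Average loss when the same examples are replayed `passes` times."""
    replayed = losses * passes  # same losses repeated, since nothing is learned
    return sum(replayed) / len(replayed)

one_pass = average_loss(single_pass_losses, passes=1)
four_passes = average_loss(single_pass_losses, passes=4)
print(one_pass, four_passes)
```

Under that assumption, replaying the data only repeats the same losses, so the average cannot change with the number of epochs.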
However, if I repeat the above steps using a copy of the training file, the results are consistent.
Proof: Copy the training file to a new file: rami_eng_train_2.vw
Step 4: Predict on the training data
Result:
average loss = 0.144631
best constant = -4.91111e-06
total feature number = 65362336
Step 5: Predict on 4 epochs of the same training data
Result:
average loss = 0.144631
best constant = -1.22777e-06
total feature number = 261449344
Now we see the same average loss in Steps 4 and 5 as expected.
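As a quick sanity check on the reported figures, the sketch below compares the 1-epoch and 4-epoch outputs. The variable names are my own; the numbers are copied from the results above. It only checks ratios and does not explain where the difference in Step 3 comes from.

```python
# Values copied from the 1-epoch and 4-epoch prediction results above.
features_one_pass = 65362336
features_four_passes = 261449344
best_constant_one_pass = -4.91111e-06
best_constant_four_passes = -1.22777e-06

# Ratio of the totals between the 4-epoch and 1-epoch runs.
feature_ratio = features_four_passes / features_one_pass
constant_ratio = best_constant_one_pass / best_constant_four_passes
print(feature_ratio)
print(round(constant_ratio, 3))
```

The total feature number scales by exactly 4 and the best constant shrinks by about 4, in both the cache-file run and the copied-file run. The average loss is the only figure that differs between the two.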
Please let me know if you need the full outputs. I don't see how I can attach files here.