AvinashBukkittu closed this pull request 4 years ago.
Merging #311 into master will not change coverage. The diff coverage is n/a.
@@           Coverage Diff           @@
##           master     #311   +/-   ##
=======================================
  Coverage   79.91%   79.91%
=======================================
  Files         133      133
  Lines       11135    11135
=======================================
  Hits         8899     8899
  Misses       2236     2236
Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95f3d64...6d73a8c. Read the comment docs.
Have you rerun the fine-tuning experiments? It would be best to ensure that we're doing it right, and that the results improve compared to when we did it wrong. We might also want to update the README with the new outputs.
I did not completely re-run the experiments. I re-ran training up to the first `_eval_epoch` call and checked the mode and the current dataset in the iterator right after that call: the mode was still `eval` and the current dataset was the eval dataset, which is wrong (see the sketch below).
Sure, I can update the README with the new results.
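For context, here is a minimal, self-contained sketch of the failure mode described above. The class and method names (`DataIterator`, `switch_to_dataset`) are simplified stand-ins chosen for illustration, not the actual texar-pytorch implementation; the point is only that a shared iterator left on the eval split after `_eval_epoch` silently feeds eval data back into training unless it is switched back.

```python
# Toy illustration of the mode-switching bug (all names simplified; not the
# project's real code).

class DataIterator:
    """Stand-in for an iterator shared across train/eval datasets."""
    def __init__(self):
        self._current = "train"

    def switch_to_dataset(self, mode: str) -> None:
        self._current = mode

    @property
    def current_dataset_name(self) -> str:
        return self._current


def _eval_epoch(iterator: DataIterator) -> None:
    # Evaluation switches the shared iterator to the eval split ...
    iterator.switch_to_dataset("eval")
    # ... run evaluation batches here ...
    # Bug: without switching back, the caller keeps consuming eval data.


def _train_epoch(iterator: DataIterator, eval_every: int = 100,
                 num_iters: int = 300) -> None:
    iterator.switch_to_dataset("train")
    for step in range(1, num_iters + 1):
        # ... one training step on the iterator's current dataset ...
        if step % eval_every == 0:
            _eval_epoch(iterator)
            # Fix: restore the training split after mid-epoch evaluation.
            iterator.switch_to_dataset("train")
            assert iterator.current_dataset_name == "train"
```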
Hmm, interesting... The loss values now are much lower than before (by one order of magnitude), starting from iteration 250, and that seems a bit weird to me. Also, I checked the code at the commit where the README was modified to include the results: https://github.com/asyml/texar-pytorch/blob/e4b68188388dbaa07791528d066b410a6f838de7/examples/bert/config_data.py#L9
It seems that `eval_step` was set to -1, which means no evaluation is performed until the end of training. In this case, the previous results were not incorrect (although the implementation was still faulty).
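For reference, a hedged sketch of the config knob being discussed; the variable name follows the thread's `eval_step`, and the surrounding values are purely illustrative rather than copied from the linked config_data.py.

```python
# Illustrative training-data config in the spirit of examples/bert/config_data.py.
# Only the eval_step semantics matter here; the other value is made up.

display_steps = 50   # print training loss every 50 iterations (illustrative)

# -1: skip _eval_epoch during training and evaluate only after training ends,
# so the mode-switching bug never fires and the old README numbers stay valid.
# Any positive N would run evaluation every N iterations, exposing the bug.
eval_step = -1
```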
We start noticing a stark difference at the 300th iteration. Agreed; previously, after the first call to `_eval_epoch()`, the mode was still set to `eval`.
This PR fixes #310.