j6mes closed this issue 7 years ago
It was just a mental note. Might need to fix this: the IDF values are computed on the entire dataset (including our hold-out dev set). When we have new documents, should the IDF come from the training set?
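A minimal sketch of the train-only alternative raised here, not the repository's actual code. The function names `fit_idf` and `tfidf`, the toy documents, the whitespace tokenisation, and the smoothed IDF formula are all assumptions made for illustration. The point is only that document frequencies are counted on the training split, and the resulting table is reused unchanged for dev, test, or new documents.

```python
import math
from collections import Counter


def fit_idf(train_docs):
    """Compute smoothed IDF weights from the training documents only."""
    n_docs = len(train_docs)
    doc_freq = Counter()
    for doc in train_docs:
        # Count each term at most once per document.
        doc_freq.update(set(doc.lower().split()))
    # Smoothed IDF: log((1 + N) / (1 + df)) + 1, so every weight is positive.
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """Weight one document with a previously fitted IDF table.

    Terms that never occurred in the training set are dropped; this is one
    possible out-of-vocabulary policy and an assumption of this sketch.
    """
    counts = Counter(doc.lower().split())
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


# Hypothetical toy data standing in for the real splits.
train_docs = ["the cat sat", "the dog barked", "a cat meowed"]
dev_docs = ["the cat barked", "unseen words here"]

idf = fit_idf(train_docs)  # statistics come from the training split only
dev_vectors = [tfidf(d, idf) for d in dev_docs]  # dev docs never affect the IDF
```

With this split, held-out documents are only ever transformed, so their vocabulary and document frequencies cannot leak into the features used at training time.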
On Tue, May 30, 2017 at 8:54 AM, Andreas Vlachos notifications@github.com wrote:
What do you mean? If we are doing it on the training/dev data, we have to do it on test, otherwise we shouldn't.