Open selveszero opened 5 years ago
It's OK to use the same data for fine tuning and testing.
Hi, there is a big gap between the results I get when using the same data file for xst and xsu in test.py (and for s1['test']['path'] and s1['unlab']['path'] in data2.py) and the results I get when using different training and test data. How should I handle this? And which setup is better?
Looking forward to your reply! Thanks.
@ZhihuiChen0903 Using the training data as the test data will generally give much better metrics, since the model is evaluated on exactly the data it was tuned on. Isn't that the reason for the gap? That's not necessarily bad in practice for an application, but for research purposes it's not good, since it doesn't tell you robustly whether the model will work well on new data, even similar data. You might also cause the model to overfit, which would make it worse in application as well, though since this is unsupervised training, hopefully that's less likely to occur.
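If it helps, here is a minimal sketch of one way to keep the fine-tuning and test samples disjoint, so the reported metrics reflect unseen data. This is only an illustration: `split_indices`, `test_fraction`, and `seed` are hypothetical names of my own and are not part of test.py or data2.py, and I'm assuming the samples can be addressed by integer index.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return two disjoint, sorted index lists: (finetune_idx, test_idx).

    Hypothetical helper for illustration; not part of the repository.
    """
    if n_samples < 2:
        raise ValueError("need at least 2 samples to make a split")
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    # Shuffle with a private, seeded RNG so the split is reproducible
    # and does not disturb the global random state.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Keep at least one sample on each side of the split.
    n_test = int(round(n_samples * test_fraction))
    n_test = min(max(n_test, 1), n_samples - 1)

    test_idx = sorted(indices[:n_test])
    finetune_idx = sorted(indices[n_test:])
    return finetune_idx, test_idx


finetune_idx, test_idx = split_indices(100, test_fraction=0.2, seed=0)
```

You would then write the two subsets to separate files and point the unlabeled (fine-tuning) path and the test path at different files, rather than at the same one. Fixing the seed keeps the split identical across runs, so results stay comparable.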
Is it OK to use the same data for fine-tuning and testing?
(Using the same data file for xst and xsu in test.py and s1['test']['path'] and s1['unlab']['path'] in data2.py)