The problem of choosing the best model

peter943 commented 4 years ago

Hi. I am a PhD researcher working on time series classification (TSC) with the UCR archive. I have read several papers on deep learning for TSC, such as MC-DNN, ResNet, and InceptionTime, and I am confused about how to choose the best model.

In my opinion, choosing the model with the minimum training loss may be a good choice, but the datasets in the UCR archive are relatively small, so it is easy to overfit and performance degrades. The other choice is to pick the model with the maximum test accuracy. I ran some experiments with different neural networks and found that this model outperforms the one selected by minimum training loss. The reason is obvious: the selection is made on the test set itself, which is also a flaw.

Moreover, I found that many papers do not state how they selected their model. So, if I want to write a paper, which selection method is the right one? I want my model to be both accurate and reasonable. If I select by minimum training loss, which is more reasonable, it may be hard to surpass models selected by maximum test accuracy, so reviewers might reject my proposed model on accuracy grounds. Could you please give me some suggestions on model selection? Thanks for your kind help.

TonyBagnall commented 4 years ago

Hi, I have often found that deep learning papers are ambiguous about how they arrived at their model, and I often suspect manual overfitting, doing something like: "The other choice is choosing the model corresponding to the maximum test accuracy". This is wrong; it's wildly biased. Do not do this! I think a lot of this goes on with deep learning, and it distorts the picture. I personally do not research deep learning algorithms for TSC, for the reasons you describe. I would suggest you contact Hassan Fawaz and Germain Forestier, authors of the deep learning bake off for TSC. They are good guys and may have more insights than me.
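
To make the bias concrete, here is a minimal simulation sketch; it is not taken from any paper or from the tsml code. It assumes every checkpoint has the same true accuracy (0.8) and that measured accuracy on a 100-case split is just binomial noise around it. The function names, the parameter values, and the use of a separate validation split held out from the training data are all illustrative assumptions, not something proposed earlier in this thread.

```python
import random


def noisy_accuracy(rng, true_acc, n_samples):
    """Measured accuracy on n_samples cases when each is correct with prob true_acc."""
    correct = sum(rng.random() < true_acc for _ in range(n_samples))
    return correct / n_samples


def simulate(n_checkpoints=50, n_samples=100, true_acc=0.8, n_trials=300, seed=0):
    """Compare two ways of reporting accuracy for the 'best' checkpoint.

    Returns the mean reported test accuracy when the checkpoint is chosen
    (a) by maximum test accuracy and (b) by maximum validation accuracy.
    """
    rng = random.Random(seed)
    by_test_total = 0.0
    by_val_total = 0.0
    for _ in range(n_trials):
        # All checkpoints are equally good; only the measurements differ.
        val = [noisy_accuracy(rng, true_acc, n_samples) for _ in range(n_checkpoints)]
        test = [noisy_accuracy(rng, true_acc, n_samples) for _ in range(n_checkpoints)]

        # (a) Select on the test set and report that same number.
        by_test_total += max(test)

        # (b) Select on validation, then read the test score of that checkpoint.
        best_val_idx = max(range(n_checkpoints), key=val.__getitem__)
        by_val_total += test[best_val_idx]
    return by_test_total / n_trials, by_val_total / n_trials


by_test, by_val = simulate()
print(f"true accuracy:            0.800")
print(f"selected on test set:     {by_test:.3f}")
print(f"selected on validation:   {by_val:.3f}")
```

Under these assumptions the test-selected figure comes out well above the true accuracy even though no checkpoint is actually better, while the validation-selected figure stays close to it. With small test sets, as in much of the UCR archive, the gap can be large.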

peter943 commented 4 years ago

OK, I will contact them later. I am grateful for your help.

time-series-machine-learning / tsml-repo

The problem of choosing the best model #40