thiyangt / seer

Feature-based Forecast Model Selection (FFORMS)
https://thiyangt.github.io/seer/

Possible improvement to reduce user processing time #4

Closed: VicenteYago closed this issue 4 years ago

VicenteYago commented 5 years ago

This is not properly an issue, but I don't know where else to post this.

I have observed that Professor Hyndman, Pablo Montero, you, and others frequently use the M3/M4 datasets to train the meta-learning classifier.

The results of these incredible tools you have developed depend on the number of time series used to train the classifier.

If I am right, you have successfully participated in these competitions, so at some point you must have trained the models on all of these time series.

My question is: why don't you provide the configuration of the trained classifier? Users would then have a huge advantage when using your software, saving precious computing time that would otherwise be spent training the classifier on the M3/M4 competition series.

For example, you could provide the classifier already trained, or a kind of database mapping each series' features to the recommended model.
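Something along these lines would already save everyone the training step. This is just a sketch of the idea with toy data; all object and column names are invented (I use `randomForest` since FFORMS is based on a random forest):

```r
library(randomForest)

# Toy stand-in for the real offline training set: one row per series,
# feature columns plus the label of the best-performing forecast model.
set.seed(1)
training_data <- data.frame(
  trend       = runif(100),
  entropy     = runif(100),
  seasonality = runif(100),
  best_model  = factor(sample(c("ets", "arima", "theta"), 100, replace = TRUE))
)

# Offline phase, done once by the developers: fit the classifier.
fforms_classifier <- randomForest(best_model ~ ., data = training_data)

# Ship the fitted object with the package or as a separate download.
saveRDS(fforms_classifier, "fforms_classifier.rds")

# Online phase, done by every user: load the classifier and predict the
# best model class for a new series from its features; no retraining.
clf <- readRDS("fforms_classifier.rds")
new_features <- data.frame(trend = 0.8, entropy = 0.3, seasonality = 0.6)
predict(clf, newdata = new_features)
```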

What you are doing is as if someone developed an incredible neural network for face detection and, instead of releasing it already trained with all its weights, left users to spend hours and hours training the model on the same heavy dataset the developers used.

In terms of your terminology, a predetermined offline phase should already be provided.

Maybe in the form of subsets of M3/M4: for example, one classifier for yearly series, another for hourly or quarterly series, and so on, as in the sketch below.
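Following the workflow shown in the seer README (I'm assuming the `cal_features` / `fcast_accuracy` / `prepare_trainingset` call forms documented there are current), the offline phase for the yearly subset would look roughly like this:

```r
library(seer)
library(Mcomp)  # provides the M3 competition data

yearly_m3 <- subset(M3, "yearly")

# Offline phase for one frequency: compute series features and the
# forecast accuracy of each candidate model, then build the training set.
features_m3y <- cal_features(yearly_m3, database = "M3", h = 6,
                             highfreq = FALSE)
accuracy_m3y <- fcast_accuracy(tslist = yearly_m3,
                               models = c("arima", "ets", "rw", "rwd",
                                          "theta", "nn"),
                               database = "M3", cal_MASE, h = 6,
                               length_out = 1, fcast_save = TRUE)
training_m3y <- prepare_trainingset(accuracy_set = accuracy_m3y,
                                    feature_set = features_m3y)$trainingset

# Repeating these steps on subset(M3, "quarterly"), "monthly", etc. gives
# one training set, and hence one classifier, per frequency.
```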

thiyangt commented 5 years ago

Yes, the pre-trained classifiers will be uploaded to the package soon.
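Once uploaded, using a bundled classifier should reduce to something like the following sketch; the object name below is a placeholder, not the final name, and I'm assuming `database = "other"` is how `cal_features` marks series outside the M competitions:

```r
library(seer)

# Placeholder name; the actual pre-trained object shipped with the
# package may be named differently.
data("pretrained_rf_yearly", package = "seer")

# Any yearly series the user wants to forecast (toy data here).
my_yearly_ts <- ts(rnorm(30), frequency = 1)

# Compute the features of the new series, then predict the best model
# class with the bundled classifier; no offline training needed.
my_features <- cal_features(list(my_yearly_ts), database = "other",
                            h = 6, highfreq = FALSE)
predict(pretrained_rf_yearly, newdata = my_features)
```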

VicenteYago commented 5 years ago

Great, I'm glad to hear that.