holgerteichgraeber / TimeSeriesClustering.jl

Julia implementation of unsupervised learning methods for time series datasets. It provides functionality for clustering and aggregating, detecting motifs, and quantifying similarity between time series datasets.

MIT License

82 stars 22 forks

Automatic testing #91

Open holgerteichgraeber opened 5 years ago

holgerteichgraeber commented 5 years ago

Implement automatic testing for clustering methods and extreme value selection.

holgerteichgraeber commented 5 years ago

I am thinking about the following case and was wondering whether you have run into something similar with the automatic testing in CapacityExpansion, @YoungFaithful?

When we call run_clust with kmeans or kmedoids, it always initializes randomly. Even with a large number of initializations, there is always a chance that we do not end up with the same clusters.

Do you use kmeans or kmedoids as a base step for CapacityExpansion testing, or only hierarchical?
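One common way to make a randomly initialized method testable is to hand it a seeded random number generator, so that two runs with the same seed must agree exactly. The sketch below illustrates this with a toy k-means written only for this example. It is not run_clust, and whether run_clust accepts a seed or an RNG is an assumption this thread does not confirm; the name toy_kmeans and its arguments are hypothetical.

```julia
using Random, Statistics

# Toy Lloyd's k-means on the columns of X (features × observations).
# Hypothetical helper for illustration only; `rng` controls the random
# choice of the initial centers, which is the only source of randomness.
function toy_kmeans(rng::AbstractRNG, X::AbstractMatrix{<:Real}, k::Integer; iters::Integer=50)
    n = size(X, 2)
    centers = float.(X[:, randperm(rng, n)[1:k]])
    assign = zeros(Int, n)
    for _ in 1:iters
        # Assignment step: nearest center by squared Euclidean distance.
        for j in 1:n
            assign[j] = argmin([sum(abs2, X[:, j] .- centers[:, c]) for c in 1:k])
        end
        # Update step: mean of each non-empty cluster.
        for c in 1:k
            members = findall(==(c), assign)
            isempty(members) || (centers[:, c] = vec(mean(X[:, members], dims=2)))
        end
    end
    return centers, assign
end

# Two small synthetic blobs, generated from fixed seeds so no data is loaded.
X = hcat(0.1 .* randn(MersenneTwister(1), 2, 20),
         5.0 .+ 0.1 .* randn(MersenneTwister(2), 2, 20))

# Same seed for the initialization, so both runs must be identical.
c1, a1 = toy_kmeans(MersenneTwister(42), X, 2)
c2, a2 = toy_kmeans(MersenneTwister(42), X, 2)
same_seed_same_result = (c1 == c2) && (a1 == a2)
```

Seeding only guarantees reproducibility for a fixed Julia version and RNG type, so a test that hard-codes expected cluster centers may still need a tolerance or a deterministic method such as hierarchical clustering as its reference.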

holgerteichgraeber commented 5 years ago

Note for the future: potentially use https://github.com/JuliaPlots/VisualRegressionTests.jl

holgerteichgraeber commented 5 years ago

Let's name the test files in the test folder similar to the files in the src folder.

I am keeping a running list of tests that still need to be implemented. If possible, these tests should not load any data (see for example the tests in utils.jl), but rather test the functionality on simple examples.

entire intraperiod segmentation file
load_data.jl : combine_timeseries_weather_data()
utils.jl : run_pure_clust() (let's also think about whether there is a more descriptive name)
optim_problems.jl (Holger)
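As an illustration of a test that loads no data, the sketch below checks a small helper against a hand-written matrix inside a @testset from Julia's Test standard library. The helper zscore_rows is hypothetical and defined inline only so that the example runs on its own; it stands in for whichever function from the list above is under test.

```julia
using Test, Statistics

# Hypothetical helper, defined here only for the example:
# scale each row of a matrix to zero mean and unit standard deviation.
function zscore_rows(X::AbstractMatrix{<:Real})
    mu = mean(X, dims=2)
    sigma = std(X, dims=2)
    return (X .- mu) ./ sigma
end

@testset "zscore_rows on hand-written data" begin
    X = [1.0 2.0 3.0; 10.0 20.0 30.0]   # tiny inline input, no files are read
    Z = zscore_rows(X)
    @test size(Z) == size(X)
    @test all(isapprox.(mean(Z, dims=2), 0.0; atol=1e-12))
    @test all(isapprox.(std(Z, dims=2), 1.0; atol=1e-12))
    @test Z[1, :] ≈ Z[2, :]   # rows differ only by scale, so they normalize identically
end
```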

holgerteichgraeber commented 5 years ago

@YoungFaithful: What was the issue in CapacityExpansion with the number of significant digits of the input data? Where did it show up? Was it a small error in the optimization when you were running your tests, or something more significant? Somehow kmeans + centroid does not perform reliably: when we run ] test ClustForOpt, it fails. I guess this is due to rounding/machine error, or alternatively to random initializations.
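If rounding is the cause, one option is to compare cluster centers with a tolerance instead of exact equality, and to make the comparison independent of the arbitrary order of the cluster labels. The sketch below is a minimal version of that idea using only Base and the Test standard library. The names sort_centers and centers_match and the tolerance values are assumptions chosen for illustration, not part of ClustForOpt.

```julia
using Test

# Order the centers (columns) lexicographically so that the comparison
# does not depend on which cluster happened to get which label.
sort_centers(C::AbstractMatrix) = sortslices(C, dims=2)

# Hypothetical helper: true if both sets of centers agree up to label
# order and a small numerical tolerance (tolerances are illustrative).
centers_match(A, B; atol=1e-8, rtol=1e-6) =
    size(A) == size(B) &&
    isapprox(sort_centers(A), sort_centers(B); atol=atol, rtol=rtol)

reference = [0.0 5.0; 0.0 5.0]

# Same centers with swapped labels and a small floating-point perturbation.
delta = 1e-10
result = [(5.0 + delta) 0.0; (5.0 - delta) 1e-12]

@test centers_match(result, reference)
@test !centers_match(result, reference .+ 0.1)
```

Lexicographic ordering can flip when two centers have nearly equal first coordinates, so this check suits well-separated test data. It also does not help if different random initializations converge to genuinely different local optima; that case needs a fixed seed or a deterministic initialization.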