DavideNardone / MTSS-Multivariate-Time-Series-Software

A GP-GPU/CPU Dynamic Time Warping (DTW) implementation for the analysis of Multivariate Time Series (MTS).
MIT License

How to interpret output? #4

Closed by krischer 5 years ago

krischer commented 5 years ago

I actually don’t understand what the output of the software is. All I could do is capture the stdout but I don’t really see how to get the actual results of the classification or subspace searches. Judging from the readme this package is intended to be used as a command line utility and not a library so I assume there must be a way to get the actual results of the computation.

Part of a review at: https://github.com/openjournals/joss-reviews/issues/1049

DavideNardone commented 5 years ago

Since the software is presented as a command-line tool, there is currently no way to capture the output other than redirecting it to a file. On the other hand, as you may have noticed, the software is written in a modular way, so anyone can use the API without going through the command-line tool. Regarding the other issues you opened about the undocumented API, I will provide the documentation as soon as possible.
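
For anyone who wants to post-process the results, the redirection mentioned above can also be done from a script. The following is only a minimal sketch: `run_and_capture` is a hypothetical helper, and the command passed to it in the demo is a stand-in, because the actual MTSS invocation and its flags are not given in this thread.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_capture(cmd, log_path):
    """Run cmd, redirect its stdout to log_path, and return the logged lines."""
    log_path = Path(log_path)
    with log_path.open("w") as log:
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, stdout=log, check=True)
    return log_path.read_text().splitlines()


if __name__ == "__main__":
    # Stand-in command (an assumption): replace it with the real MTSS call.
    demo_cmd = [sys.executable, "-c", "print('0 gt:21 RI:21 163.679.337')"]
    with tempfile.TemporaryDirectory() as tmp:
        for line in run_and_capture(demo_cmd, Path(tmp) / "mtss_output.log"):
            print(line)
```

Writing to a file first, instead of reading a pipe, keeps the full log available for later inspection, which is useful when a cross-validation run takes minutes.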

To better understand the two tasks, I can refer you to a PowerPoint presentation of mine that covers them, although it considers only one-dimensional time series. You can find the ppt here.

Regarding the CLASSIFICATION task, you usually get these outputs:

id cls pred meas
0 gt:21 RI:21 163.679.337
1 gt:22 RI:22 111.290.268
2 gt:22 RI:22 102.833.076
3 gt:17 RI:17 77.027.679

...

where id identifies the time series being classified, cls is the class the series actually belongs to (the gt: value), pred is the predicted class, and meas is the closest similarity measure obtained for the current time series on the training set.
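
As an illustration of how these rows could be consumed once stdout has been redirected to a file, here is a minimal sketch that parses them and recomputes an accuracy. It assumes the row layout is exactly the one quoted in this thread; `parse_row` and `accuracy` are hypothetical helpers, not part of MTSS, and the tool may compute its own accuracy differently.

```python
import re
from typing import NamedTuple, Optional

# Matches rows such as "0 gt:21 RI:21 163.679.337".
# The prefix before the predicted class ("RI" in the thread) is kept generic.
ROW_RE = re.compile(r"^\s*(\d+)\s+gt:(\d+)\s+\w+:(\d+)\s+(\S+)\s*$")


class Row(NamedTuple):
    idx: int
    true_class: int
    pred_class: int
    meas: str  # kept as text: the digit grouping of this field is not documented


def parse_row(line: str) -> Optional[Row]:
    """Return a Row for a classification line, or None for any other line."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    return Row(int(m.group(1)), int(m.group(2)), int(m.group(3)), m.group(4))


def accuracy(rows) -> float:
    """Percentage of rows whose predicted class equals the true class."""
    rows = list(rows)
    if not rows:
        raise ValueError("no classification rows found")
    hits = sum(r.true_class == r.pred_class for r in rows)
    return 100.0 * hits / len(rows)


# Sample rows copied from the thread; the header and "..." lines are skipped.
sample = """id cls pred meas
0 gt:21 RI:21 163.679.337
1 gt:22 RI:22 111.290.268
2 gt:22 RI:22 102.833.076
3 gt:17 RI:17 77.027.679
...
"""
rows = [r for r in map(parse_row, sample.splitlines()) if r is not None]
print(len(rows), accuracy(rows))
```

The meas field is left as a string on purpose, since the thread does not say how values such as 163.679.337 should be read numerically.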

In the next release of the software I will add a new flag (verbose) that lets you suppress some or all of this per-series output if you are not interested in it, so that only part of it is displayed, or just the summary information for each iteration of the k-fold cross-validation, as shown below.

k-th fold iteration
Execution time for CLASSIFICATION w/ CPU using DEPENDENT-DTW: 181928.203125 ms
Regular Accuracy is 60.705883
The Error rate is 0.392941
...

and finally:

Regular Accuracy mean is 53.440796
The Error rate mean is 0.465592
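
The summary values quoted above are consistent with the error rate being 1 minus the accuracy expressed as a fraction (for example, 1 - 0.60705883 ≈ 0.392941). This is an observation about the sample numbers, not a documented guarantee. A minimal sketch that extracts the pairs and checks the relation, with `parse_summary` as a hypothetical helper:

```python
import re

# Both the per-fold lines and the final "mean" lines are matched.
ACC_RE = re.compile(r"Regular Accuracy(?: mean)? is (\d+(?:\.\d+)?)")
ERR_RE = re.compile(r"The Error rate(?: mean)? is (\d+(?:\.\d+)?)")


def parse_summary(text):
    """Return (accuracy_percent, error_rate) pairs in order of appearance."""
    accs = [float(x) for x in ACC_RE.findall(text)]
    errs = [float(x) for x in ERR_RE.findall(text)]
    return list(zip(accs, errs))


# Summary lines copied from the thread.
log = """Regular Accuracy is 60.705883
The Error rate is 0.392941
Regular Accuracy mean is 53.440796
The Error rate mean is 0.465592
"""

for acc, err in parse_summary(log):
    # Assumed relation: error rate = 1 - accuracy / 100, up to print rounding.
    consistent = abs(err - (1.0 - acc / 100.0)) < 1e-5
    print(acc, err, consistent)
```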

For the SUB-SEQUENCE SEARCH task, instead, you get the following output:

curr val diff. [0]: 3506306.000000
curr val diff. [1]: 3506306.500000
curr val diff. [2]: 3506307.500000
...

where each row is the DTW similarity measure obtained at one position as the query time series slides along the target one point at a time. With the new verbose flag, you can skip some or all of these rows and display only the final information, that is:

GPU_GM version w/ min.index value 2073, min. value: 3504022.500000

which reports the index at which the minimum DTW value was found while sliding the query time series along the target, together with that minimum value.
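
The same minimum can be recovered from the per-position rows if they have been saved to a file. The sketch below assumes the "curr val diff. [i]: value" layout quoted in this thread; `parse_steps` and `best_match` are hypothetical helpers, not part of MTSS. Only the three rows shown in the thread are used as sample data, so the result here does not reproduce the index 2073 of the full run.

```python
import re

# Matches entries such as "curr val diff. [0]: 3506306.000000".
STEP_RE = re.compile(r"curr val diff\. \[(\d+)\]: (\d+(?:\.\d+)?)")


def parse_steps(text):
    """Map each sliding position to the DTW value printed for it."""
    return {int(i): float(v) for i, v in STEP_RE.findall(text)}


def best_match(steps):
    """Return (index, value) of the smallest DTW value.

    On ties, the position that was printed first wins.
    """
    if not steps:
        raise ValueError("no sub-sequence search rows found")
    idx = min(steps, key=steps.get)
    return idx, steps[idx]


# First three rows copied from the thread (the full run has many more).
log = """curr val diff. [0]: 3506306.000000
curr val diff. [1]: 3506306.500000
curr val diff. [2]: 3506307.500000
"""

index, value = best_match(parse_steps(log))
print(index, value)
```

Because the pattern is applied with `findall`, the parser also works if the rows arrive on a single line, as long as each entry keeps the same layout.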

In any case, if you think the README needs an extra explanation of the software's output, I will gladly add it.