serval-snt-uni-lu / dsco

Interesting Paper - #1

Open balajikalluri opened 8 years ago

balajikalluri commented 8 years ago

Hello SerVal team,

Your work is indeed interesting. I recently used SAX-VSM on my time-series dataset, and it was useful for classifying time series that contain recurring local events (motifs). However, it fails to reliably recognize series that lack distinct events (e.g. those that stay at a more or less steady level).

I was wondering whether DSCo could classify such series reliably?

On that note, I would like to apply your model to my dataset, but I don't see clear documentation or a reference on where to start (e.g. where and how to feed in my TS dataset).

Any quick help would be much appreciated.

Best, KMB

daoli commented 8 years ago

Hello,

Thanks for your interest. I'm not sure about the scenario you've mentioned, but I encourage you to try it out with DSCo.

I also acknowledge that the current documentation is very limited, but to use it, just look at the following scripts:

On the other hand, I'm working to further improve DSCo and more documentation will be available in the coming weeks.

Cheers, Daoyuan

balajikalluri commented 8 years ago

Hi Daoyuan,

I appreciate your earnest and kind response, my friend.

BTW, in connection with your N-gram language modeling for appliance electricity usage profiling, let me reframe my question.

Does your DSCo distinguish these two appliance states from input time-series energy signatures?

Also, I would like to know the nature of your input time-series dataset. Is it a matrix of rows and columns, with each row being a labelled energy signature? Please shed some light on it.

Cheers, and I look forward to your reply. KMB

balajikalluri commented 8 years ago

A couple of quick questions about your N-gram language modelling paper:

Thanks in anticipation.

Best, KMB

daoli commented 8 years ago

Hello,

In our appliance profiling paper, the readings span a specific period (e.g. 200 minutes), and we focus on readings that show at least some variation. As long as appliances are consuming energy (turned on), their electric readings vary over time, and we can then profile them. In this sense, there is no "steady phase" from our perspective.

In our MLDM paper, we showed with three different setups that the sliding window size does not have a big impact on the overall classification accuracy. But you're right that it is difficult to settle on a perfect alphabet size. We contacted the author of SAX and were told that this is indeed a problem. They usually go with a somewhat arbitrary alphabet size or use an optimization method. In our case, this parameter has not really been tackled yet. We're still working to find a good optimization method.
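For readers unfamiliar with the two parameters discussed above, here is a minimal sketch of a generic SAX discretization with a sliding window. It is not DSCo's code: the function names `sax_word` and `sax_words_sliding`, and the extra `n_segments` parameter, are assumptions made for illustration only.

```python
from statistics import NormalDist, mean, pstdev

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def sax_word(series, n_segments, alphabet_size):
    """Turn a numeric series into a SAX word (illustrative, hypothetical helper).

    Assumes len(series) >= n_segments and 2 <= alphabet_size <= 26.
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a perfectly flat series has zero variance, so map it to zeros.
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise Aggregate Approximation: average each of n_segments chunks.
    n = len(z)
    paa = [
        mean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]
    # Breakpoints cut the standard normal into equiprobable regions.
    breakpoints = [
        NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)
    ]
    # The letter index is the number of breakpoints lying below the value.
    return "".join(LETTERS[sum(v > b for b in breakpoints)] for v in paa)


def sax_words_sliding(series, window, n_segments, alphabet_size):
    """Apply sax_word to every window of length `window`, moving one step at a time."""
    return [
        sax_word(series[i:i + window], n_segments, alphabet_size)
        for i in range(len(series) - window + 1)
    ]
```

In this sketch, a perfectly flat window always maps to one repeated letter, so steady stretches carry little information once discretized, whatever the alphabet size.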

For the calculation of fitness scores, please refer to this page: http://norvig.com/ngrams/. Our algorithm is a modified version of the code provided by P. Norvig.
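As a rough illustration of n-gram scoring in general, the sketch below trains a character bigram model per class and scores a symbolic word by its summed log-probability. It is neither P. Norvig's code nor DSCo's modified version: the names `train_bigrams`, `log_score`, and `classify`, and the add-one smoothing, are assumptions chosen for brevity.

```python
import math
from collections import Counter


def train_bigrams(words):
    """Count character bigrams and their left-context characters (hypothetical helper)."""
    bigrams, contexts = Counter(), Counter()
    for word in words:
        for a, b in zip(word, word[1:]):
            bigrams[a + b] += 1
            contexts[a] += 1
    return bigrams, contexts


def log_score(word, model, alphabet_size):
    """Sum of log10 P(next | previous), with add-one smoothing for unseen bigrams."""
    bigrams, contexts = model
    total = 0.0
    for a, b in zip(word, word[1:]):
        prob = (bigrams[a + b] + 1) / (contexts[a] + alphabet_size)
        total += math.log10(prob)
    return total


def classify(word, models, alphabet_size):
    """Return the label whose bigram model gives the word the highest score."""
    return max(
        models,
        key=lambda label: log_score(word, models[label], alphabet_size),
    )


# Toy usage with made-up training words for two made-up classes.
models = {
    "A": train_bigrams(["abab", "abab"]),
    "B": train_bigrams(["bbbb", "bbbb"]),
}
predicted = classify("abab", models, alphabet_size=2)
```

Summing logarithms rather than multiplying raw probabilities avoids numeric underflow on long words, which is a common choice for this kind of scoring.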

Hope this helps.

Cheers, Daoyuan

daoli commented 8 years ago

Hi,

I've updated the repo with a simplified and more accurate version of DSCo. The documentation is also a bit more detailed. Please take a look and feel free to comment on the new version. Thanks!

https://github.com/serval-snt-uni-lu/dsco/tree/v2.0-ng

Cheers, Daoyuan