brandtg / stl-java

A Java implementation of STL
Apache License 2.0

Usage of example #10

Open dmpvost opened 8 years ago

dmpvost commented 8 years ago

Hi :)

Sorry, I would like to use the example. I tried to launch it, but I don't understand why it doesn't work. I tried to run the main method of StlPlotter in the tests, but nothing appears. Maybe I missed something; could you help me?

Thanks a lot, Vincent

hntd187 commented 8 years ago

Can you tell us exactly what you did?

dmpvost commented 8 years ago

Thank you for your fast answer! Okay, I'm using IntelliJ IDEA. Maven has found more or less all the dependencies.

I selected the file StlPlotter.java and tried to run it with IntelliJ. I added a simple System.out.print to see something.

  public static void main(String[] args) throws Exception {
    List<Double> times = new ArrayList<Double>();
    List<Double> series = new ArrayList<Double>();
    List<Double> trend = new ArrayList<Double>();
    List<Double> seasonal = new ArrayList<Double>();
    List<Double> remainder = new ArrayList<Double>();

    System.out.print("START\n");
     ........

And I simply get:

"C:\Program Files\Java\jdk1.8.0_92\bin\java" -Didea.launcher.port=7536 "-Didea.launcher.bin.path=C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 2016.1.1\bin" -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.8.0_92\jre\lib\charsets.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\deploy.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\access-bridge-64.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\cldrdata.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\jaccess.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\jfxrt.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\nashorn.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\sunec.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\sunpkcs11.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\ext\zipfs.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\javaws.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\jce.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\jfr.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\jfxswt.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\jsse.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\management-agent.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\plugin.jar;C:\Program Files\Java\jdk1.8.0_92\jre\lib\resources.jar;C:\Program 
Files\Java\jdk1.8.0_92\jre\lib\rt.jar;S:\stl-java-master\out\test\test;D:\Users\vosterma\.m2\repository\com\fasterxml\jackson\core\jackson-databind\2.5.1\jackson-databind-2.5.1.jar;S:\stl-java-master\out\production\main;D:\Users\vosterma\.m2\repository\org\apache\commons\commons-math3\3.2\commons-math3-3.2.jar;D:\Users\vosterma\.m2\repository\org\jfree\jfreechart\1.0.19\jfreechart-1.0.19.jar;D:\Users\vosterma\.m2\repository\junit\junit\4.10\junit-4.10.jar;D:\Users\vosterma\.m2\repository\org\testng\testng\6.8.7\testng-6.8.7.jar;D:\Users\vosterma\.m2\repository\joda-time\joda-time\2.8.1\joda-time-2.8.1.jar;D:\Users\vosterma\.m2\repository\org\jfree\jcommon\1.0.23\jcommon-1.0.23.jar;D:\Users\vosterma\.m2\repository\com\fasterxml\jackson\core\jackson-core\2.5.1\jackson-core-2.5.1.jar;C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 2016.1.1\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain com.github.brandtg.stl.StlPlotter
START

So it's running, but nothing more. What am I doing wrong?

hntd187 commented 8 years ago

That code does nothing. So it looks like it's running exactly like it's supposed to. You have to actually provide data and run the code. The main page documentation has a basic example of it, but also the tests should give a better example.

dmpvost commented 8 years ago

OK, so does the example provide the same graphical interface with "sample-timeseries.json"? Is it possible to run it with the current code on GitHub, or must I make some modifications?

hntd187 commented 8 years ago

Do you speak French instead? You have to edit the tests to use your data rather than "sample-timeseries.json", then run the tests. Your data must be in a format similar to sample-json.json. Or run it from the command line as here, not inside IntelliJ: https://github.com/brandtg/stl-java/issues/8 Sorry, I don't speak French well.

dmpvost commented 8 years ago

Yes, I speak French :) but I prefer to speak English here, thank you :).

For the moment I would just like to run the current example and play around with it.

So, I have another question on the subject. I'm trying to detect abnormal events. All my data is stored in Elasticsearch. For the analysis, I'm using Spark and the spark-timeseries library. I would like to see whether stl-java could be a solution for analysing my logs. Someone told me that I could maybe try STL and then use Tukey's method for the detection. My time series are activity logs from telecommunications, so I should find daily and weekly patterns. Currently I have two months of data, around 6000 points. Do you think it's a good idea?
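For reference, the "STL, then Tukey" idea can be sketched as follows: take the remainder component of the decomposition and flag the points that fall outside Tukey's fences, [Q1 - k*IQR, Q3 + k*IQR], with k = 1.5 as the usual choice. This is only a sketch under assumptions: the class and method names are hypothetical, the quartiles use simple linear interpolation (one of several conventions), and the remainder is assumed to be available as a plain, non-empty `double[]`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of stl-java.
public class TukeyFences {

    /** Quantile by linear interpolation between order statistics (input must be sorted, non-empty). */
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Returns the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]. */
    static List<Integer> outlierIndices(double[] values, double k) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double q1 = quantile(sorted, 0.25);
        double q3 = quantile(sorted, 0.75);
        double iqr = q3 - q1;
        double lower = q1 - k * iqr;
        double upper = q3 + k * iqr;

        List<Integer> outliers = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (values[i] < lower || values[i] > upper) {
                outliers.add(i);
            }
        }
        return outliers;
    }

    public static void main(String[] args) {
        // Made-up remainder values, for illustration only.
        double[] remainder = {1, 2, 3, 4, 5, 6, 7, 8, 100};
        System.out.println(outlierIndices(remainder, 1.5));
    }
}
```

The returned indices can be mapped back to the timestamps of the original series to see when the abnormal points occurred.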

hntd187 commented 8 years ago

My apologies, I'm trying to help as best I can; I thought my poor French might be clearer. You could use this for that purpose. @brandtg uses this in Spark jobs, but you would have to load the data into arrays as shown in order to run it through the algorithm. The problem with running it through Spark is that, since it's a time series, you can't easily parallelize it: the order of the data is important, and partitioning it might affect the outcome. Are you using Scala/Java? Or something else?

dmpvost commented 8 years ago

No problem, thanks for helping me! I use Java. I'm a student and this is my final project, so I'm new to Elasticsearch, big data, Spark, time series... but I'm going to do the work!

I'm trying this with Spark: https://github.com/sryza/spark-timeseries

I exchanged a few messages with sryza, who told me that I could maybe try that (STL and then Tukey).

So my idea to begin with is to simply create a time series from my data by counting all the hits I have per hour. I will have 24 values per day in my time series, over two months of data. Then I'll use STL and see what happens.
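The hourly counting step described here can be sketched in plain Java, independent of Spark. The sketch assumes each hit carries a timestamp in epoch milliseconds and that UTC hour boundaries are acceptable; the class and method names are hypothetical.

```java
import java.util.TreeMap;

// Hypothetical helper: buckets hit timestamps into hourly counts.
public class HourlyHitCounter {

    static final long HOUR_MS = 3_600_000L;

    /** Maps the start of each hour (epoch ms, UTC) to the number of hits in that hour. */
    static TreeMap<Long, Integer> countPerHour(long[] hitTimestampsMs) {
        TreeMap<Long, Integer> counts = new TreeMap<>();
        for (long ts : hitTimestampsMs) {
            // floorDiv keeps the bucketing correct for negative timestamps too.
            long hourStart = Math.floorDiv(ts, HOUR_MS) * HOUR_MS;
            counts.merge(hourStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up timestamps: two hits in the first hour, one in each of the next two.
        long[] hits = {0L, 1_000L, 3_600_000L, 7_200_001L};
        System.out.println(countPerHour(hits));
    }
}
```

Note that hours with no hits do not appear in the result at all, so the series produced this way can have gaps.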

If I'm lucky, I will get some results ^^

dmpvost commented 8 years ago

OK, the example works on my personal computer. I will try to prepare the data with spark-ts now.

hntd187 commented 8 years ago

Okay, if you can provide a sample of the data, I might be able to help you prepare it.

dmpvost commented 8 years ago

Good! Thanks, I'll come back when I have something.

dmpvost commented 8 years ago

OK, in the end I used Spark SQL for the moment to create the sample.

But I have a question: some points are missing from my series, because the query returns nothing for some dates. Does that matter for stl-java? Or does the library add the missing points? Maybe the algorithm knows the time sequence and fills in the missing points? If so, I'm close to finishing the sample. If not, I'm still working on it :)
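This thread does not confirm how stl-java treats gaps, so one conservative option is to make the series regular before decomposing it. For hit counts, a missing hour can reasonably be read as zero activity. A minimal sketch, assuming the sparse series is keyed by hour-aligned epoch milliseconds; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: turns a sparse hourly series into a regular one.
public class HourlyGapFiller {

    static final long HOUR_MS = 3_600_000L;

    /**
     * Returns one entry per hour between the first and last key of the input.
     * Hours absent from the input get the value 0.0.
     * Assumes every key is aligned to an hour boundary.
     */
    static TreeMap<Long, Double> fillMissingHours(Map<Long, Double> sparse) {
        TreeMap<Long, Double> sorted = new TreeMap<>(sparse);
        TreeMap<Long, Double> dense = new TreeMap<>();
        if (sorted.isEmpty()) {
            return dense;
        }
        for (long t = sorted.firstKey(); t <= sorted.lastKey(); t += HOUR_MS) {
            dense.put(t, sorted.getOrDefault(t, 0.0));
        }
        return dense;
    }

    public static void main(String[] args) {
        // Made-up series with the second hour missing.
        Map<Long, Double> sparse = new TreeMap<>();
        sparse.put(0L, 5.0);
        sparse.put(2 * HOUR_MS, 7.0);
        System.out.println(fillMissingHours(sparse));
    }
}
```

The keys and values of the dense map can then be copied, in order, into the time and value arrays passed to the decomposition.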

dmpvost commented 8 years ago

Okay, @hntd187, here is the sample :) It's not very nice, but in general my observations are going to look like this.

sample-timeseries.txt (just rename to .json)

To create the sample, I used Spark SQL. I tried to use spark-ts, but I think there is a problem in the library; an issue is open for Java. I will look into that later too. PS: the data are logs of phone activity over the network. That may help in understanding the time series.

Okay, first try, but I'm sure it's wrong:

Can we say that I maybe have two types of seasonality? One is the day within a week (excluding the weekend), so 24 hours? But generally the season is a week of 7 days.

Start by trying a day:

    final StlDecomposition stl = new StlDecomposition(7);
    final StlResult res = stl.decompose(ts, ys);

    final File output = new File("TEST-seasonal.png");
    final File daily = new File("TEST-stl-daily.png");

    StlPlotter.plot(res, "TRY DAY", Day.class, daily);
    StlPlotter.plot(res, output);
    StlPlotter.plot(res);

    final File exists = new File("TEST-stl-decomposition.png");

    StlPlotter.plot(res, "TRY");

[images: test-seasonal, test-stl-daily]

And a try with hours:

    final StlDecomposition stl = new StlDecomposition(24);
    final StlResult res = stl.decompose(ts, ys);

    final File output = new File("TEST-seasonal.png");
    final File hourly = new File("TEST-stl-hourly.png");

    StlPlotter.plot(res, "TRY HOURS", Hour.class, hourly);
    StlPlotter.plot(res, output);
    StlPlotter.plot(res);

    final File exists = new File("TEST-stl-decomposition.png");

    StlPlotter.plot(res, "TRY");

[images: test-seasonal, test-stl-hourly]

I'm going to look at it in more detail tomorrow.

dmpvost commented 8 years ago

I will provide a better time series tomorrow.

hntd187 commented 8 years ago

It's looking pretty good so far though!

dmpvost commented 8 years ago

I don't have any '0' values in my time series, and normally I should have a lot. I think I must fill them in.

dmpvost commented 8 years ago

Okay, I'm back!

OXOserieGlobal.txt: this is a good series.

And now... how do I configure this algorithm? How do I read the graphs?

To repeat my goal: detect abnormal events. The data represent phone usage per hour. But I have both a daily pattern (24 hours) and a weekly one, because usage is different during the weekend.

Problems generally happen during the night, we can say.

[images: hours-seasonal, hours-stl-daily]

So I think I can't do anything with that, can I? Maybe I must create an hourly series without the weekends?

If I use STL decomposition on days, it looks better, but it's going to be too late to detect something.

[images: days-stl-daily, days-seasonal]

Can you help me understand these graphs? Thanks.

dmpvost commented 8 years ago

OK! :)

I have found the pattern. My pattern is 168 hours, one week:

[image: hours-stld-168h-stl-hours]
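The figure of 168 is simply the weekly season expressed in hourly samples (7 × 24). A small sketch of that arithmetic using only `java.time`; the class and method names are hypothetical, and the idea that this count is what gets passed to the `StlDecomposition` constructor is taken from how it is called earlier in this thread.

```java
import java.time.Duration;

// Hypothetical helper: converts a season length into a number of samples.
public class SeasonalPeriod {

    /** Number of samples in one season, given the sampling step of the series. */
    static int samplesPerSeason(Duration season, Duration step) {
        long seasonMs = season.toMillis();
        long stepMs = step.toMillis();
        if (stepMs <= 0 || seasonMs % stepMs != 0) {
            throw new IllegalArgumentException("season must be a whole multiple of a positive step");
        }
        return Math.toIntExact(seasonMs / stepMs);
    }

    public static void main(String[] args) {
        Duration hourly = Duration.ofHours(1);
        System.out.println(samplesPerSeason(Duration.ofDays(1), hourly)); // daily season
        System.out.println(samplesPerSeason(Duration.ofDays(7), hourly)); // weekly season
    }
}
```

By the same arithmetic, a period of 7 applied to an hourly series would describe a 7-hour season rather than a week, which may explain why the earlier attempts looked wrong.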

And now... what can we say about the remainder?

I think it's too large. How can I improve that?