Open xieliaing opened 6 years ago
Hi @xieliaing,
Many thanks for opening the issue!
I completely agree with your remarks. To answer your question: yes, there are plans to add support for time series analysis in the near future. The framework can already classify time series through its Support Vector Machine, Hidden Markov Model and Hidden Conditional Random Field classes, but it is true that it may not yet be able to perform regression or more general analysis such as ARIMA modelling.
As such, may I ask exactly which methods you would be most interested in seeing the framework support in the future? If possible, please add some references for the models you think would be most useful in the real world, so we can prioritize what should be implemented first.
Best regards, Cesar
Also, if you would like to contribute a method yourself, please do not hesitate to send in a pull request!
Best regards, Cesar
Ooh interesting. I'll follow this thread closely. As @cesarsouza says, a little clarity on what time series analysis would particularly interest you would be good to know.
If it's of any interest, I was planning to add EWMA (exponentially weighted moving average) and exponentially weighted moving variance-covariance calculations to the framework at some point in the next couple of months. If you'd really like them, I can probably add them sooner (or you're very welcome to contribute if you'd prefer, as César also mentioned).
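For readers unfamiliar with the calculations mentioned above, here is a minimal sketch of an exponentially weighted moving average and variance. It is written in plain Python for brevity rather than in the framework's C#, and the function names `ewma` and `ewm_variance` and the smoothing parameter `alpha` are hypothetical illustrations, not part of Accord.NET. The variance uses the common incremental (biased) recursion; other weighting conventions exist.

```python
from typing import List, Sequence


def ewma(values: Sequence[float], alpha: float) -> List[float]:
    """Exponentially weighted moving average, seeded with the first value."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out: List[float] = []
    mean = None
    for x in values:
        # New mean is a blend of the new observation and the previous mean.
        mean = x if mean is None else alpha * x + (1.0 - alpha) * mean
        out.append(mean)
    return out


def ewm_variance(values: Sequence[float], alpha: float) -> List[float]:
    """Exponentially weighted moving variance (incremental, biased form)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out: List[float] = []
    mean = None
    var = 0.0
    for x in values:
        if mean is None:
            mean = x  # first observation: no spread yet
        else:
            diff = x - mean
            incr = alpha * diff
            mean += incr
            var = (1.0 - alpha) * (var + diff * incr)
        out.append(var)
    return out
```

A larger `alpha` makes both statistics react faster to recent observations; a smaller one smooths more heavily.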
@cesarsouza @AlexJCross Thanks for the response. Yes, time series analysis is a big topic, and we can choose something small and frequently used as a starting point.
I see these as two categories:
I would love to contribute, but I am not a professional programmer, so my code may not be as good as a professional's.
> I would love to contribute, but I am not a professional programmer, so my code may not be as good as a professional's.
Well most of my code is terrible but that doesn't stop me :-)
Seriously though, if you'd like to add something, please do! Even if it's just the guts of an algorithm, I'm sure it would be very useful to a lot of people. I don't have much knowledge on the categories you listed above (I'm no statistician); I will, however, add EWMA this week.
Thanks, Alex
I am not good at class design, but I can follow the design of statsmodels in Python. A good starting point is tsa/stattools.py, which I can translate into C#. Many building blocks are already available in Accord.Audio, which I still need to get familiar with first.
Hi @xieliaing,
That could be a good start! In fact, the statsmodels library is under the 3-clause BSD license, so it should be fine to base implementations on it. It would be possible to start with simple translations of the basic machinery needed to support hypothesis tests such as Ljung–Box, ADF and KPSS. Most of the methods used in those tests would likely be implemented as static methods in some static class anyway, so the design would not differ very much from the Python file you mentioned.
For the tests themselves, I can wrap them into classes later, following the style of the rest of the framework, and add them to a new TimeSeries namespace under Accord.Statistics.Testing.
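To illustrate the kind of "basic machinery" discussed here, the following is a minimal sketch of the sample autocorrelation function and the Ljung–Box Q statistic, $Q = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2 / (n-k)$. It is plain Python with hypothetical function names, is not taken from statsmodels or Accord.NET, and stops short of the p-value, which would require a chi-square distribution with `max_lag` degrees of freedom.

```python
from typing import List, Sequence


def autocorrelation(x: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(x)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(x)")
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    if denom == 0.0:
        raise ValueError("series has zero variance")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(x: Sequence[float], max_lag: int) -> float:
    """Ljung-Box Q statistic over lags 1..max_lag (no p-value computed)."""
    n = len(x)
    rho = autocorrelation(x, max_lag)
    # Each squared autocorrelation is weighted by 1 / (n - k).
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))
```

Both helpers are stateless functions, which maps naturally onto static methods in a static C# class as suggested above.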
Please don't worry about design at this stage! It would be better to start with working implementations first so we could write enough unit tests to be able to change the design at will later, without risking introducing bugs.
Best regards, Cesar
Sounds like a plan. What I can do now is provide C# implementations of these functions and test them. Later, these functions can be incorporated into a well-designed TimeSeries class.
@cesarsouza Are there any guidelines on pull requests and checking in code? Any code style?
@xieliaing Yes, the contributing guidelines are here. For code style, I would say it is preferable to stick to the default formatting provided by Visual Studio (i.e. the formatting applied when hitting Ctrl+E, D), since it is probably the most common format out there.
@cesarsouza Just to clarify, what I can do is
Is this process right?
Hi @xieliaing,
Almost! For step 4, please submit your pull requests against the development branch of the project instead of master.
Regards, Cesar
Well, actually, to tell the truth, I am not completely sure whether the other commands are entirely spot on. But do not worry: please do it in whichever way is easiest for you, and I can take care of merging the code afterwards.
In my own experience, the ideal workflow would be if you could:
But as I said before, please do it in the way that is easiest for you. I can take care of the merges and pull requests afterwards if need be.
Regards, Cesar
@cesarsouza Thanks for the detailed explanation. I will follow the above steps. Thanks for reminding me about the development branch.
Hi @xieliaing,
You've probably seen me doing it on this thread, but if you include the text "#884" or "GH-884" in any of your git commit messages, the commit will automatically get linked to this issue. You don't have to do that(!), but it definitely makes it clear in two years' time why a commit was made if it can be tied to an issue.
On the pull request thing, César said it best: do whatever is easiest for you in the first instance. Pull requests are not too bad once you get the hang of them. Octocat has a really nice tutorial and a repo (called Spoon-Knife) you can practice on, but if you have any questions, feel free to ask. https://help.github.com/articles/fork-a-repo/
Best, Alex
@AlexJCross You act fast, super! How do I do a CR on GitHub?
Hey @xieliaing,
My git terminology is not all that good I'm afraid. Is CR code review? I know PR as pull request.
Anyways, if it's one of those, once you create a branch in your forked repository (e.g. xieliaing-Samples-TimeSeries) and push some code to GitHub, you should see a button on the page for creating a pull request. Choose your branch to merge from and then Accord's development branch to merge to.
In terms of code review, GitHub will provide a breakdown of the files changed or added between the two branches once you send a PR, so your work can easily be reviewed that way.
As for the reviewing itself, I am happy to have a cursory look, but my knowledge of stats is not all that good, so I might need to defer to César for this.
Best, Alex
@AlexJCross I reviewed the tutorial you mentioned; it was very helpful. Basically, here is what I need to do:
Thanks @xieliaing, I've added a review to the pull request you had created!
@cesarsouza can you merge this thread to GH-884?
Time series analysis has a wide range of real-world applications and is used by virtually all businesses. Currently there is no formal support for time series analysis. Is there any plan for this?