JuliaQuant / MarketTechnicals.jl

Technical analysis of financial time series in Julia

Differences between MarketTechnicals.jl, Indicators.jl and TALib.jl / Differences between Timeseries.jl and Temporal.jl #93

Open femtotrader opened 7 years ago

femtotrader commented 7 years ago

Hello,

Some months ago I wrote a Julia wrapper for TA-Lib: TALib.jl

So it provides financial market technical analysis & indicators in Julia using the TA-Lib library.

I think it's better to have a pure Julia technical analysis package,

but I wonder what the differences are between MarketTechnicals.jl from @JuliaQuant (@iblislin @milktrader ...) and Indicators.jl from @dysonance.

Isn't there a way to merge our efforts to make Julia stronger in finance field?

(for technical analysis but also to build a better Julia backtester, paper trade, live trade application for stocks but also for cfd (with bracket orders, with trailing stop, ...)

Kind regards

iblislin commented 7 years ago

Hi @femtotrader. MarketTechnicals.jl is built on TimeSeries.jl, so you can leverage lots of handy functions from TimeSeries.jl. And... to be honest, MarketTechnicals is still in the development stage and lacks some popular indicators. Indicators.jl depends on another time series array implementation, IIRC.

Isn't there a way to merge our efforts to make Julia stronger in finance field?

I think the blocker is the time array implementation. (But I want to keep away from the green-sward war...so "there are three similar TA packages" is ok for me.)

(for technical analysis but also to build a better Julia backtester, paper trade, live trade application for stocks but also for cfd (with bracket orders, with trailing stop, ...)

Please check out the repos at https://github.com/JuliaQuant e.g. the backtester: https://github.com/JuliaQuant/TradingLogic.jl. Those repos need some effort to polish up. I'm quite interested in TradingLogic.jl as well. But my personal free time is limited, so I decided to spend most of my time maintaining this project. Maybe you can ask @milktrader for the maintainership of repos in JuliaQuant, and plan an elaborate blueprint. Once we have a fabulous blueprint/big picture for each project at JuliaQuant, I'm happy to contribute code. :)

femtotrader commented 7 years ago

Thanks @iblis17 for your answer

So there are (at least) 2 Julia time series array implementations.

(I was aware of this situation)

There is also JuliaTS on top of IndexedTables.jl from @JuliaComputing (pinging @shashi @JeffBezanson @ararslan ... )

What are the pros and cons of each implementation?

What features are required from a timeseries implementation (for backtesting but also in a live trading context)?

Is there some feature overlap between these implementations? What features are still missing?

I really think that answering these questions could lead us to collegially find a roadmap.

dysonance commented 7 years ago

@femtotrader Yes, there are indeed a few packages whose functionality and objectives definitely intersect, and I think a more organized development plan could go a long way.

TimeSeries and MarketTechnicals were already around when I started work on Temporal and Indicators. The main reason I decided to start new packages altogether was that I didn't really care for some of the existing design decisions and semantics. In the hopes of starting a deeper discussion about things like this if we want to start combining efforts & collaborating, I'll give a few examples of things I remember wanting to change.

@iblis17 Not trying to start a sword fight by any means, but these are important design decisions in my opinion that have a big impact on how one works with these objects and functions. Would be happy to debate any of the above points at greater length. But long story short, these are the things that matter to me when interacting with the language (particularly in the REPL), and I would love to collaborate if there's shared interest in adding them to the JuliaQuant work. But if I'm just being overly picky and few agree, then I can just continue developing these packages in parallel in case any other users would appreciate the extra optionality.

dysonance commented 7 years ago

Oh, last thing — would love to be a part of the systematic strategy backtesting work. I have some really interesting ideas about how Julia could be a wicked awesome strategy research language, which I have been planning on making my next big project. Happy to discuss that effort with anyone that wants to work together on building this functionality.

iblislin commented 7 years ago

  • I didn't like that the TimeArray type was immutable, as I often want to change column names or the date/time index of time series objects

There is a rename function in TimeSeries. I usually operate on the time index via lag, moving ... etc. I'm curious what special feature you need.

  • I wished it had indexing functionality similar to R's xts package (using strings to subset rows of data based on how the string is formatted), but TimeArray only allowed indexing rows numerically and used strings to index columns instead
    • This also made it considerably harder to treat these objects as Arrays with some extra features, which is what a time series really is at the end of the day

Since MarketTechnicals is fully dependent on the TimeArray type, my qualms above made me want something that implemented basic technical analysis algos on base* Julia types like Arrays, hence why the Indicators package

:-o I just browsed the source code of Indicators. I had mistakenly assumed that Indicators was built on top of Temporal.

I guess your main concern about building Indicators on Array is "I don't want to lose the power of Arrays, like plotting, matrix calculation... etc.", right? There are two solutions for your concern: (1) keep using the Array, or (2) build an elaborate time series type. And... at the current stage, solution (2) is unfortunately still not ready. But I still want to vote for solution (2). The potential of a new time series type will be greater than that of the Array; just share your nice ideas with the community (like the nice indexing features; I'm not an R programmer, I'm used to the indexing in Numpy and Pandas). Once the time series type overtakes Arrays in functionality, I think everyone will be happy to build everything on it.

femtotrader commented 7 years ago

One important feature for a timeseries implementation is to be able to resample them easily. That's the reason why I did TimeSeriesResampler.jl and TimeFrames.jl.
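For illustration, here is a minimal sketch of what resampling involves: flooring each timestamp to the start of its time frame and aggregating the prices into OHLC bars. It is written in Python for brevity and is not the TimeSeriesResampler.jl or TimeFrames.jl API; `resample_ohlc`, its arguments, and the sample ticks are hypothetical.

```python
from datetime import datetime, timedelta


def resample_ohlc(ticks, frame):
    """Group (timestamp, price) ticks into OHLC bars of width `frame`.

    `ticks` must be sorted by timestamp; `frame` is a timedelta.
    Returns a list of (bar_start, open, high, low, close) tuples.
    """
    epoch = datetime(1970, 1, 1)
    bars = []
    for ts, price in ticks:
        # Floor the timestamp to the start of its time frame.
        start = ts - (ts - epoch) % frame
        if bars and bars[-1][0] == start:
            # Same bar: keep the open, update high/low, replace the close.
            _, o, h, l, _ = bars[-1]
            bars[-1] = (start, o, max(h, price), min(l, price), price)
        else:
            # First tick of a new bar.
            bars.append((start, price, price, price, price))
    return bars


# Hypothetical sample data: four ticks spread over two minutes.
ticks = [
    (datetime(2017, 1, 2, 9, 30, 5), 10.0),
    (datetime(2017, 1, 2, 9, 30, 40), 10.5),
    (datetime(2017, 1, 2, 9, 31, 10), 10.2),
    (datetime(2017, 1, 2, 9, 31, 50), 9.9),
]
bars = resample_ohlc(ticks, timedelta(minutes=1))
```

With the sample above, `bars` holds one bar starting at 09:30 and one starting at 09:31.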

Another point is that, to be able to use a Julia time series implementation, we also have to think about streaming data: how prices (for example) could be stored in real time in a data structure (similar to a circular buffer), and how such a data structure could interact with a technical analysis library to avoid recalculating ALL values of an indicator (we just need to calculate/update the latest values).
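A rough sketch of that idea, assuming a simple moving average as the indicator: a fixed-size buffer holds the latest prices, and each new price updates a running total instead of recomputing the whole window. This is Python for illustration only; `RollingSMA` is a hypothetical name, not taken from TA-Lib or any package mentioned in this thread.

```python
from collections import deque


class RollingSMA:
    """Simple moving average updated one price at a time."""

    def __init__(self, period):
        self.period = period
        # deque(maxlen=...) behaves like a circular buffer:
        # appending to a full deque drops the oldest element.
        self.window = deque(maxlen=period)
        self.total = 0.0

    def update(self, price):
        """Add one price; return the SMA, or None while warming up."""
        if len(self.window) == self.period:
            # The oldest price is about to be evicted by append().
            self.total -= self.window[0]
        self.window.append(price)
        self.total += price
        if len(self.window) < self.period:
            return None
        return self.total / self.period


sma = RollingSMA(3)
values = [sma.update(p) for p in [1.0, 2.0, 3.0, 4.0, 5.0]]
# values == [None, None, 2.0, 3.0, 4.0]
```

Each update costs O(1) regardless of the window length, which is the point of updating only the latest value.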

I suggest to have a look at TA-Lib source function.

iblislin commented 7 years ago

One important feature for a timeseries implementation is to be able to resample them easily. That's the reason why I did TimeSeriesResampler.jl and TimeFrames.jl.

A question flashed into my mind. It seems there are 2 ways to add a new feature to a package:

  1. send a PR to the orig repo
  2. create a new repo

@femtotrader What are the main pros and cons of creating a new repo like TimeSeriesResampler instead of merging that code into the original project?

Making a single, feature-rich project is more common in the Python community, e.g. Pandas contains both time series (with rich features, resampling included) and the dataframe datatype in a single project. I can install it and start my work without a care. But it seems the Julia community has decided to split those features into different projects. I'm curious about the reason... Is this intended by the language designers?

femtotrader commented 7 years ago

I come from Python, so the Julia approach was also surprising to me. But I understand that with the Julia approach it's easier to reuse small "parts" without too many dependencies.

For example, I'm pretty sure TimeFrames.jl could be reused in several projects... but problems can occur when many different implementations of the same concept exist (date offsets, time series...).

Let's take an example. Some months ago I wrote DataReaders.jl. It's a Julia library to get remote data via Requests.jl (from Google Finance). What kind of data structures should such a library support? DataFrames, TimeArray, Temporal.TS?

If I want to support them all, I need to add a dependency for each library that looks like a table data structure!!! Should I create 3 more projects depending on DataReaders.jl?

Fortunately DataStreams.jl could help, but in that case Temporal should support it. Same for TimeSeries.jl. I'm not blaming @dysonance or @milktrader. It may be hard work.

But I just want to draw the landscape for you...

iblislin commented 7 years ago

Pros of creating a new repo: If the original author doesn't want your code, this is the only solution. You are free to create whatever you like!

I agree with this point.

New features could require new dependencies, and so these dependencies won't be required in original project.

I would propose just adding the new dependencies to the original repo if possible. (I'm not sure how an optional dependency can be implemented in Julia.) Before you showed me TimeSeriesResampler, I did not know the feature had already been done. I did the resampling manually, and it was tiring.

Let me show you the picture in my mind, e.g.: TimeFrames.jl is the reusable component, so creating a new repo is the right decision, I think. Then, implement the resampler in TimeSeries and add TimeFrames as an (optional) dependency. I want this feature to be taken care of by the maintainer of TimeSeries. I want the dependencies to be tested, integrated, and even adapted to future changes in TimeSeries. What will happen if we do not check the integration? I guess the resulting conflicts (like the dependency hell of JavaScript packages) would drive one crazy. I hope this strategy reduces the complexity for users (searching/installing) and still keeps some degree of reusability (we have TimeFrames.jl, ya!).

@femtotrader Quote from your original post:

Isn't there a way to merge our efforts to make Julia stronger in finance field?

After the previous deeper discussions, I think my answer is "I want a single, integrated (as many "parts" as possible), feature-rich killer project in backtesting/in blablabla". The first step in my mind is making TimeSeries perfect (add TimeSeries as a first-class citizen in DataStreams, add some indexing features ... etc).

dysonance commented 7 years ago

Haha wow, this discussion got really deep since I last opened it, glad to see you guys are equally interested in organizing our efforts around this! I'll try to hit on the major discussion points separately.

Repo/Project Organization

I was also generally quite surprised that so many efforts with similar objectives in the Julia community have appeared to splinter into separate packages. Coming from either R or Python, this can be somewhat overwhelming.

@femtotrader As for GitHub organizations, there is already JuliaQuant. Perhaps we should ask the owners of that org their thoughts on all this, and what it would take to get some of these various efforts put into that organization? But it's odd, right, because TimeSeries is owned by the JuliaStats GitHub organization. How do we consolidate all this? TimeSeries has broader stats purposes than just finance, but a key domain for it (especially in Julia IMHO) will definitely be finance. So... who should get custody of all this?

Maybe JuliaStats can keep TimeSeries, and JuliaQuant could take ownership of Temporal (but perhaps with a more descriptive name specifying its relevance to finance in particular)? Or something along these lines? Again, these are things we should really ask the owners of these orgs their thoughts on.

Indexing

Is there any barrier preventing this feature from being implemented in TimeSeries?

@iblis17 Yes, the key problem with getting the string-based date indexing functional in TimeSeries is that right now the objects use strings to index columns, and use integers/arrays to index rows. This is why I wrote Temporal to index rows with integers, arrays, dates, or strings, and to index columns with integers, arrays, or symbols. (It could also be written to index both rows and columns with strings, but I personally think making Symbols specifically for columns allows for far cleaner, understandable, and robust syntax.)
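To make the row-indexing idea concrete, here is a toy sketch of xts-style string indexing, where a partial date string selects every row whose date starts with it. It is Python for illustration and does not reproduce the actual Temporal or TimeSeries API; `TinySeries` and the sample data are hypothetical.

```python
from datetime import date


class TinySeries:
    """A toy time series: parallel lists of dates and values."""

    def __init__(self, dates, values):
        self.dates = list(dates)
        self.values = list(values)

    def __getitem__(self, key):
        if isinstance(key, str):
            # ISO-formatted dates look like "2017-01-31", so a prefix match
            # selects a whole year ("2017") or a whole month ("2017-01").
            idx = [i for i, d in enumerate(self.dates)
                   if d.isoformat().startswith(key)]
        elif isinstance(key, slice):
            # Ordinary positional slicing, as for a plain array.
            idx = list(range(len(self.dates)))[key]
        else:
            # A single integer position.
            idx = [key]
        # Always return a series, so the time index is never lost.
        return TinySeries([self.dates[i] for i in idx],
                          [self.values[i] for i in idx])


s = TinySeries(
    [date(2016, 12, 30), date(2017, 1, 3), date(2017, 1, 4), date(2017, 2, 1)],
    [1.0, 2.0, 3.0, 4.0],
)
january = s["2017-01"]   # rows dated January 2017
year = s["2017"]         # rows dated 2017
head = s[0:2]            # first two rows by position
```

Because every branch returns a `TinySeries`, there is no need to rebuild the time series object by hand after indexing.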

I'm also not sure if TimeSeries permits the use of indexing expressions like X[1:10,1:2]? In most of the examples in the docs that I've seen (please correct me if I'm wrong here), the usage is generally of the form X["Close"][1:10]. This syntax differs from that used for Arrays, which is mostly what I meant. Yes, you can get the values member and index that as normal, but then if you wanted the TS object you'd have to pop it back into one manually. That just strikes me as too much work for a simple indexing effort, no?

Data

Obviously streaming data would be a huge win in the future, but I think it's important to not put the cart before the horse so to speak. This I think would be something we would want to focus on implementing after Julia v1.0 is officially released and we can make more assumptions about syntax. Not that I expect many differences, but I think there's lower-hanging fruit to grab in the meantime that would be equally useful for the bulk of use-cases. That being said, we should ensure development proceeds with an eye towards this objective being achieved down the line.

With Temporal.jl I added the ability to get data from the APIs provided by Quandl, Yahoo Finance, and (more recently) Google Finance, because these data sources focus specifically on time series data, which allows certain assumptions about the data structure received from the API calls. Having a separate package for each of these sounds clunky in my opinion. But then, I added this functionality because, at least when I started, the Quandl package was made for TimeSeries.

I personally think it makes it a lot easier to get up and running having many of these features fully integrated into one time series package. What happens if the chief maintainers of some data fetching package stop maintaining it and/or your PRs to get new features incorporated take longer to get merged? These are just some concerns I'd have with outsourcing data acquisition capabilities to other projects. If we want time series data, let's just write some methods to get time series data.

Conclusion

Sorry for the stream-of-consciousness style dissertation this ended up becoming. Let's keep the conversation going :)

femtotrader commented 7 years ago

I wonder if @JuliaComputing has a roadmap to create a time series implementation on top of IndexedTables.jl (like JuliaTS.jl was for NDSparseData). Pinging @shashi @JeffBezanson. Why am I asking? Because if you have one stock with candlestick data, you have 2-dimensional data. But in fact you are often trading several stocks, so you have 3-dimensional data. Being able to apply technical indicators to such a data structure would be nice (Python's Pandas had Panel, and there is now xarray).

milktrader commented 7 years ago

I am happy to grant access to developers within this organization. I agree it's best to consolidate our efforts here.

Please advise what projects, packages or groups you're interested in.

😄

milktrader commented 7 years ago

@dysonance can you elaborate what features in xts you'd like to see in TimeSeries. I'm very familiar with xts and authored TimeSeries. I'm also open to a new time series type (I experimented with some here in the past) so your input there would be welcome.

As far as getting data from Quandl, FRED and Yahoo, all those have been implemented at some point. The Quandl.jl api is in need of some refreshing so feel free to jump in. I would be happy to give you access privileges.

femtotrader commented 7 years ago

On my side I'm ok to move DataReaders.jl and TALib.jl to @JuliaQuant

I'm still not sure what is the best organisation for TimeSeriesResampler.jl TimeFrames.jl and TimeSeriesIO.jl

Given the fact TimeSeries.jl is part of @JuliaStats I think it could be the right place for them.

I just want to still have access to these repositories

My first step is to invite @iblis17 and @milktrader to all these repositories, as they are involved in the development of projects belonging to @JuliaQuant and @JuliaStats. An invite has also been sent to @dysonance because of his great work on Temporal.jl and all.

milktrader commented 7 years ago

Hierarchy of Packages: this is over two years old, but there is a Roadmap for JuliaQuant that would be a good place to continue this discussion.

iblislin commented 7 years ago

@milktrader Could you tell more about this repo https://github.com/JuliaQuant/FinancialSeries.jl ?

milktrader commented 7 years ago

It was basically a specialized time series for financial data. I had some doubts about its performance and as you can tell have not developed it in some time.

IIRC, there was an important utility for it in downstream packages (in theory)

femtotrader commented 6 years ago

Just a comment about streaming data...

You might have a look at this Python project http://matthewrocklin.com/blog/work/2017/10/16/streaming-dataframes-1 https://github.com/mrocklin/streamz

We should have something like this for Julia (with OHLC resampling)

femtotrader commented 5 years ago

Maintaining and adding new features (like https://github.com/dysonance/Temporal.jl/pull/30 and https://github.com/JuliaStats/TimeSeries.jl/pull/390) to both Timeseries.jl and Temporal.jl is a waste of time

I think we should really see how we could merge Temporal and Timeseries.

To achieve this, we probably need to write a wiki table with the features of each library (i.e. avoid long text paragraphs) to concentrate our efforts on highlighting the differences, strengths and weaknesses of each lib.

I personally think that something is missing in both libraries: N-dimensional data.

For example, OHLC prices of several stocks are 3-dimensional data.

We can even say that OHLC prices can be 4-dimensional data.

Something like the Python library xarray.
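As a small illustration of the three dimensions (symbol, time, field), the sketch below stores OHLC bars per symbol and applies one function along the time axis of a chosen field for every symbol. It uses plain Python containers, not xarray or any Julia package; the symbols `AAA` and `BBB`, the prices, and the helper names are all hypothetical.

```python
# cube[symbol] is a list of bars over time; each bar maps a field to a price.
cube = {
    "AAA": [
        {"open": 10.0, "high": 10.5, "low": 9.5, "close": 10.0},
        {"open": 10.0, "high": 13.0, "low": 10.0, "close": 12.5},
    ],
    "BBB": [
        {"open": 21.0, "high": 21.0, "low": 19.0, "close": 20.0},
        {"open": 20.0, "high": 20.0, "low": 9.0, "close": 10.0},
    ],
}


def apply_along_time(cube, field, func):
    """Apply `func` to the time series of `field` for every symbol."""
    return {symbol: func([bar[field] for bar in bars])
            for symbol, bars in cube.items()}


def simple_returns(series):
    """Period-over-period returns of a price series."""
    return [b / a - 1.0 for a, b in zip(series, series[1:])]


close_returns = apply_along_time(cube, "close", simple_returns)
# close_returns == {"AAA": [0.25], "BBB": [-0.5]}
```

A labelled N-dimensional array type would let the same operation be written without the nested containers.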

I think we should add to this wiki features that are not present in either library but that we would like to see implemented.

Arkoniak commented 3 years ago

Sorry for reviving this thread, but I've stumbled on this issue and I think that it can be solved in a rather simple fashion: let everybody use the time series format they prefer, and just add some conversion utilities. It is the same approach that is used in Tables.jl. Actually, the only thing needed compared to Tables.jl is support for a time axis. I've made a small demo where I present such an interface and make (through type piracy) all the types support it.

You can find it here: https://github.com/Arkoniak/ProtoMarketData.jl

With this approach, users can switch from one format to another and we can have as many indicator/processing/whatever packages we want.
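The general shape of such an interface can be sketched as follows. This is a Python illustration of the idea only, not the ProtoMarketData.jl or Tables.jl API: the `TimeTable` protocol, both container classes, and `momentum` are hypothetical names.

```python
from typing import Dict, List, Protocol, Sequence


class TimeTable(Protocol):
    """Hypothetical minimal interface: a time axis plus named columns."""

    def timestamps(self) -> Sequence: ...

    def columns(self) -> Dict[str, Sequence[float]]: ...


class ColumnStore:
    """One concrete format: column-oriented storage."""

    def __init__(self, index, data):
        self._index = list(index)
        self._data = dict(data)

    def timestamps(self):
        return self._index

    def columns(self):
        return self._data


class RowStore:
    """Another format: a list of (timestamp, {name: value}) rows."""

    def __init__(self, rows):
        self._rows = list(rows)

    def timestamps(self):
        return [ts for ts, _ in self._rows]

    def columns(self):
        names = list(self._rows[0][1]) if self._rows else []
        return {n: [row[n] for _, row in self._rows] for n in names}


def to_column_store(table: TimeTable) -> ColumnStore:
    """Conversion utility: works for any type that satisfies the interface."""
    return ColumnStore(table.timestamps(), table.columns())


def momentum(table: TimeTable, column: str, lag: int) -> List[float]:
    """An indicator written once against the interface, not a concrete type."""
    x = table.columns()[column]
    return [x[i] - x[i - lag] for i in range(lag, len(x))]


rows = RowStore([(1, {"close": 10.0}), (2, {"close": 12.0}), (3, {"close": 11.0})])
converted = to_column_store(rows)
```

Here `momentum(rows, "close", 1)` and `momentum(converted, "close", 1)` give the same result, so the indicator code does not care which container the user picked.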

Sorry for directing, but @femtotrader @dysonance what would you say?

femtotrader commented 3 years ago

Sorry for answering so late... I was quite busy

You should also have a look at https://github.com/dysonance/Temporal.jl/issues/48#issuecomment-865270412

Maybe a first start could be to define what features are requested for a very good Julia timeseries library.

The concept of timeframe is important to provide a nice API for resampling timeseries (a very old and limited implementation can be found at https://github.com/femtotrader/TimeFrames.jl).

Supporting statistics based on online algorithms (https://joshday.github.io/OnlineStats.jl/latest/) and streaming features (https://github.com/dysonance/Temporal.jl/issues/1) could be a great feature to have. Do we have, for example, implementations of technical indicators with online algorithms in the Julia ecosystem? https://en.wikipedia.org/wiki/Online_algorithm
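As an example of what an online indicator looks like, the exponential moving average needs only its previous value to process a new price. The sketch below is Python for illustration, not the OnlineStats.jl API; `OnlineEMA` is a hypothetical name, and seeding with the first observation is an assumption (some libraries seed with a simple moving average instead).

```python
class OnlineEMA:
    """Exponential moving average updated one observation at a time."""

    def __init__(self, period):
        # Common smoothing factor for an EMA over `period` observations.
        self.alpha = 2.0 / (period + 1)
        self.value = None

    def update(self, price):
        if self.value is None:
            # Assumption: seed the average with the first observation.
            self.value = price
        else:
            # Move the current value a fraction alpha towards the new price.
            self.value += self.alpha * (price - self.value)
        return self.value


ema = OnlineEMA(3)  # alpha = 0.5
history = [ema.update(p) for p in [10.0, 12.0, 11.0, 13.0]]
# history == [10.0, 11.0, 11.0, 12.0]
```

The state is a single number, so memory use stays constant no matter how long the stream runs.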

A last comment is that timeseries are not only tables... they are hypercubes... See https://en.wikipedia.org/wiki/OLAP_cube

kpa28-git commented 2 years ago

I like the idea of writing code generically and allowing easy use of whatever type people want to use to represent their time series.

I like the approach Indicators.jl takes of writing core functionality in Array{T} and other native types. Although with what @Arkoniak says this isn't even necessary, I think it's better to take this approach to prevent bloat, too many conversions, and unwanted dependencies (for example if I'm using Indicators with DataFrame instead of TimeArray). Currently Indicators depends on Temporal, but I think it would be better to add a separate package for the port (TemporalIndicators.jl or something).

Implementing core functionality in native types is a good default approach IMO.

As far as resampling and TS specific operations, I've been doing everything with DataFrame and haven't had trouble implementing downsampling, downaggregation, shift, and other TS operations. I don't see the need for a separate time series aware package when you can use DateTime and ZonedDateTime as indexes/arrays in most tabular packages, but I think the user should have the option to use whatever types they want without sacrificing functionality.

smishr commented 1 year ago

I want to apply methods from MarketTechnicals.jl (and Indicators.jl) on TSFrame objects from TSFrames.jl.

I was wondering if as of early 2023 the TSFrame object, which inherits all the benefits from DataFrames.jl with many common time series specific operations built-in, is the best data structure in Julia to build technical analysis & indicators on? Or is there so much bifurcation in the Julia finance community that everyone prefers using different structures (TimeSeries, Temporal etc)? I feel that for those who were using the base DataFrame object for time series operations, TSFrame is a natural upgrade.

kpa28-git commented 1 year ago

I think TS-aware tables are fine, but to me they just add a bit of syntactic sugar. I don't see much added benefit to using them. A multivariate time series is a table where one of the columns, the index column, is a timestamp. You can just treat a time index like any other ordinal index in a table and do all the same operations without anything specific for time series.

I realized that there are basically two reasons I want a table for time series stuff, as opposed to for example an SOA-like thing:

  1. To perform a small number of operations specifically using timestamps (shifting, filtering, groupby, and maybe some others) while keeping the indexing intact without extra offsetting or related annoyances. In general, shifting and filtering only need to be done early in the pipeline.
  2. Column flexibility: ease of adding/removing columns, columns of different types without a union or common abstract eltype, symbol names for columns is nice

DataFrames gives #2 automatically. For #1, I made this small package for timestamp index methods: DateTimeDataFrames.jl. It doesn't use a special table type; it just requires you to specify either the name or integer location of the timestamp index (by default, it's called :datetime). It's very minimal, with only about 400 loc of very simple methods. However, it does everything I need in that area. It's very easy to look at the code and see what the methods do; they are just small wrappers for DataFrame methods.

For doing things further downstream in the analysis pipeline (further away from TS boilerplate), I write methods that work with AbstractArray. I find it the most expressive way to work, it still allows you to use table types to preserve time indices, and it tends to be the most flexible.

I recognize the potential value of having timeseries aware table types. I don't think DataFrame is a good choice for the core data though. The main advantage I see of DataFrames, aside from its maturity, is its flexibility. If you want type-stability or more powerful dispatching, there are other table packages for that.

I think it would be cool to be able to specify the type(s) of columns and be able to write implementations of things for specific column configurations. Using Julia's type system could allow for nice code reuse along with stricter encoding of what your data is. I played around with StructArrays.jl to investigate this. You could define your row type as a struct and then dispatch on the subtype of StructArray you want for your implementation. This or some other type-aware table would be cool. I haven't looked into it enough, maybe when I get more time I will

femtotrader commented 9 months ago

I wonder whether an incremental implementation of technical indicators similar to https://github.com/nardew/talipp could be interesting for Julia. This kind of approach reminds me of things such as https://github.com/joshday/OnlineStats.jl. Any opinion?

PS : see https://discourse.julialang.org/t/incremental-technical-analysis-indicators/107844