mihakralj / QuanTAlib

C# TA library for real-time financial analysis, offering ~100 indicators. Available on NuGet, Quantower compatible. Ensures early validity of calculated data; calculation accuracy is tested against four TA libraries.
https://mihakralj.github.io/QuanTAlib/
Apache License 2.0
44 stars, 11 forks

New wheel? #1

Closed DaveSkender closed 2 years ago

DaveSkender commented 2 years ago

Hey @mihakralj, I see you’ve forked my skender.stock.indicators library recently, but have started on a path for a new library.

Any feedback on how we can do better to meet your needs? Or is your library intended to be uniquely for Quantower?

We’re always looking for contributors too, if you’re interested in joining us.

mihakralj commented 2 years ago

@DaveSkender, I am flattered you noticed my feeble proto-library within 10 hours of me publishing it! I feel like a kindergarten kiddo tapped on the shoulder by a PhD scholar because I peeked into their thesis paper! :-)

To answer some of your queries:

There is one design improvement that I am still tinkering with and that could be a potential collaboration/contribution: how to abstract C# indicators so the usage syntax would mimic PineScript code as closely as possible. There are so many useful and relevant strategies written in PineScript - and converting that code into C# is not elegant at all...

DaveSkender commented 2 years ago
  • the core calculation classes are all made with a fundamental design goal to allow updates of the current value; indicator classes should be able to ingest a new OHLCV of a currently-forming candle, and modify the last calculated value in a list without triggering a full recalculation.

I'm definitely looking to do this as well in v2 of our library (see roadmap). It's a solid plan and good for streaming use cases. You'll find, though, that this is easy to do for some indicators and really hard for others, given how some rely on more than one prior value.
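To make the "update the forming candle without a full recalculation" idea concrete, here is a minimal sketch of a simple moving average that keeps its own running state. It is not code from either library: `StreamingSma`, its `Add` method and the `isNewBar` flag are hypothetical names chosen for illustration, and early values are averaged over however many bars are available, which is an assumption in line with the "no hidden warm-up" goal discussed in this thread.

```csharp
using System;
using System.Collections.Generic;

var sma = new StreamingSma(period: 3);
sma.Add(1, isNewBar: true);   // 1
sma.Add(2, isNewBar: true);   // 1.5 = (1+2)/2
sma.Add(3, isNewBar: true);   // 2   = (1+2+3)/3
sma.Add(6, isNewBar: false);  // forming bar revised from 3 to 6: 3 = (1+2+6)/3
sma.Add(4, isNewBar: true);   // bar closes at 6, new bar starts at 4: 4 = (2+6+4)/3
Console.WriteLine(sma.Value);

// Hypothetical sketch, not QuanTAlib or Stock.Indicators code.
public sealed class StreamingSma
{
    private readonly int _period;
    private readonly Queue<double> _closed = new Queue<double>(); // closed bars only
    private double _closedSum;   // running sum of the values in _closed
    private double _forming;     // value of the bar that is still forming
    private bool _hasForming;

    public StreamingSma(int period)
    {
        if (period < 1) throw new ArgumentOutOfRangeException(nameof(period));
        _period = period;
    }

    public double Value { get; private set; } = double.NaN;

    // isNewBar = true : the forming bar is closed and a new bar starts.
    // isNewBar = false: the forming bar is overwritten with a revised value.
    public double Add(double value, bool isNewBar)
    {
        if (isNewBar && _hasForming)
        {
            _closed.Enqueue(_forming);
            _closedSum += _forming;
            // Keep at most period-1 closed bars; the forming bar is the last slot.
            if (_closed.Count > _period - 1) _closedSum -= _closed.Dequeue();
        }

        _forming = value;
        _hasForming = true;
        Value = (_closedSum + _forming) / (_closed.Count + 1);
        return Value;
    }
}
```

Each call does a constant amount of work because only the running sum and the forming value change. Indicators that depend on several prior outputs would need to snapshot more state before a revision, which is the harder case mentioned above.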

  • for a similar reason, I am not hiding initial values with NaNs - I am calculating them correctly, even if the total length of an array is shorter than the minimum recommended period.

Yah, I'm not hiding these either. Some people are okay with keeping them in. I'm using null instead of NaN, though, and only for incalculable values (not for hiding warmup periods). We're also not enforcing minimum recommended periods anymore (we used to). Are you using NaN for compatibility reasons, with Quantower?

  • My real underpinned desire is to build some more advanced indicators, once I have the 'factory' established. See an example of that in HEMA (Hull-EMA hybrid indicator) and in my JMA contribution to Pandas-TA

Very cool stuff. I'm working on two features that will make this easier for users: 1) make it easier to do indicator of indicators (you can do it now, but it can be easier), and 2) show people, with examples, how to make custom indicators with the library, to extend it.

  • There is one design improvement that I am still tinkering with and that could be a potential collaboration/contribution: how to abstract C# indicators so the usage syntax would mimic PineScript code as closely as possible. There are so many useful and relevant strategies written in PineScript - and converting that code into C# is not elegant at all...

I've seen a lot of attempts at C# libraries where there's a tendency to code in the style of another language. Mostly this is because the author's trying to copy code from a non-C# source. The challenge here is to maintain some of the benefits of C# and not to force it to be something it is not.

mihakralj commented 2 years ago
DaveSkender commented 2 years ago
  • I am using NaN because I used to code in Python and got used to NaN there... NaN is also 'safer' to use than null in the current minefield of the nullable/non-nullable C# ecosystem (Quantower is written with .NET 4.8 and C# 7, which don't support nullable reference types, but my core DLL is also compiled with .NET 6.0 and C# 10...)

Nullable is a minefield for sure. Though handling NaN seems weird too. I always viewed NaN as a very narrow use case for div/0 handling, like you see in Excel, but can see some utility here, as a workaround to avoid the whole nullable issue. I might consider it!
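For readers weighing the two options, the following small sketch contrasts how a `double.NaN` marker and a null `double?` behave in plain C#. It uses only built-in language and `System` features; the variable names are illustrative.

```csharp
using System;

double nan = double.NaN;     // "no value" marker stored in an ordinary double
double? missing = null;      // "no value" expressed through Nullable<double>

// NaN propagates silently through arithmetic and never compares equal.
double a = nan + 1.0;            // still NaN
bool equal = a == nan;           // false, so equality cannot detect NaN
bool isNan = double.IsNaN(a);    // true, the reliable check

// null also propagates through lifted operators, but the type stays double?,
// so every consumer must unwrap it or supply a fallback.
double? b = missing + 1.0;       // null
bool hasValue = b.HasValue;      // false
double fallback = b ?? 0.0;      // 0

Console.WriteLine($"{isNan} {equal} {hasValue} {fallback}");
```

The trade-off is roughly this: NaN keeps series typed as plain `double` and flows through math unchanged, while `double?` makes missing values explicit in the type at the cost of null handling at every use.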

  • using benefits of C# to its fullest and not force it to be something different... Yeah, I hear you! (a friendly jab: all your indicators are crowded in a single static class Indicator - are you sure that's the best design choice in C#??? )

I hear you on this one. It’s a tradeoff I made early on for end-use usability that mimicked Math and other Microsoft libraries. I view Indicator as the class of object and calculations as just flavors of it. It’s not been a problem so far, but I plan to take another look at it in v2, as a different approach may be needed. I can see from your code that you’re quite good at class design. I can learn a few things from you, for sure.

mihakralj commented 2 years ago

In my (amateur-ish) view, you won't be able to implement real-time (update-able) indicators if you stick with static classes. Each (real-time) indicator needs its own mini-persistence to store buffers, past values and other bits of the indicator's 'internal state', so we don't need to recalculate on each trigger. Unless I am missing something, one static class with a bunch of methods cannot maintain persistence (for each instance of each used indicator) across multiple calls.

I solved this by:

Here is an example of how my approach works; the foreach is a simulation of real-time data streaming.

TSeries data = new();
SMA_Series sma9 = new(source: data, period: 9, useNaN: false);
SMA_Series sma25 = new(source: data, period: 25, useNaN: false);

foreach (var i in history) {
    data.Add(i.DateTime, i.Close);
    sma9.Add();
    sma25.Add();
}
DaveSkender commented 2 years ago

Yah, that's exactly what I was thinking too, for my v2. There's a persistence problem to solve. Your approach for managing it is solid and quite a good use of C# class design. It will work well with multi-threading too, if that's ever a use case for you.

I think a static class can maintain persistence; but it really depends on how it's consumed. It's probably not the best approach and might be harder for end users to use. For static classes, it would look like:

var sma9 = quotes.GetSma(9);

// get new quote, add it
sma9.Add(quote);

Something like that.

mihakralj commented 2 years ago

There is another half-baked magic I am experimenting with: to make each indicator class an event publisher AND an event subscriber.

So, signals on a new data intake would flow from data source to indicators and all Add() statements would be obsolete.

Will ping you once I have a prototype with events. And once my JMA C# implementation is done, I will adapt it for your structure and issue a pull request into your library.
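The publisher/subscriber idea described above could look roughly like the sketch below, using ordinary C# events. Everything here is hypothetical and is not the QuanTAlib prototype: `IValueSource`, `PriceSource` and `EmaNode` are invented names, and seeding the EMA with its first input is an assumption made to keep the example short.

```csharp
#nullable enable
using System;

var data = new PriceSource();
var ema3 = new EmaNode(data, period: 3);
var emaOfEma = new EmaNode(ema3, period: 3);   // an indicator of an indicator

foreach (double close in new[] { 10.0, 11.0, 12.0 })
    data.Publish(close);                       // no explicit Add() on the indicators

Console.WriteLine($"{ema3.Value} {emaOfEma.Value}");

// Anything that can announce a new value (hypothetical interface).
public interface IValueSource
{
    event Action<double>? Updated;
}

public sealed class PriceSource : IValueSource
{
    public event Action<double>? Updated;
    public void Publish(double value) => Updated?.Invoke(value);
}

// Subscribes to an upstream source and republishes its own result.
public sealed class EmaNode : IValueSource
{
    private readonly double _alpha;
    private bool _seeded;

    public event Action<double>? Updated;
    public double Value { get; private set; } = double.NaN;

    public EmaNode(IValueSource source, int period)
    {
        _alpha = 2.0 / (period + 1);
        source.Updated += OnSourceUpdated;     // subscriber side
    }

    private void OnSourceUpdated(double input)
    {
        // Assumption: the first input seeds the average.
        Value = _seeded ? Value + _alpha * (input - Value) : input;
        _seeded = true;
        Updated?.Invoke(Value);                // publisher side
    }
}
```

Because every node is both a subscriber and a publisher, chaining indicators only requires passing one node to another's constructor, and new data flows through the whole chain from a single `Publish` call.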

mihakralj commented 2 years ago

@DaveSkender I completed the (reversible) JMA indicator - and I am very pleased with results (see pic below). JMA is super-low-lag (like ZLEMA) but also suuuuper smooth with minimal overshooting.

If you use .NET interactive, and build the QuantLibrary.DLL (using QuantLibrary.csproj in /Algos), you can use Tests/quantlib.dib and see it work.

If you like what you see in Jurik Moving Average (JMA), I can retrofit it into your framework and initiate a pull request. Let me know.

[Image: chart of the JMA indicator results]

DaveSkender commented 2 years ago

I’ll take a look when I have more time. Looks nice. Similar to ALMA.

Just Googling a bit, I see that JMA is out there as re-creations from the original. Do you have an original published recipe from Jurik? I tend to avoid adding community formulas since I can’t trace the provenance back to a reputable source. Also, adding unlicensed proprietary things could make it legally difficult for professional orgs to use the library. If the original author published it online or in a book/magazine, it’s fair game.

mihakralj commented 2 years ago

JMA is closer to Wilder's Parabolic MA and ZLEMA than to ALMA; you can see some comparison analysis of JMA here.

Jurik Research never officially disclosed the JMA algo, and they stopped developing/selling the proprietary indicator DLL in 2021 (see here). The reverse-engineered JMA indicator that is 98% correct (compared to official JMA tests) was published in several places over the years; the most complete paper (the one I followed) is here. But even this paper doesn't explain Mark Jurik's volatility bands well enough, so I also followed another algo lifted from Jurik's implementation on the MQL4 platform here.

The JMA algo has nothing super-proprietary in it: three smoothing stages, the first being an adaptive EMA, the second reducing the lag, and the third compensating for the overshooting. The approach to volatility in the adaptive EMA is a bit unique (and requires 65 bars for a warm-up period), and so is the usage of a 'phase' parameter (how elastic/rigid the curve is).

JMA (made by Mark Jurik) and HMA (made by Alan Hull) are very comparable, with a minor difference that Alan's algo for HMA is super-simple (12 lines of core code) and published by Alan, while Mark's algo is rather complicated (90+ lines of core code) and Mark never bothered to publish/explain his approach.
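To show how compact the published HMA recipe is, here is a batch sketch of the commonly cited formula HMA(n) = WMA(2·WMA(n/2) − WMA(n), √n). It is an illustration only: `HullSketch` is a hypothetical class, the rounding of n/2 and √n varies between implementations, and shortening the window during warm-up is an assumption. No JMA code is attempted here, since its algorithm was never officially published.

```csharp
using System;

double[] ramp = new double[30];
for (int i = 0; i < ramp.Length; i++) ramp[i] = i;
double[] hma = HullSketch.Hma(ramp, period: 16);
Console.WriteLine(hma[29]);

public static class HullSketch
{
    // Linearly weighted moving average; uses a shorter window during warm-up.
    public static double[] Wma(double[] src, int period)
    {
        var result = new double[src.Length];
        for (int i = 0; i < src.Length; i++)
        {
            int n = Math.Min(period, i + 1);
            double weighted = 0, weights = 0;
            for (int k = 0; k < n; k++)
            {
                double w = n - k;              // newest value gets the largest weight
                weighted += w * src[i - k];
                weights += w;
            }
            result[i] = weighted / weights;
        }
        return result;
    }

    // HMA(n) = WMA(2 * WMA(n/2) - WMA(n), sqrt(n))
    public static double[] Hma(double[] src, int period)
    {
        int half = Math.Max(1, period / 2);
        int root = Math.Max(1, (int)Math.Round(Math.Sqrt(period)));
        double[] fast = Wma(src, half);
        double[] slow = Wma(src, period);

        var diff = new double[src.Length];
        for (int i = 0; i < src.Length; i++) diff[i] = 2 * fast[i] - slow[i];
        return Wma(diff, root);
    }
}
```

The subtraction step over-corrects the lag of the slow average, and the final short WMA smooths the result again, which is where HMA's low-lag behaviour comes from.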

As nobody is monetizing JMA anymore, I find it fair game to preserve this rather nifty algo - especially as it is as-good-as (and sometimes superior to) HMA, ZLEMA, ALMA and other modern approaches.

DaveSkender commented 2 years ago

Sounds like Jurik retired! He should have published a book of formulas on his way out to get some residual value out of them. Lost opportunity for him 🙁

mihakralj commented 2 years ago

@DaveSkender I packaged and published the proto-version of QuantLib on NuGet. I did that just so you (and others) can play with a .NET interactive notebook that includes MA comparison tests.

Let me know if this .dib notebook works for you (VS Code should have no problem running it) and what your opinion is. For me, the 6th plot (Chirp) is particularly good at exposing good vs. terrible indicators (blue=input, green=JMA, red=EMA). You can play with different indicators; see the comments in the second cell of the notebook.

mihakralj commented 2 years ago

@DaveSkender I'd like your opinion on the updated class model. Here is an example of how MACD could be constructed; data signaling happens through events and a pub-sub model. Below is a functional construct of MACD with the new class model:

var history = await Yahoo.GetHistoricalAsync("TQQQ", DateTime.Today.AddDays(-230), DateTime.Now, Period.Daily);
TSeries data = new();
EMA_Series slow = new(data,26);
EMA_Series fast = new(data,12);
SUB_Series macd = new(fast,slow);
EMA_Series signal = new(macd,9);
foreach (var i in history) data.Add((i.DateTime,  (double)i.Close));
DaveSkender commented 2 years ago

MACD is a good use case for your design since it produces several elements (though you're missing the histogram part). Functionally, from a programming standpoint, this can work well for incremental calculations since you're really just using Add thereafter, which is nice.

I think the challenge here will be for the user. You may be asking them to do too much if they'd need to do all six of those steps in proper sequence every time they want to do a different flavored MACD. It would be more usable if you can get the Series parts into a single interface.

var history = [..];
TSeries macdBase = InitMacd(history,26,12,9);

Then, just increment new stuff with macdBase.Add(..). Though, I'm not sure how you'd get the results out of that. It's a bit of a confusing problem. I've not entirely thought it through yet. See DaveSkender/Stock.Indicators#216.

mihakralj commented 2 years ago

You ALMOST got it. Instead of class TSeries, MACD should be packaged into its own class that returns three TSeries: macd, signal and diff (histogram). And when a user looks inside the MACD_Series class, it is rather clean and logical, sequencing EMAs and SUBs one after another...
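A rough sketch of that packaging idea follows: one class that wires the EMAs and the subtraction together internally and exposes three output series. This is not the actual MACD_Series implementation; `MacdSketch` and `Ema` are hypothetical, plain `List<double>` stands in for TSeries, and each EMA is seeded with its first input as a simplifying assumption.

```csharp
using System;
using System.Collections.Generic;

var macd = new MacdSketch(fast: 12, slow: 26, signal: 9);
foreach (double close in new[] { 100.0, 101.5, 99.8, 102.2 })
    macd.Add(close);                 // one call updates all three outputs
Console.WriteLine(macd.Histogram.Count);

// Minimal incremental EMA (hypothetical helper).
public sealed class Ema
{
    private readonly double _alpha;
    private bool _seeded;
    public double Value { get; private set; }

    public Ema(int period) => _alpha = 2.0 / (period + 1);

    public double Add(double input)
    {
        // Assumption: the first input seeds the average.
        Value = _seeded ? Value + _alpha * (input - Value) : input;
        _seeded = true;
        return Value;
    }
}

// Sequences fast EMA, slow EMA, subtraction and signal EMA in one place.
public sealed class MacdSketch
{
    private readonly Ema _fast, _slow, _signal;

    public List<double> Macd { get; } = new List<double>();
    public List<double> Signal { get; } = new List<double>();
    public List<double> Histogram { get; } = new List<double>();

    public MacdSketch(int fast, int slow, int signal)
    {
        _fast = new Ema(fast);
        _slow = new Ema(slow);
        _signal = new Ema(signal);
    }

    public void Add(double close)
    {
        double macd = _fast.Add(close) - _slow.Add(close);
        double signal = _signal.Add(macd);
        Macd.Add(macd);
        Signal.Add(signal);
        Histogram.Add(macd - signal);
    }
}
```

The caller no longer needs to know the order of the internal steps, which addresses the usability concern raised earlier while keeping the calculation incremental.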

Probably time to start writing some .NET interactive notebooks with usage patterns and how-tos!

DaveSkender commented 2 years ago

I think for my library, I’ll probably be able to keep my IResult-based class output intact and do all static elements, with something like:

var quotes = [..];
MacdBase macdBase = quotes.InitMacd(12,26,9);
IEnumerable<MacdResult> results = macdBase.Results();

// to increment
macdBase.Update(quote);

And use an interface design pattern for the base. Base would need the .Update(quote) part and results would just update automatically without need for new extraction.

Ref: https://github.com/DaveSkender/Stock.Indicators/discussions/216#discussioncomment-2444760

This approach will allow me to keep a simple base usage pattern for non-incremental users with a common result class usage.

var quotes = [..];
IEnumerable<MacdResult> results = quotes.GetMacd(12,26,9);
mihakralj commented 2 years ago

I just published the updated version 1.0.4, which is able to do that too for non-incremental users. See the last example in getting_started.ipynb.

DaveSkender commented 2 years ago

Nice. If you want a real stress test of your design, try to implement Ichimoku Cloud. It uses more than one candle value, has multiple output series, and projects into the future.

p.s. I've never seen Plotly.NET before. It looks like a nice tool for certain use cases. I've added it to my list of data and visualization options.