Closed ikorman closed 4 years ago
That does not seem right. Since MACD uses 4 full-series EMA calculations, I'd expect it to be approximately 4× slower than EMA. If this is one contiguous history
series of Quotes, be sure that you are not running it on every date in that series; you only need to run it once (see #88).
I load historical data and then calculate various indicators for further processing, here is sample code I'm using for it:
// RSI
sw = Stopwatch.StartNew();
var rsi = Indicator.GetRsi(quotes, 14);
sw.Stop();
Console.WriteLine($"RSI: {sw.ElapsedMilliseconds}");
this.Add("RSI", rsi.Cast<ResultBase>().ToList());
// MACD
sw = Stopwatch.StartNew();
var macd = Indicator.GetMacd(quotes, 12, 26, 9);
sw.Stop();
Console.WriteLine($"MACD: {sw.ElapsedMilliseconds}");
var macdRes = macd.Cast<ResultBase>().ToList();
this.Add("MACD", macdRes); // macdRes is already a List<ResultBase>
var lastmacd = macdRes.Last();
// Bollinger bands
sw = Stopwatch.StartNew();
var bb = Indicator.GetBollingerBands(quotes);
sw.Stop();
Console.WriteLine($"BB: {sw.ElapsedMilliseconds}");
var bbRes = bb.Cast<ResultBase>().ToList();
this.Add("BB", bbRes);
var lastBBRes = bbRes.Last();
// Ichimoku
sw = Stopwatch.StartNew();
var ichimoku = Indicator.GetIchimoku(quotes);
sw.Stop();
Console.WriteLine($"Ichimoku: {sw.ElapsedMilliseconds}");
var ichimokuRes = ichimoku.Cast<ResultBase>().ToList();
this.Add("Ichimoku", ichimokuRes);
var lastIchimokuRes = ichimokuRes.Last();
And here are the times from it (in ms). The source is always the same and is processed only once.
Heikin-Ashi: 55, RSI: 61, MACD: 10910, BB: 12435, Ichimoku: 44909
Thx Ivan
Ichimoku has the most complex calculations, so I do expect that to be the slowest; however, 44 seconds still seems very slow. I'd expect it to be not more than 4× slower than RSI. I'll be doing some performance benchmarking soon (AB#727), so I'll be able to get a good read on what to expect for each indicator.
For now, you can get a sense of what to expect from the unit test results in the build. Unit tests are only using a history with 500 quotes, so you'll have to extrapolate a bit for larger data sets.
I'll also take another look at performance tuning on some of these more complex indicators. I may explore some parallelism or other async methods since some of these internal computations do not need to be serialized. If you have any thoughts or want to help, let me know.
I have taken a look at the MACD code, and most of the time is burned where you search for the slow & fast EMA with Where(...).FirstOrDefault(). I believe a better approach would be to start from the fact that the history and EMA lists have the same length and the same index base, so the code could look something like this:
public static IEnumerable<MacdResult> GetMacd(IEnumerable<Quote> history, int fastPeriod = 12, int slowPeriod = 26, int signalPeriod = 9)
{
// clean quotes
Cleaners.PrepareHistory(history);
// check parameters
ValidateMacd(history, fastPeriod, slowPeriod, signalPeriod);
// initialize
var emaFast = GetEma(history, fastPeriod).ToList();
var emaSlow = GetEma(history, slowPeriod).ToList();
List<BasicData> emaDiff = new List<BasicData>();
List<MacdResult> results = new List<MacdResult>();
var a = history.ToList();
for (int c = 0; c < a.Count; c++)
//foreach (Quote h in history)
{
var h = a[c];
//EmaResult df = emaFast.Where(x => x.Date == h.Date).FirstOrDefault();
//EmaResult ds = emaSlow.Where(x => x.Date == h.Date).FirstOrDefault();
EmaResult df = emaFast[c];
EmaResult ds = emaSlow[c];
... ...
This way you avoid the costly search and instead do a direct lookup by index. Does it make sense?
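To make the difference concrete, here is a minimal standalone sketch of the two lookup strategies (the SeriesPoint type and the sizes here are stand-ins for illustration, not the library's actual types):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SeriesPoint { public DateTime Date; public decimal? Value; }

public static class LookupDemo
{
    public static void Main()
    {
        var start = new DateTime(2020, 1, 1);
        var dates = Enumerable.Range(0, 1000).Select(i => start.AddMinutes(i)).ToList();
        var ema = dates.Select(d => new SeriesPoint { Date = d, Value = 1m }).ToList();

        // Search by date: a linear scan for every element => O(n^2) overall
        decimal slowSum = 0;
        foreach (var d in dates)
        {
            var hit = ema.Where(x => x.Date == d).FirstOrDefault();
            slowSum += hit?.Value ?? 0;
        }

        // Direct lookup by shared index: O(1) per element => O(n) overall
        decimal fastSum = 0;
        for (int i = 0; i < dates.Count; i++)
        {
            fastSum += ema[i].Value ?? 0;
        }

        Console.WriteLine(slowSum == fastSum); // same result, far fewer comparisons
    }
}
```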
Ivan
As an old database guy, it always scares me to use an ordinal position in a collection over a unique key value. I suspect, under the hood, LINQ may be doing something like that anyway, so it might not help much. With that said, it could potentially be faster and if that can be done reliably, it might be worth exploring.
If anything, I should probably key off of Index instead of Date, as I believe integers resolve faster.
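For example (purely an illustrative sketch, not the library's code), a dictionary keyed on the integer Index would keep a unique-key lookup while avoiding the per-element scan:

```csharp
using System;
using System.Linq;

public class EmaPoint { public int Index; public decimal? Ema; }

public static class KeyedLookupDemo
{
    public static void Main()
    {
        var ema = Enumerable.Range(1, 1000)
            .Select(i => new EmaPoint { Index = i, Ema = i })
            .ToList();

        // Build once: O(n). Each lookup afterwards is O(1) on an int key,
        // and still fails safely if an index is missing.
        var byIndex = ema.ToDictionary(x => x.Index);

        byIndex.TryGetValue(500, out EmaPoint hit);
        Console.WriteLine(hit?.Ema); // 500
    }
}
```

This keeps the "unique key" safety of the date search while resolving on an integer instead of a DateTime comparison.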
I think I'll implement a performance benchmarking test rig first so I can do a more reliable comparative analysis of changes. Feel free to tinker with this in your own Fork until I get a chance to look at it closer.
If you are sticking to List, the full method would look like this:
public static IEnumerable<MacdResult> GetMacdFast(IEnumerable<Quote> history, int fastPeriod = 12, int slowPeriod = 26, int signalPeriod = 9)
{
// clean quotes
Cleaners.PrepareHistory(history);
// check parameters
ValidateMacd(history, fastPeriod, slowPeriod, signalPeriod);
// initialize
List<EmaResult> emaFast = GetEma(history, fastPeriod).ToList();
List<EmaResult> emaSlow = GetEma(history, slowPeriod).ToList();
List<BasicData> emaDiff = new List<BasicData>();
List<MacdResult> results = new List<MacdResult>();
var historyList = history.ToList();
for (int idx = 0; idx < historyList.Count; idx++)
{
var h = historyList[idx];
EmaResult df = emaFast[idx];
EmaResult ds = emaSlow[idx];
MacdResult result = new MacdResult
{
Index = (int)h.Index,
Date = h.Date
};
if (df?.Ema != null && ds?.Ema != null)
{
decimal macd = (decimal)df.Ema - (decimal)ds.Ema;
result.Macd = macd;
// temp data for interim EMA of macd
BasicData diff = new BasicData
{
Date = h.Date,
Index = h.Index - slowPeriod,
Value = macd
};
emaDiff.Add(diff);
}
results.Add(result);
}
IEnumerable<EmaResult> emaSignal = CalcEma(emaDiff, signalPeriod);
// add signal, histogram to result
foreach (MacdResult r in results)
{
EmaResult ds = emaSignal.Where(x => x.Date == r.Date).FirstOrDefault();
if (ds?.Ema == null)
{
continue;
}
r.Signal = ds.Ema;
r.Histogram = r.Macd - r.Signal;
}
return results;
}
Results on the sample set were:
MACD: 23.572 s, MACD fast: 7.892 s
I did check and compare the outputs and could not find any difference in the results. I tried to remove the Where() search at the end as well, but since that series has a different length, it was a bit too much for me at this stage. Parallelizing things with Parallel and similar would help, but only to a certain extent. For example, I was playing with the MACD calculation from another library (https://github.com/anilca/NetTrader.Indicator/blob/master/NetTrader.Indicator/MACD.cs) and it ran in under a second (there were some other issues that made me keep looking, like data accuracy in general).
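A small sketch of why parallelism only goes so far here (illustrative only, not the library's code): per-bar work with no neighbour dependency parallelizes fine, but a running EMA is inherently sequential:

```csharp
using System;
using System.Linq;

public static class ParallelDemo
{
    public static void Main()
    {
        var values = Enumerable.Range(0, 1000).Select(i => (decimal)i).ToArray();

        // Per-element work with no dependency on neighbours can be parallelized...
        var squared = values.AsParallel().AsOrdered().Select(v => v * v).ToArray();

        // ...but a running EMA cannot: each value depends on the previous one,
        // so this loop has to stay serial no matter how many cores you have.
        decimal k = 2m / (14 + 1);
        decimal ema = values[0];
        for (int i = 1; i < values.Length; i++)
        {
            ema = values[i] * k + ema * (1 - k);
        }

        Console.WriteLine(squared[3]); // 9
        Console.WriteLine(ema > 0);
    }
}
```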
I like the approach of Pine Script on TradingView, where all data is a set of arrays of the same length, which makes calculations quite easy to follow.
BR Ivan
You're right, an all-array approach is definitely faster, but it's a tradeoff with overall usability. I used the array-driven TA-Lib for a long time, but had to do a lot of extra coding to manipulate the data to make it usable. TA-Lib has not been updated in 10+ years and was holding me back from moving to newer tech, so I had to move on. There were not many pure, modern .NET technical indicator packages, so I started building this one to be a little friendlier for trading system and charting use cases.
You've got me thinking about arrays now, so I'll definitely be looking to test incorporating that in some form internally, but will still try to keep the public utility the same. Thanks for helping!
No problem. Btw, I managed to resolve the last Where() in GetMacd and now the whole thing executes in only 100 ms. From a class/function-calling point of view nothing changed (I just called it GetMacdFast for the sake of testing), so these changes should not affect anybody already using the methods in their code; it will just be quicker:
public static IEnumerable<MacdResult> GetMacdFast(IEnumerable<Quote> history, int fastPeriod = 12, int slowPeriod = 26, int signalPeriod = 9)
{
// clean quotes
Cleaners.PrepareHistory(history);
// check parameters
ValidateMacd(history, fastPeriod, slowPeriod, signalPeriod);
// initialize
List<EmaResult> emaFast = GetEma(history, fastPeriod).ToList();
List<EmaResult> emaSlow = GetEma(history, slowPeriod).ToList();
List<BasicData> emaDiff = new List<BasicData>();
List<MacdResult> results = new List<MacdResult>();
var historyList = history.ToList();
int skip = 0; // counts how many null entries we will have to insert at the start of the signal list, to make it the same length as results
for (int idx = 0; idx < historyList.Count; idx++)
{
var h = historyList[idx];
EmaResult df = emaFast[idx];
EmaResult ds = emaSlow[idx];
MacdResult result = new MacdResult
{
Index = (int)h.Index,
Date = h.Date
};
if (df?.Ema != null && ds?.Ema != null)
{
decimal macd = (decimal)df.Ema - (decimal)ds.Ema;
result.Macd = macd;
// temp data for interim EMA of macd
BasicData diff = new BasicData
{
Date = h.Date,
Index = h.Index - slowPeriod,
Value = macd
};
emaDiff.Add(diff);
}
else
skip++;
results.Add(result);
}
IEnumerable<EmaResult> emaSignal = CalcEma(emaDiff, signalPeriod);
var emaSignalList = emaSignal.ToList();
// make the emaSignal list the same length as results, padding the beginning with nulls
for (int c = 0; c < skip; c++)
emaSignalList.Insert(0, null);
// add signal, histogram to result
for (int c = 0; c < results.Count; c++)
{
var r = results[c];
EmaResult ds = emaSignalList[c];
if (ds?.Ema == null)
{
continue;
}
r.Signal = ds.Ema;
r.Histogram = r.Macd - r.Signal;
}
return results;
}
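The alignment trick above, shown in isolation as a minimal standalone sketch (AlignByPadding is a hypothetical helper name, not part of the library):

```csharp
using System;
using System.Collections.Generic;

public static class PadDemo
{
    // Pads 'shorter' with leading nulls until it has 'fullCount' elements,
    // so the two series share an ordinal index and no date search is needed.
    public static List<decimal?> AlignByPadding(List<decimal?> shorter, int fullCount)
    {
        var padded = new List<decimal?>(fullCount);
        for (int i = 0; i < fullCount - shorter.Count; i++)
        {
            padded.Add(null);
        }
        padded.AddRange(shorter);
        return padded;
    }

    public static void Main()
    {
        var signal = new List<decimal?> { 1m, 2m, 3m };
        var aligned = AlignByPadding(signal, 5);
        Console.WriteLine(string.Join(",", aligned)); // ,,1,2,3
    }
}
```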
BR Ivan
Forgot to mention how I landed here and why speed is important to me: I am evaluating the possibility of developing an automated bot for scalping cryptos, and at the moment I'm running different strategies through 3 months of historical data (1m, 3m and 5m intervals). Each time I change something in a strategy, I have to reload the file and recalculate all the indicators in preparation for the simulation on historical data, and that preparation took a couple of minutes due to the above issue, so I took some time to look into it :)
I like your work on the library, the number of indicators, and how easy it is to use :)
Btw, how do you confirm the correctness of the output? What is the reference for comparison?
Very nice! I'll reference this when making changes and will likely take this approach to speed things up.
I see a lot of interesting use cases from users that are doing a lot of different things, ranging from massive bulk operations, machine learning, Forex, Crypto, etc. Personally, I'm mostly using it for a more traditional trading strategy system and for charting. It's been really cool to see what everyone is doing ... from all over the world.
Regarding correctness testing, there are a few things I'm doing:
Out of curiosity, what framework are you targeting? So far I'm not seeing the same slowness for MACD on the benchmarks. I know .NET Core tends to work better than the other frameworks that we include in the package; was wondering if you're using an older framework.
Good point. As my app is GUI based, I'm targeting .NET Framework 4.7.2. At first I used your precompiled library from NuGet (there I had 10 s), and then I compiled the source code together with my app (there it was 23 s), so there is some difference depending on the target.
Regarding your benchmarks, I'm not sure simple scaling will work for MACD, as its number of calculations rises quadratically with history size (e.g. for a history of 10 it has to loop 10 times through a 10-entry EMA list, while for 100 it has to work with a 100-entry EMA list: 100 vs. 10,000 lookups in the theoretical worst case).
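A quick way to check that scaling claim (a standalone sketch, not the library's code): with a linear scan per element, the comparison count grows with the square of the history size:

```csharp
using System;
using System.Linq;

public static class ScalingDemo
{
    // Counts the element comparisons performed when every item is found
    // via a fresh linear scan; with early exit this averages n*(n+1)/2.
    public static long ComparisonsForLinearScanPerItem(int n)
    {
        long comparisons = 0;
        var data = Enumerable.Range(0, n).ToArray();
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < n; j++)
            {
                comparisons++;
                if (data[j] == i) break; // found, stop scanning
            }
        }
        return comparisons;
    }

    public static void Main()
    {
        Console.WriteLine(ComparisonsForLinearScanPerItem(10));  // 55
        Console.WriteLine(ComparisonsForLinearScanPerItem(100)); // 5050
    }
}
```

So 10× more history means roughly 100× more comparisons, which matches the jump from fast unit tests on 500 quotes to minutes on months of 1m bars.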
This change was implemented in v0.10.10.
This Issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new Issue for related bugs.
Hi
I'm playing with and testing this great library and so far I'm very pleased with it. I did notice one thing though: the MACD calculation is rather slow. Here is an example:
Are those expected times?
Ivan