askmike / gekko

A bitcoin trading bot written in node - https://gekko.wizb.it/
MIT License

GPU support #2490

Closed adradr closed 5 years ago

adradr commented 5 years ago

I'm submitting a ... [ ] bug report [x] question about the decisions made in the repository

Dear @askmike !

I have been experimenting with your framework, it really helps me a lot in terms of strategy building. Thanks for creating it :)

The only limitation I have encountered is speed, as all backtesting computations run on the CPU. I have been exploring the options and stumbled upon gpu.js, which could theoretically let Gekko offload its computations to the GPU.

Do you know this library, and if so, what is your stance on it? Are there plans for incorporating some kind of GPU support? Or am I missing a forked version that is already written for the GPU? (I haven't found anything here or on the forum.)

Thanks πŸ‘‹

Link: gpu.js GitHub

askmike commented 5 years ago

Maybe we should first figure out the bottleneck when running backtests. If the bottleneck is reading candles from disk (from the sqlite db), the GPU won't help us. Calculating indicators is offloaded to TA-lib or Tulip (depending on the strategy); these are C libraries. Maybe that's the slow part?
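
A quick way to check is to time the two suspect steps separately. In the sketch below, readCandleBatch() and updateIndicators() are hypothetical stand-ins, not Gekko's real function names:

console.time('candle reads');
const candles = readCandleBatch();                    // the sqlite reader
console.timeEnd('candle reads');

console.time('indicator updates');
candles.forEach(candle => updateIndicators(candle));  // TA-lib / Tulip calls
console.timeEnd('indicator updates');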

In any case, depending on what you are trying to do we might be able to optimize things already: if you want to run a ton of backtests over the same data (with tweaks to the strategy parameters) we can simply cache candles (only calculate them once).
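
A minimal sketch of such a cache, assuming a loadCandles() helper that hits the sqlite store (the names are illustrative, not Gekko's actual API):

const candleCache = new Map();

const getCandles = (key, loadCandles) => {
  if (!candleCache.has(key)) {
    candleCache.set(key, loadCandles());  // hit sqlite only once per dataset/range
  }
  return candleCache.get(key);            // every later backtest reuses the array
};

// e.g. every parameter variation shares one cached candle array:
// const candles = getCandles('polo:usdt:eth:60', () => loadCandles(range));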

What is your use case?

Regards, Mike

adradr commented 5 years ago

My main use case is strategy parameter optimization, so I am using the same candle data over and over in a few variations (at most 3 candle sizes, and mostly one currency pair at a time with 2+ years of history). But candle data could be cached, that's a great idea.

I am using the Gekko Backtester tools from @xFFFFF, which I think spin up multiple node instances and load the candle data for each of them. However, most of the time is spent not on candle loading but on indicator calculation.

I know you are developing a kind of SaaS-like version of the Gekko tool (Gekko Plus), so that is probably your main focus right now, but I think GPU support could be a great idea to think about from a more strategy-focused angle, because during my research I haven't found any working GPU backtesting frameworks with a crypto focus.

askmike commented 5 years ago

What kind of indicators? Gekko-native ones or ones from a low-level lib? There are a ton of other things we can do, such as calculating all the indicators only once and exposing them one by one to the strategy.
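
As a sketch of that idea (computeMACD() and candles are hypothetical placeholders): compute each distinct indicator configuration once up front, then let every strategy variation look its series up instead of recalculating it candle by candle.

const shorts = [10, 12, 14];
const longs = [26, 30];
const signals = [9];

const precomputed = {};
for (const s of shorts) {
  for (const l of longs) {
    for (const sig of signals) {
      // one calculation per parameter set, reused by every run that needs it
      precomputed[`${s}/${l}/${sig}`] = computeMACD(candles, s, l, sig);
    }
  }
}

// a strategy variation then reads precomputed['12/26/9'] directly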

adradr commented 5 years ago

Currently I am experimenting with the native Gekko indicators, mainly MACD.

My current backtest setup is the following:

  • 7200 MACD variations (short, long, signal, thresholds and persistence)
  • 3 candle sizes (15, 60, 1440 minutes)
  • 1 dataset (polo:usdt:eth) with 2.5 years of history
  • running on 24 threads
  • it's close to 4 days of runtime and is at 28% of the total
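
For scale, a back-of-the-envelope estimate based on those numbers:

const totalRuns = 7200 * 3;        // 21,600 backtests in the full sweep
const estimatedDays = 4 / 0.28;    // ~14.3 days of wall-clock time at the current pace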

askmike commented 5 years ago

Before we jump from JavaScript calculations to GPU shaders, why not try to leverage the C libraries Gekko already integrates with?
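
For illustration, a minimal sketch of computing MACD over the whole close series in one call to tulind (one of the C-backed libraries Gekko can use); the close prices and the 12/26/9 parameters are placeholders:

const tulind = require('tulind');

const closes = [/* closing prices for the full backtest period */];

tulind.indicators.macd.indicator([closes], [12, 26, 9], (err, results) => {
  if (err) throw err;
  const [macd, signal, histogram] = results;  // full series computed in C
  // hand these precomputed series to the strategy instead of updating the
  // indicator candle by candle in JavaScript
});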

I think it's a great idea to eventually move towards. But I am afraid I lack the skills (my opengl is rusty) for this. I would love a PR though!

mark-sch commented 5 years ago

I think the main performance limitations are due to async operations/checks inside portions of Gekko that use JavaScript's setTimeout. So it is not slow code execution itself, it is the enforced waiting before certain code runs. It is by design: a fast GPU computation will not improve this significantly. With the 1-minute candle limitation in mind this all makes sense; in that scenario it does not matter whether you wait 1 second before triggering certain calculations.

@askmike did you already experiment with refactoring to an async/promise pattern? I tested this approach on a new CoinMarketCap market data importer for Gekko and got very fast results compared to "tick"-triggered code. Many APIs have rate limits, so it makes absolute sense to "wait" for those things; that part is impossible to speed up.

Example: candleLoader.js, in its handleCandles method, which uses a 100 ms timeout before signalling that things are done:

const handleCandles = (err, data) => {
  if (err) {
    console.error(err);
    util.die('Encountered an error..');
  }

  if (_.size(data) && _.last(data).start >= toUnix || iterator.from.unix() >= toUnix)
    DONE = true;

  batcher.write(data);
  batcher.flush();

  if (DONE) {
    reader.close();

    setTimeout(doneFn, 100); // fixed 100 ms delay before reporting completion

  } else {
    shiftIterator();
    getBatch();
  }
}
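
A rough sketch of a promise-based completion signal, reusing the names from the snippet above, so the caller is notified the moment DONE flips instead of after a fixed 100 ms delay:

const finished = new Promise((resolve, reject) => {
  doneFn = err => (err ? reject(err) : resolve());
});

// in handleCandles, the DONE branch then calls doneFn() directly:
//   if (DONE) {
//     reader.close();
//     doneFn();          // no setTimeout(doneFn, 100)
//   }

// caller side:
// await finished;        // continues as soon as all candles are flushed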

askmike commented 5 years ago

@mark-sch the candle loader is only a simple utility used in the API for retrieving candles. Inside a backtest no timeouts are used (if there are, that's definitely a bug). The code run in a backtest is designed to be fast and not take up any unneeded time.

That said: if there is a timeout somewhere that's definitely a bug.

adradr commented 5 years ago

If I understand it correctly, the backtesting calculation is fully optimized with no internal delays, so the bottleneck for one specific backtest is the speed of the particular core running it. As I see it, one backtest is run by a single core; multiple cores only come into play when I use gekkoga, for instance, which launches multiple backtest API requests.

How possible or viable would it be to create GPU multi-threaded operation, so that one backtest request can exploit multiple cores, and/or so that, using tools like gekkoga, users could exploit many thousands of cores for computation instead of the 4-8-12-core configurations most computers have?

Of course you can spin up a cloud instance with hundreds of CPU cores, but that is still not thousands of cores, and taking the cost into account makes it unviable. On the other hand, I think most of us have multiple spare GPUs left over from previous mining operations, and it would be great to put them to work accelerating algorithm fitting.
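
As a point of reference, a rough sketch of saturating whatever CPU cores are available, assuming the local Gekko UI exposes its backtest endpoint at http://localhost:3000/api/backtest (the one gekkoga talks to), any HTTP client such as axios, and a hypothetical buildConfig() helper that turns a parameter set into a backtest config:

const os = require('os');
const axios = require('axios');

async function runSweep(paramSets, buildConfig) {
  const concurrency = os.cpus().length;   // one in-flight backtest per core
  const queue = [...paramSets];
  const results = [];

  const worker = async () => {
    while (queue.length) {
      const params = queue.shift();
      const res = await axios.post('http://localhost:3000/api/backtest', buildConfig(params));
      results.push({ params, report: res.data });
    }
  };

  await Promise.all(Array.from({ length: concurrency }, worker));
  return results;
}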

Looking forward to your opinion!

blastbeng commented 5 years ago

I'm in for this; using the GPU instead of the CPU for backtesting should definitely increase performance.

I really want to achieve this, but I'm a Node noob and have been digging through the Gekko code with no success so far.

Does anyone know how the code should be modified to integrate gpu.js for backtesting?

vqalex commented 5 years ago

Currently you can't use gpu.js because it only supports WebGL, but possibly in the future. It would definitely be beneficial to offload some of the calculations in indicators/strategies to GPUs, and it would be really easy to implement.
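
If and when gpu.js does run under Node, a kernel for an indicator-style calculation might look roughly like this minimal, untested sketch (v2-style API), which computes a simple moving average for every index in parallel:

const { GPU } = require('gpu.js');

const gpu = new GPU();
const WINDOW = 20;
const LENGTH = 1000;                 // number of SMA values to produce

const smaKernel = gpu
  .createKernel(function (closes) {
    // each GPU thread sums its own window of closing prices
    let sum = 0;
    for (let i = 0; i < this.constants.window; i++) {
      sum += closes[this.thread.x + i];
    }
    return sum / this.constants.window;
  })
  .setConstants({ window: WINDOW })
  .setOutput([LENGTH]);

// usage: smaKernel(closes) returns a Float32Array of LENGTH values,
// given a closes array with at least LENGTH + WINDOW - 1 entries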

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. If you feel this is a very important issue, please reach out to the maintainer of this project directly via e-mail: gekko at mvr dot me.