scrtlabs / catalyst

An Algorithmic Trading Library for Crypto-Assets in Python
http://enigma.co
Apache License 2.0

Support more frequencies in live and backtest #319

Open kooomix opened 6 years ago

kooomix commented 6 years ago

Hi,

I have been experimenting with paper trading lately.

I've also noticed that in the documentation (https://enigma.co/catalyst/live-trading.html) the "frequency" argument is not even mentioned in the list of arguments for paper/live trading.

Any known issue on that matter?

Thanks.

lenak25 commented 6 years ago

Hi @kooomix ,

In live/paper trading, only the minute frequency is supported. Thanks for reporting this, we will update our documentation (we added this to the API doc, but it should be added to the tutorial as well).

Thanks, Lena

kooomix commented 6 years ago

Thanks, Lena. Are there any plans to have live/paper trading support daily frequency as well soon?

lenak25 commented 6 years ago

Hi @kooomix ,

Actually, we would be happy to hear your feedback. We were not sure that such a low frequency is required in live mode.

Lena

kooomix commented 6 years ago

Actually, I was developing my algo first on daily resolution, as it was easier to handle less data over a longer period of time. I wanted to check it in live trading as well, to compare against back-testing. But the plan is eventually to run at a much higher frequency.

kooomix commented 6 years ago

Having said that, hourly or 4-hour frequency would be really helpful.

lenak25 commented 6 years ago

Thanks for the feedback. We will add this to our future features list. In the meantime, you could do something like this to run your code at a lower resolution (30 minutes, in this example):

def initialize(context):
    context.i = 0

def handle_data(context, data):
    context.i += 1
    if context.i % 30:
        return  # does nothing 29 out of 30 times
    # your code here will be executed every 30 minutes

Another approach is to use data.current_dt, which holds the current timestamp, and filter on it.
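For reference, a minimal sketch of that timestamp-based filter inside handle_data; the one-hour interval is an illustrative assumption, not part of the original suggestion:

def handle_data(context, data):
    # data.current_dt holds the current bar's timestamp (a pandas Timestamp).
    # Skip every bar except the first minute of each hour.
    if data.current_dt.minute != 0:
        return
    # hourly logic goes here, e.g. pull prices and place orders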

kooomix commented 6 years ago

Yeah, this is the code I'm using to play with the frequency.

Thanks a lot!

kooomix commented 6 years ago

I have another question on that matter...

Assuming I'm using the minute frequency and I want to trade every hour, I have a problem: data is loaded into handle_data every minute, even though I only need it once an hour. This causes a major performance issue when backtesting many symbols over longer periods of time.

Is there any recommended solution/approach to handle this issue?

Thanks.

lenak25 commented 6 years ago

Hi @kooomix , you mean that the workaround suggested above isn't satisfactory in terms of performance? Thanks for the feedback; the only real solution is to support more frequencies (I've edited the issue subject).

kooomix commented 6 years ago

No, it's not. In backtesting, there is a huge difference in performance between running at daily frequency and running at minute frequency while trading only every 1440 minutes (= one day). I guess the reason is that the data is being loaded every minute, even though it is not necessary.

So I think that, in addition to supporting more frequencies, I would also add an option to fetch the pricing data on demand instead of loading it by default.

mozartAlpha commented 6 years ago

@kooomix @lenak25 I used your example code and found that the test result is wrong. I think the reason is that the data is being loaded every minute; the plotted output line looks like a square (step) pattern.

EmbarAlmog commented 6 years ago

Hi @kooomix, there is an API function that can fit your needs: https://enigma.co/catalyst/appendix.html#scheduling-functions. You can use schedule_function to schedule a function to run at the intervals you desire. Here is a small example that schedules the function rebalance to be called each hour:

for i in range(0, 12):
    schedule_function(rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours=i, minutes=1))

    schedule_function(rebalance,
                      date_rules.every_day(),
                      time_rules.market_close(hours=i, minutes=59))

The scheduled function (rebalance in the example above) takes context and data as arguments. If you wish to use it in place of handle_data, you must remove the handle_data=handle_data setting from the run_algorithm call.
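To make the wiring concrete, here is a rough sketch of an initialize/run_algorithm setup along those lines. The exchange name, namespace, asset symbol, capital base, and date range are placeholder assumptions, and parameter names such as quote_currency can differ between catalyst versions, so treat this as an illustration rather than a drop-in script:

import pandas as pd
from catalyst import run_algorithm
from catalyst.api import schedule_function, date_rules, time_rules, symbol

def initialize(context):
    context.asset = symbol('btc_usdt')
    # Schedule rebalance roughly once per hour instead of defining handle_data.
    for i in range(0, 12):
        schedule_function(rebalance,
                          date_rules.every_day(),
                          time_rules.market_open(hours=i, minutes=1))
        schedule_function(rebalance,
                          date_rules.every_day(),
                          time_rules.market_close(hours=i, minutes=59))

def rebalance(context, data):
    price = data.current(context.asset, 'price')
    # trading logic goes here

run_algorithm(
    initialize=initialize,
    # note: no handle_data= argument, so no user code runs every minute
    exchange_name='poloniex',
    algo_namespace='hourly_example',
    quote_currency='usdt',
    capital_base=1000,
    data_frequency='minute',
    start=pd.to_datetime('2018-01-01', utc=True),
    end=pd.to_datetime('2018-02-01', utc=True),
)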

kooomix commented 6 years ago

Thanks @EmbarAlmog

The issue is not with calling functions at a certain interval, but with the fact that even though I want to trade only once an hour, I have to use minute frequency and run my relevant functions every 60 minutes. The handle_data function, even if it is empty, is very time consuming because it loads the data into the data object on every iteration.

Anyway, in the meantime, I've built a workaround using external hourly data so I can run backtests much faster. As I stated above, supporting hourly frequency as well as on-demand data would be very helpful for me.

lenak25 commented 6 years ago

Hi @kooomix , the workaround suggested by @embaral shouldn't call your handle_data every minute but only at the frequency you have defined (1 hour, for example), which should affect the speed at which your backtest runs. The data itself will still come at minute frequency, which can easily be resampled to hourly.
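As a rough illustration of that resampling idea (assuming a scheduled function and a context.asset set in initialize; the 24-hour lookback is an arbitrary choice), the minute bars returned by data.history can be aggregated with pandas:

def rebalance(context, data):
    # Pull the last 24 hours of minute-level close prices (24 * 60 bars).
    minute_prices = data.history(context.asset, 'price',
                                 bar_count=24 * 60, frequency='1T')

    # Resample the minute series into hourly bars using pandas.
    hourly_close = minute_prices.resample('1H').last()

    # Use the hourly series in the trading logic, e.g. the latest hourly close.
    last_hourly_close = hourly_close.iloc[-1]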

kooomix commented 6 years ago

The algorithm needs to know which function to run as handle_data. That function (usually called handle_data) always runs at the frequency of the algorithm, in our case every minute. Therefore, handle_data runs every minute automatically without any way to control it, and so data is loaded every minute.

This is how I understand the system works; correct me if I'm wrong. Either way, the time is mainly consumed by data being loaded every minute without any need.

EmbarAlmog commented 6 years ago

The schedule function can schedule any function you want, as long as its parameters are context and data. If you don't want handle_data to be called every minute, then when calling run_algorithm you should not set handle_data=handle_data; simply delete that line. If you use the example I posted above, the scheduled function will be called every hour ONLY.

kooomix commented 6 years ago

Oh, I didn't realize the handle_data parameter is optional.. :) I've just run your suggested solution and compared it against setting handle_data and using a minute counter to trigger every hour, and got the same running time.. :(

So I guess the data still comes every minute, as Lena stated, which causes the loss of time...

traeper commented 6 years ago

@kooomix Me too :( ...

Did you solve this issue? I am trying to find that point as well.

Maybe a data aggregation point driven by the frequency argument..?