polakowo / vectorbt

Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
https://vectorbt.dev

Question regarding streaming of data, signals generation, and general actions #290

Closed auraxchan closed 2 years ago

auraxchan commented 2 years ago

Hello,

I'm considering using vectorbt for backtesting and optimizing strategies. I have a few questions that I couldn't find specific answers to in the documentation.

  1. Assume my indicators repaint past events. How can that be handled in vectorbt? Is there a way to stream the tick data as it arrived live, so that such anomalies can be caught? In other words, is signal generation done 'post-processing' style (looking backward over the full series, with possible look-ahead), or forward in time, where every tick is processed separately and the run therefore simulates live trading? IMHO that is the major difference between fake and real backtesting: streaming the data 'live' instead of populating all indicator values first and then re-processing them for signals.

  2. Is it possible to control trade execution time? For example, if I use 'close' as the indicator source, can I specify that the trade is executed at the next 'open' price, without adding artificial lag to the scenario?

  3. Optimization: suppose I want to write my own Sortino/Sharpe-style optimization metric (for drawdown etc.). How complex is that?

Thank you!

polakowo commented 2 years ago

Hi @auraxchan

  1. Since most functions are JIT-compiled with Numba, vectorbt uses regular loops to traverse time-series data; that is, only past events are available at each point in time. By default, vectorbt gives you building blocks that act as micro-pipelines, each traversing your data and writing some indicator. You can then combine many such pipelines to generate signals. Now, if you want to use an event-driven approach and encapsulate the entire logic into each time step (which is encouraged), there is this function, which lets you run your custom functions step by step, traversing the data only once. See this notebook section for a complete example.
  2. There is no built-in order management; vectorbt is a raw processing engine that executes whatever it is given. If you issue an order at time X with a price of infinity, vectorbt will gladly execute that (it will actually throw an error, but anyway). The function I mentioned above lets you implement arbitrary logic.
  3. After the simulation is complete, you get a portfolio object offering a lot of useful information, such as running portfolio value and returns. You can then define your own metric that builds upon this information. The result of any optimization is usually a simple pandas Series with one result per column, such as max daily return, defined simply as pf.returns().max(). Of course, you can avoid retrospective analysis and compute your metrics during the simulation; there are many ways of achieving the same thing. This isn't complex at all.
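
To make point 3 concrete, here is a minimal, library-free sketch of a custom Sortino-style metric computed per parameter combination. It is not vectorbt's implementation: the function name `sortino_ratio`, the `periods_per_year` default, and the sample returns are assumptions made for illustration. With vectorbt, the per-period returns would come from something like pf.returns() instead of hard-coded lists.

```python
import math

def sortino_ratio(returns, target=0.0, periods_per_year=365):
    """Annualized Sortino-style ratio from a list of per-period returns."""
    if not returns:
        return float("nan")
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    # Downside deviation: root mean square of below-target excess returns only.
    downside_sq = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    if downside_dev == 0.0:
        return float("inf") if mean_excess > 0 else float("nan")
    return mean_excess / downside_dev * math.sqrt(periods_per_year)

# One list of returns per parameter combination (made-up numbers).
returns_by_param = {
    "fast=5": [0.01, -0.02, 0.03, 0.00],
    "fast=10": [0.02, 0.01, -0.01, 0.02],
}
scores = {name: sortino_ratio(r) for name, r in returns_by_param.items()}
best = max(scores, key=scores.get)  # parameter combination with the highest score
```

The result has the same shape as the answer describes: one number per column (here, per dictionary key), which can then be ranked or filtered.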

auraxchan commented 2 years ago

@polakowo First, thank you for the detailed answer, that is really appreciated.

Second, vectorbt is so impressive. I managed to simulate the scenario I'm working on quite quickly, and the results are as expected: (image)

Defining my own indicators and calling them through vectorbt.indicators.factory is really simple.

I'm now going to try the method you mentioned. I understand that plotting is obviously more challenging when streaming the data tick by tick, but I'm actually looking for those 'ghost signals'.

Thank you!

auraxchan commented 2 years ago

So continuing my first question regarding the broadcasting of tick data into the portfolio (assuming that I'm using the right terms)

  1. I first created a scenario with regular signal entries and exits, which works perfectly as expected.
  2. To simulate a real trading scenario, I created a small script (after pulling historical data etc.), taken from the documentation. Assuming I'd like to use the indicator_above/below signal methods, how do I implement that with vbt.Portfolio.from_order_func?
import numpy as np
import pandas as pd
from datetime import datetime
import talib
from numba import njit

import vectorbt as vbt
from vectorbt.utils.colors import adjust_opacity
from vectorbt.utils.enum_ import map_enum_fields
from vectorbt.base.reshape_fns import broadcast, flex_select_auto_nb, to_2d_array
from vectorbt.portfolio.enums import SizeType, Direction, NoOrder, OrderStatus, OrderSide
from vectorbt.portfolio import nb
direction = ['longonly']  # per column
fees = 0.01  # per frame
price = ohlcv['Close']  # ohlcv: OHLCV DataFrame pulled earlier (not shown)
size = ohlcv['Close']

@njit
def order_func_nb(c, size, direction, fees):
    # Issue one order per bar at the current close
    return nb.order_nb(
        price=c.close[c.i, c.col],
        size=size[c.i],
        direction=direction[c.col],
        fees=fees
    )

direction_num = map_enum_fields(direction, Direction)

pf = vbt.Portfolio.from_order_func(
    price,
    order_func_nb,
    np.asarray(size), np.asarray(direction_num), fees
)

# Inspect the orders only after the portfolio has been created
pf.orders.records_readable
  3. What would be the best way to approach hyperparameter optimization with this method? I couldn't find a good example in the documentation to follow, but I might need to dig further; apologies in advance for that ;)
  4. Is broadcasting required?
  5. In my scenario I'm wrapping existing indicators and using above/below to generate signals with crossover.
  6. pf.stats() returned (using the example above):
    Start                         2021-11-24 03:54:00+00:00
    End                           2021-12-04 03:53:00+00:00
    Period                                 10 days 00:00:00
    Start Value                                       100.0
    End Value                                     87.005706
    Total Return [%]                             -12.994294
    Benchmark Return [%]                         -12.356125
    Max Gross Exposure [%]                            100.0
    Total Fees Paid                                0.990099
    Max Drawdown [%]                              37.882157
    Max Drawdown Duration                   8 days 18:08:00
    Total Trades                                          1
    Total Closed Trades                                   0
    Total Open Trades                                     1
    Open Trade PnL                               -12.994294
    Win Rate [%]                                        NaN
    Best Trade [%]                                      NaN
    Worst Trade [%]                                     NaN
    Avg Winning Trade [%]                               NaN
    Avg Losing Trade [%]                                NaN
    Avg Winning Trade Duration                          NaT
    Avg Losing Trade Duration                           NaT
    Profit Factor                                       NaN
    Expectancy                                          NaN
    Sharpe Ratio                                  -1.413227
    Calmar Ratio                                  -2.623357
    Omega Ratio                                    0.994524
    Sortino Ratio                                 -2.013951
    Name: MANA/USDT, dtype: object

Thank you!

polakowo commented 2 years ago

@auraxchan in from_order_func you can't use indicator_above and similar methods, because you're now working with a limited view of the data. You can either 1) generate all indicator values prior to the simulation, pass them to your order_func as an argument, and use that data inside order_func (similar to how you passed direction and co), or 2) implement your indicator from the ground up directly in the simulation function, similar to how you would do this in backtrader and many other backtesters. Just create an empty array for your indicator values, write to it as the simulation goes, and act on it to generate orders. There is a good example of an order function that periodically searches for the best weights to rebalance a portfolio: https://nbviewer.org/github/polakowo/vectorbt/blob/master/examples/PortfolioOptimization.ipynb. Such functions may look complex at first, but this is just NumPy + regular loops, all pretty easy to write (at least much easier than writing a custom backtester in C, while enjoying almost the same performance).
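
As a rough illustration of option 2, the following library-free sketch builds a moving average inside the simulation loop, decides on each bar's close, and fills at the next bar's open. It does not use vectorbt's API: `simulate_sma_cross`, the crossover rule, and the prices are assumptions made for this example. In vectorbt, the body of the loop would live in the order function passed to from_order_func.

```python
def simulate_sma_cross(opens, closes, window=3):
    """Walk the bars once; decide on the close, fill at the next bar's open."""
    sma = [None] * len(closes)   # indicator array, written as the loop advances
    fills = []                   # (bar index, side, fill price)
    in_position = False
    pending = None               # order decided on the previous bar's close
    for i in range(len(closes)):
        # 1) Fill the order decided on the previous bar at this bar's open.
        if pending is not None:
            fills.append((i, pending, opens[i]))
            in_position = pending == "buy"
            pending = None
        # 2) Update the indicator from data up to and including bar i only.
        if i + 1 >= window:
            sma[i] = sum(closes[i + 1 - window:i + 1]) / window
        # 3) Decide using only values that are available right now.
        if sma[i] is None:
            continue
        if not in_position and closes[i] > sma[i]:
            pending = "buy"
        elif in_position and closes[i] < sma[i]:
            pending = "sell"
    return sma, fills

opens = [10.0, 10.5, 11.0, 12.0, 11.5, 10.0, 9.5]
closes = [10.4, 11.0, 12.1, 11.6, 10.2, 9.4, 9.0]
sma, fills = simulate_sma_cross(opens, closes, window=3)
```

Because each indicator value is written once and only from bars already seen, a repainting indicator cannot revise earlier signals here. An order still pending on the last bar is simply never filled.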

quantumpacket commented 2 years ago

@polakowo I'm looking into doing live trading by getting the current signal based on new data from a websockets stream. I hoped to look at the function you linked to in the 2nd comment of this thread, but it's 404. Can you update the comment to point to the location where the function is now located? Thank you. :)

polakowo commented 2 years ago

Hi @quantumpacket, https://vectorbt.dev/api/portfolio/base/#vectorbt.portfolio.base.Portfolio.from_order_func

auraxchan commented 2 years ago

Thank you @polakowo, this has been an amazing tool to use. Right now I'm using vbt to generate signals and am trading successfully.

I have a quick question (actually, I have a bunch :)). I'm trying to filter out results that are < win_rate, but couldn't find a way to remove them from the portfolio index.

I'm about to rewrite some indicators with Numba's njit. Would it still be possible to use param_product? I found it was the only way to pass multiple parameters to functions.

I'm still in the testing phase, and I'm also using vbt for 'market seek'; it's been pretty awesome. There are some very interesting markets out there that I wouldn't even have noticed without deep diving using vbt.

Your knowledge/comments on pandas/Numba performance are gold. Right now I'm overloading a 12C/16GB box with a few instances and a testing notebook, which is very memory-intensive. Ray performs well for small tasks, but if you really want to 'break out', it seems you must go NumPy/Numba.

Cheers and thanks!

auraxchan commented 2 years ago

Solved the win_rate issue with bands_opt1.deep_getattr('total_profit')[bands_opt1.trades.win_rate() == 1.000000000]

Thanks.
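
The boolean-mask selection used in that one-liner can be sketched with plain pandas. The Series below are made-up stand-ins for per-column results such as total profit and bands_opt1.trades.win_rate(); the parameter name `window` and the threshold `min_win_rate` are assumptions for illustration. Comparing against a threshold rather than exactly 1.0 keeps every column at or above the cutoff, which may be closer to the original "< win_rate" question.

```python
import pandas as pd

# Hypothetical per-column results, indexed by a parameter value.
index = pd.Index([10, 20, 30], name="window")
total_profit = pd.Series([12.5, -3.0, 7.25], index=index)
win_rate = pd.Series([1.0, 0.4, 0.75], index=index)

min_win_rate = 0.7  # hypothetical threshold

# The boolean mask aligns on the shared index and drops columns below the threshold.
kept = total_profit[win_rate >= min_win_rate]
```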