polakowo / vectorbt

Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
https://vectorbt.dev

Difference between `returns`, `daily_returns`, `asset_returns` and `log_returns` in portfolio #675

Open stucash opened 7 months ago

stucash commented 7 months ago

Thanks for making available such a powerful tool to the open source community.

I've just started finding my way around vectorbt, and I noticed these four different return types. I am a bit baffled by the results they generate when I compare them with each other.

I am on vectorbt (free), not vectorbt[full] or vectorbt pro. I am using Python 3.10 on a Debian 11 variant.

Reading the docs, I understand that:

asset_returns are the returns generated solely by the asset's cash flow and the proportion of portfolio equity assigned to that particular asset.

returns are (daily as well?) returns per column/group (a column can be just an asset; in my example below each column is a single asset), based on portfolio value. I am not too sure how to interpret "portfolio value" here.

daily_returns doesn't have much documentation, but it is self-explanatory to a degree: we know it is computed per day and shown per column/group as well. I don't know whether it is based on portfolio value or not (it seems it is). See the small sketch below.
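
To make sure I'm reading this correctly, here is a tiny self-contained sketch I put together (the prices and signals are made up, only the vectorbt calls matter) comparing the value-based and the cash-flow-based views:

import pandas as pd
import vectorbt as vb

# Made-up three-bar price series and signals, just to compare the accessors
close = pd.Series([100.0, 110.0, 99.0], name="X")
entries = pd.Series([True, False, False], name="X")
exits = pd.Series([False, False, True], name="X")

pf = vb.Portfolio.from_signals(close, entries, exits, init_cash=1000)
print(pf.value())          # total equity per bar (cash + position value)
print(pf.returns())        # per-bar change of that equity
print(pf.asset_returns())  # returns attributed to the asset's cash flow only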

My questions are:

  1. Are returns and daily_returns the same? I noticed that on 2020-02-25 their values diverge, but the other entries look identical; when I set the option use_asset_returns to True, daily_returns shows identical results to asset_returns (after filtering out no-trade days and NaN days). 1.1 I am inclined to believe the value on 2020-02-25 shouldn't exist and that some background calculation generated it, though it is extremely small (e-16).

  2. How do we reconcile asset_returns with returns/daily_returns? If asset_returns is for a single asset only, don't returns/daily_returns show returns per column (asset) as well? i.e., is there a portfolio-level daily return that weight-sums all assets' returns into a single series?

  3. daily_returns produces NaN on days when there was no trading, e.g. 2020-01-01 and 2020-01-04. Why is this? (See the small pandas sketch after this list for my guess.)
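
For question 3, my guess (and it is only a guess) is that daily_returns resamples the trading-day index to a calendar-day frequency, so holidays and weekends (2020-01-01 is a holiday, 2020-01-04 a Saturday) end up as empty bins. A pandas-only illustration of that effect:

import pandas as pd

# Trading-day index with a holiday (Jan 1) and a weekend gap
idx = pd.to_datetime(["2019-12-31", "2020-01-02", "2020-01-03", "2020-01-06"])
r = pd.Series([0.0, 0.001, -0.002, 0.003], index=idx)

# Resampling to calendar days leaves the empty bins as NaN
print(r.resample("1D").sum(min_count=1))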

Here's my simple example; if it's not correctly set up, please kindly point it out. I filtered for trades that happened only on the first asset, just for demonstration.

import numpy as np
import vectorbt as vb

d = vb.YFData.download(["GS", "MSFT"], start="2020-01-01", end="2023-11-20", interval="1d").get("Close")

def cind(close, window=14):
    # RSI-based signal: -1 when overbought (RSI > 70), +1 when oversold (RSI < 30), else 0
    rsi = vb.RSI.run(close, window=window).rsi.to_numpy()
    trend = np.where(rsi > 70, -1, 0)
    trend = np.where(rsi < 30, 1, trend)
    return trend

ind = vb.IndicatorFactory(
    class_name="Combination",
    short_name="comb",
    input_names=["close"],
    param_names=["window"],
    output_names=["value"],
).from_apply_func(cind, window=14)

res = ind.run(d)
ent = res.value == 1.0   # entries where RSI < 30
ex = res.value == -1.0   # exits where RSI > 70

pf = vb.Portfolio.from_signals(d, ent, ex)

rr = pf.returns()        # returns per column, based on portfolio value
ar = pf.asset_returns()  # returns based on the asset's cash flow

odr = pf.daily_returns()      # daily returns per column
dr = odr.fillna(-999.0)       # sentinel value so NaN days can be filtered out below

qr = pf.get_qs().log_returns()  # log returns via the QuantStats adapter
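
After building pf above, I also ran a quick check of my assumption (it is only my assumption, not something I found in the docs) that returns is simply the per-bar percentage change of pf.value():

# Assumption check: returns should match value.pct_change() except on the first bar,
# where pct_change() is NaN while vectorbt uses the initial cash as the base
diff = (pf.value().pct_change() - rr).abs()
print(diff.max())  # I expect this to be ~0, floating-point noise aside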

# asset_returns
ar[ar["GS"]!=0.0]

symbol                           GS      MSFT
Date                                         
2020-02-26 05:00:00+00:00 -0.008410  0.000000
2020-02-27 05:00:00+00:00 -0.046761  0.000000
2020-02-28 05:00:00+00:00 -0.017951  0.024213
2020-03-02 05:00:00+00:00  0.043333  0.066539
2020-03-03 05:00:00+00:00 -0.028835 -0.047919
...                             ...       ...
2023-11-02 04:00:00+00:00  0.021487  0.000000
2023-11-03 04:00:00+00:00  0.044174  0.000000
2023-11-06 05:00:00+00:00 -0.011324  0.000000
2023-11-07 05:00:00+00:00  0.000124  0.000000
2023-11-08 05:00:00+00:00  0.001883  0.000000

[491 rows x 2 columns]

# returns
rr[rr["GS"]!=0]

symbol                               GS      MSFT
Date                                             
2020-02-25 05:00:00+00:00  1.421085e-16  0.000000
2020-02-26 05:00:00+00:00 -8.409513e-03  0.000000
2020-02-27 05:00:00+00:00 -4.676063e-02  0.000000
2020-02-28 05:00:00+00:00 -1.795138e-02  0.024213
2020-03-02 05:00:00+00:00  4.333314e-02  0.066539
...                                 ...       ...
2023-11-02 04:00:00+00:00  2.148719e-02  0.000000
2023-11-03 04:00:00+00:00  4.417384e-02  0.000000
2023-11-06 05:00:00+00:00 -1.132407e-02  0.000000
2023-11-07 05:00:00+00:00  1.235175e-04  0.000000
2023-11-08 05:00:00+00:00  1.882961e-03  0.000000

[494 rows x 2 columns]

# daily_returns filtered for no-trade days and NaN days
dr[~dr.GS.isin([0,-999])]

symbol                               GS      MSFT
Date                                             
2020-02-25 00:00:00+00:00  2.220446e-16  0.000000
2020-02-26 00:00:00+00:00 -8.409513e-03  0.000000
2020-02-27 00:00:00+00:00 -4.676063e-02  0.000000
2020-02-28 00:00:00+00:00 -1.795138e-02  0.024213
2020-03-02 00:00:00+00:00  4.333314e-02  0.066539
...                                 ...       ...
2023-11-02 00:00:00+00:00  2.148719e-02  0.000000
2023-11-03 00:00:00+00:00  4.417384e-02  0.000000
2023-11-06 00:00:00+00:00 -1.132407e-02  0.000000
2023-11-07 00:00:00+00:00  1.235175e-04  0.000000
2023-11-08 00:00:00+00:00  1.882961e-03  0.000000

[494 rows x 2 columns]

# daily_returns without filter 
odr

symbol                      GS  MSFT
Date                                
2019-12-31 00:00:00+00:00  0.0   0.0
2020-01-01 00:00:00+00:00  NaN   NaN
2020-01-02 00:00:00+00:00  0.0   0.0
2020-01-03 00:00:00+00:00  0.0   0.0
2020-01-04 00:00:00+00:00  NaN   NaN
...                        ...   ...
2023-11-13 00:00:00+00:00  0.0   0.0
2023-11-14 00:00:00+00:00  0.0   0.0
2023-11-15 00:00:00+00:00  0.0   0.0
2023-11-16 00:00:00+00:00  0.0   0.0
2023-11-17 00:00:00+00:00  0.0   0.0

[1418 rows x 2 columns]
stucash commented 7 months ago

I'd very much appreciate it if you or any team member could get back to me on this, as the daily returns are a crucial piece that needs to line up for the backtest to be reliable.

I'm aiming to use vectorbt as my main backtesting engine, so it is vital for me to understand this well enough.

Thanks very much.