edtechre / pybroker

Algorithmic Trading in Python with Machine Learning
https://www.pybroker.com

How to use the backtest warmup parameter correctly? #48

Closed Pirat83 closed 1 year ago

Pirat83 commented 1 year ago

Hello,

I currently have some questions about the warmup parameter of the Strategy.backtest(...) method. My goal is to set the proper warmup period required to initialize all my indicators and to start trading on a specific date.

So let's calculate a ROC indicator with length 5 and start trading on the first of January.

    warmup: int = 5
    roc = pybroker.indicator('roc', lambda data: pandas_ta.roc(Series(data.close), length=warmup))

    start_date: datetime = datetime(2023, 1, 1, tzinfo=pytz.timezone('America/New_York'))
    end_date: datetime = datetime(2023, 2, 1, tzinfo=pytz.timezone('America/New_York'))

    strategy: Strategy = Strategy(
        Alpaca(os.getenv('ALPACA_KEY_ID'), os.getenv('ALPACA_SECRET')),
        start_date, end_date,
        StrategyConfig(initial_cash=10000, exit_on_last_bar=True)
    )
    strategy.set_before_exec(before_exec)
    strategy.add_execution(exec_fn, ['IYY', 'IWM', 'IVV'], indicators=[roc])

    result: TestResult = strategy.backtest(timeframe='1d', warmup=warmup)

So we need 5 periods to get the ROC(5) indicator calculated. This takes until 2023-01-10, and the first trade is opened on 2023-01-11:

date                 cash      equity    margin  market_value  pnl    unrealized_pnl  fees
2023-01-03 05:00:00  10000.00  10000.00  0.00    10000.00      0.00   0.0             0.0
2023-01-04 05:00:00  10000.00  10000.00  0.00    10000.00      0.00   0.0             0.0
2023-01-05 05:00:00  10000.00  10000.00  0.00    10000.00      0.00   0.0             0.0
2023-01-06 05:00:00  10000.00  10000.00  0.00    10000.00      0.00   0.0             0.0
2023-01-09 05:00:00  10000.00  10000.00  0.00    10000.00      0.00   0.0             0.0
2023-01-10 05:00:00  10000.00  10000.00  0.00    10000.00      0.00   0.0             0.0
2023-01-11 05:00:00     30.70  10082.50  0.00    10082.50      82.50  0.0             0.0
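
For reference, the reason a ROC of length 5 has no value on the first five bars can be sketched in plain Python. This is only an illustration of the usual rate-of-change definition (percentage change against the close `length` bars earlier), not the pandas_ta implementation; the `roc` helper and the sample closes are made up for the example.

```python
from typing import List, Optional


def roc(closes: List[float], length: int) -> List[Optional[float]]:
    """Rate of change in percent; None where there is not enough history."""
    out: List[Optional[float]] = []
    for i, close in enumerate(closes):
        if i < length:
            # The bar `length` positions back does not exist yet.
            out.append(None)
        else:
            prev = closes[i - length]
            out.append((close - prev) / prev * 100.0)
    return out


# Made-up closing prices, one per daily bar.
closes = [100.0, 101.0, 102.0, 103.0, 104.0, 110.0, 111.0]
values = roc(closes, length=5)

# Index of the first bar that has a usable indicator value.
first_valid = next(i for i, v in enumerate(values) if v is not None)
print(values)
print("first bar with a value:", first_valid)
```

With length 5, the first usable value sits on the sixth bar (index 5), which matches the idea of skipping 5 warmup bars before the strategy can act on the indicator.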

My original intention was to use 'Strategy.backtest(start_date, end_date, ...)' to start trading on 2023-01-01 and to use Python's 'timedelta(days=warmup)' to shift the data fetching so that the warmup period is subtracted from the start_date. This way we could start trading exactly on 2023-01-01 (when the stock exchanges were closed, so in reality on 2023-01-03).

    warmup: int = 5
    roc = pybroker.indicator('roc', lambda data: pandas_ta.roc(Series(data.close), length=warmup))

    start_date: datetime = datetime(2023, 1, 1, tzinfo=pytz.timezone('America/New_York'))
    end_date: datetime = datetime(2023, 2, 1, tzinfo=pytz.timezone('America/New_York'))

    strategy: Strategy = Strategy(
        Alpaca(os.getenv('ALPACA_KEY_ID'), os.getenv('ALPACA_SECRET')),
        start_date - timedelta(days=warmup), end_date,
        StrategyConfig(initial_cash=10000, exit_on_last_bar=True)
    )
    strategy.set_before_exec(before_exec)
    strategy.add_execution(exec_fn, ['IYY', 'IWM', 'IVV'], indicators=[roc])

    result: TestResult = strategy.backtest(start_date, end_date, timeframe='1d', warmup=warmup)

But doing so, this exception is thrown:

Traceback (most recent call last):
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py", line 582, in _validate_comparison_value
    self._check_compatible_with(other)
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py", line 461, in _check_compatible_with
    self._assert_tzawareness_compat(other)
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py", line 694, in _assert_tzawareness_compat
    raise TypeError(
TypeError: Cannot compare tz-naive and tz-aware datetime-like objects.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py", line 1054, in _cmp_method
    other = self._validate_comparison_value(other)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py", line 585, in _validate_comparison_value
    raise InvalidComparison(other) from err
pandas.core.arrays.datetimelike.InvalidComparison: 2023-01-01 00:00:00-04:56

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pirat/PycharmProjects/pybroker-experiments/main.py", line 120, in <module>
    main()
  File "/home/pirat/PycharmProjects/pybroker-experiments/main.py", line 110, in main
    result: TestResult = strategy.backtest(start_date, end_date, timeframe='1d', warmup=warmup)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pybroker/strategy.py", line 1060, in backtest
    return self.walkforward(
           ^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pybroker/strategy.py", line 1179, in walkforward
    df = self._filter_dates(
         ^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pybroker/strategy.py", line 1340, in _filter_dates
    df = _between(df, start_date, end_date).reset_index(drop=True)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pybroker/strategy.py", line 79, in _between
    (df[DataCol.DATE.value].dt.tz_localize(None) >= start_date)
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/ops/common.py", line 72, in new_method
    return method(self, other)
           ^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arraylike.py", line 62, in __ge__
    return self._cmp_method(other, operator.ge)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/series.py", line 6243, in _cmp_method
    res_values = ops.comparison_op(lvalues, rvalues, op)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/ops/array_ops.py", line 273, in comparison_op
    res_values = op(lvalues, rvalues)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/ops/common.py", line 72, in new_method
    return method(self, other)
           ^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arraylike.py", line 62, in __ge__
    return self._cmp_method(other, operator.ge)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py", line 1056, in _cmp_method
    return invalid_comparison(self, other, op)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.conda/envs/pybroker-experiments/lib/python3.11/site-packages/pandas/core/ops/invalid.py", line 36, in invalid_comparison
    raise TypeError(f"Invalid comparison between dtype={left.dtype} and {typ}")
TypeError: Invalid comparison between dtype=datetime64[ns] and datetime
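
A note on reading this traceback: the last pybroker frame (`_between`) strips the timezone from the date column with `tz_localize(None)` and then compares it with `start_date`, which is timezone-aware here. Ordered comparisons between naive and aware datetimes are rejected by both pandas and the standard library. The sketch below reproduces the same class of error with the standard library only. That passing naive datetimes to `backtest` avoids the error is an assumption based on the traceback and is not confirmed in this thread.

```python
from datetime import datetime, timezone

# A naive value, like the date column after tz_localize(None).
naive = datetime(2023, 1, 3)
# An aware value, like the start_date passed to backtest() above
# (UTC is used here only to keep the example dependency-free).
aware = datetime(2023, 1, 1, tzinfo=timezone.utc)

try:
    naive >= aware
    mixed_ok = True
except TypeError:
    # Python refuses to order naive and aware datetimes.
    mixed_ok = False

# Dropping tzinfo makes both sides naive, so the comparison works.
aware_stripped = aware.replace(tzinfo=None)
same_kind_ok = naive >= aware_stripped

print("mixed comparison allowed:", mixed_ok)
print("naive vs naive result:", same_kind_ok)
```

Separately, the `-04:56` offset printed in the exception is what pytz produces when one of its timezones is passed directly as `tzinfo=` to the `datetime` constructor; the pytz documentation recommends `localize()` instead.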

So I came up with another idea: move the backtest start_date into the future and remove the warmup parameter. But this resulted in the same exception:

    result: TestResult = strategy.backtest(start_date + timedelta(days=5), end_date, timeframe='1d')

So I came to the conclusion that the Strategy start_date and end_date should not differ from those passed to Strategy.backtest(start_date, end_date), and therefore I don't understand why this redundancy is kept.

Ideally, I would like to simply fetch, e.g., 1 month more data than required before the start_date from Alpaca and then start backtesting at exactly the start_date (or the next candle). Is it possible to achieve this?

Pirat83 commented 1 year ago

Example is here: https://github.com/Pirat83/pybroker-experiments/blob/master/main.py

edtechre commented 1 year ago

The start_date and end_date passed to the Strategy constructor specify the date range to download from a DataSource. The start_date and end_date passed to Strategy#backtest can be a subset of that date range. The warmup parameter specifies the number of bars to skip after the start_date before running the Strategy.
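
The three settings described above can be modeled in a few lines of plain Python. This is a simplified sketch of the stated semantics, not pybroker's code: `weekday_bars` and `tradable_bars` are hypothetical helpers, and the "bars" are simply weekdays, so exchange holidays are ignored.

```python
from datetime import date, timedelta
from typing import List


def weekday_bars(start: date, end: date) -> List[date]:
    """Hypothetical stand-in for downloaded daily bars: weekdays in [start, end]."""
    days = []
    d = start
    while d <= end:
        if d.weekday() < 5:  # Monday=0 .. Friday=4
            days.append(d)
        d += timedelta(days=1)
    return days


def tradable_bars(bars: List[date], start: date, end: date, warmup: int) -> List[date]:
    """Restrict to the backtest window, then skip the first `warmup` bars."""
    window = [d for d in bars if start <= d <= end]
    return window[warmup:]


warmup = 5
# Constructor range: download a month more than needed.
downloaded = weekday_bars(date(2022, 12, 1), date(2023, 2, 1))

# Backtest window starting at the target date: warmup pushes trading later.
target = date(2023, 1, 1)
with_warmup = tradable_bars(downloaded, target, date(2023, 2, 1), warmup)

# To trade from the target date, start the window `warmup` bars earlier.
first_idx = next(i for i, d in enumerate(downloaded) if d >= target)
shifted_start = downloaded[first_idx - warmup]
from_target = tradable_bars(downloaded, shifted_start, date(2023, 2, 1), warmup)

print("first tradable bar, window starts at target:", with_warmup[0])
print("shifted window start:", shifted_start)
print("first tradable bar, shifted window:", from_target[0])
```

Note that the shift is counted in bars, not calendar days: `timedelta(days=5)` spans a weekend and so covers fewer than 5 daily bars.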

Pirat83 commented 1 year ago

Okay, so it should work as desired. I will give it a try. Thank you very much.