kernc / backtesting.py

:mag_right: :chart_with_upwards_trend: :snake: :moneybag: Backtest trading strategies in Python.
https://kernc.github.io/backtesting.py/
GNU Affero General Public License v3.0

Use pytest for unit testing #279

Closed: crazy25000 closed this issue 3 years ago

crazy25000 commented 3 years ago

Info

I recommend adopting pytest for unit testing instead of relying exclusively on the built-in unittest library. We would benefit from fixtures and parametrization (especially useful for covering all the possible combinations of parameters), it runs Python's built-in unittest tests out of the box (see the short sketch after the links below), and it should reduce boilerplate and the need to write custom test utilities.

Links

General examples: https://docs.pytest.org/en/stable/example/index.html
Parametrize examples: https://docs.pytest.org/en/stable/example/parametrize.html#paramexamples
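
To illustrate the "out of the box" point: pytest collects and runs unittest.TestCase classes as-is, so adopting it wouldn't require rewriting anything up front, e.g.:

import unittest

class ExistingStyleTestCase(unittest.TestCase):
    # pytest discovers and runs unittest.TestCase subclasses unchanged,
    # so the current suite keeps working while new tests use pytest idioms
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

# Either runner works on the same file, e.g.:
#   python -m unittest backtesting.test._test
#   pytest backtesting/test/_test.py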

Example Cases

Here are a few example unit tests I have for my custom backtester:

Example 1 - Function
import pandas as pd

def shift_hours(date_to_check: str, num_hours: int) -> pd.Timestamp:
    return pd.to_datetime(date_to_check) + pd.offsets.Hour() * num_hours
Example 1 - Unit tests
import pytest

@pytest.mark.parametrize(
    'date_to_check, num_hours, expected',
    [
        ('2020-01-15 00:00:00+00:00', 0, '2020-01-15 00:00:00+00:00'),
        ('2020-01-15 00:00:00+00:00', 1, '2020-01-15 01:00:00+00:00'),
        ('2020-01-15 00:00:00+00:00', 2, '2020-01-15 02:00:00+00:00'),
        ('2020-01-15 00:00:00+00:00', -2, '2020-01-14 22:00:00+00:00'),
        ('2020-01-15 00:00:00+00:00', -1, '2020-01-14 23:00:00+00:00'),
    ],
)
def test_shift_hours(date_to_check, num_hours, expected):
    assert str(shift_hours(date_to_check, num_hours)) == expected

Here's another example with multiple tests for a strategy, using parametrization for more resilient testing. It sped up testing because I didn't have to duplicate tests or write custom utilities.

Example 2 - Global fixtures available to every unit test if needed

# e.g. in a top-level conftest.py, so every test can request these fixtures
import pytest

ins_str = ['EUR_USD']
tmfr_str = ['M30']

@pytest.fixture()
def instrument():
    return ins_str

@pytest.fixture()
def timeframe():
    return tmfr_str

@pytest.fixture()
def conn_pool():
    return psycopg2_get_conn_pool()

Example 2 - Local fixture available to unit tests in specific folder

base_params = {
    'end_date': '2020-03-19',
    'instrument': ins_str,
    'original_start_date': '2020-03-10',
    'period1': 5,
    'period2': 19,
    'plot_trades': False,
    'save_trades': False,
    'start_date': '2020-03-09 15:00',
    'stop_loss': -0.0005,
    'take_profit': 0.0005,
    'timeframe': tmfr_str,
    'trailing_stop_pips': 0.0,  # assumed default; referenced by the trade-manager assertions below
}

@pytest.fixture()
def strategy_base_params(instrument, timeframe):
    # Module-level dict (usable directly in parametrize) exposed as a fixture,
    # with the instrument/timeframe fixtures injected
    return {**base_params, 'instrument': instrument, 'timeframe': timeframe}

Example 2 - Unit tests

@pytest.mark.parametrize(
    'params,expected',
    [
        (
            base_params,
            {'final_balance': 1080.45, 'wins': 8, 'losses': 8, 'sharpe_ratio': 5.03},
        ),
        (
            base_params | {'period1': 9, 'period2': 19},
            {'final_balance': 950.0, 'wins': 1, 'losses': 3, 'sharpe_ratio': -42.3},
        ),
        (
            base_params | {'stop_loss': -0.0020, 'take_profit': 0.0020},
            {'final_balance': 882.0, 'wins': 2, 'losses': 3, 'sharpe_ratio': -48.18},
        ),
    ],
)
def test_strategy_with_trades(params, expected, conn_pool):
    conn = conn_pool.getconn()
    candles = psycopg2_get_candles_from_db(conn, params, cols=', *')
    strategy = Strategy1Ema1Ema2TESTINGONLY(params)
    results = strategy.start_trading(external_price_stream=candles)

    conn_pool.putconn(conn)

    assert strategy.end_date == params['end_date']
    assert strategy.indicators['ema1'] == strategy.period1 == params['period1']
    assert strategy.indicators['ema2'] == strategy.period2 == params['period2']
    assert strategy.instrument == params['instrument']
    assert strategy.params['original_start_date'] == params['original_start_date']
    assert strategy.period_max_window == params['period2'] + 3
    assert strategy.timeframe == params['timeframe']

    assert strategy.trade_manager.stop_loss == params['stop_loss']
    assert strategy.trade_manager.take_profit == params['take_profit']
    assert strategy.trade_manager.trailing_stop_pips_threshold == params['trailing_stop_pips']
    assert strategy.trade_manager.use_trailing_stop == (params['trailing_stop_pips'] > 0.0)

    assert results['final_balance'] == expected['final_balance']
    assert results['losses'] == expected['losses']
    assert results['sharpe_ratio'] == expected['sharpe_ratio']
    assert results['wins'] == expected['wins']
kernc commented 3 years ago

I'm not too big on pytest.

Nor excessive parameterization. It's a question of rewriting:

assert something(1)
assert something(2)

into:

for i in (1,
          2):
    assert something(i)

and both have their uses. Besides, unittest has subtests, and we already use them: https://github.com/kernc/backtesting.py/blob/a49122c72b8e0dd9b30104957638640653c2c113/backtesting/test/_test.py#L555-L561
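
For reference, a subtest version of the shift_hours example above looks roughly like this (a sketch, not the linked test):

import unittest

class ShiftHoursTestCase(unittest.TestCase):
    def test_shift_hours(self):
        # shift_hours as defined in Example 1 above
        for num_hours, expected in [
            (0, '2020-01-15 00:00:00+00:00'),
            (1, '2020-01-15 01:00:00+00:00'),
            (-1, '2020-01-14 23:00:00+00:00'),
        ]:
            # Each subtest is reported separately on failure,
            # much like a parametrized pytest case
            with self.subTest(num_hours=num_hours):
                self.assertEqual(
                    str(shift_hours('2020-01-15 00:00:00+00:00', num_hours)),
                    expected)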

Have you found any particular deficiencies in the functional coverage of our existing tests, or what boilerplate and custom test utilities did you have in mind?

crazy25000 commented 3 years ago

I wouldn't say deficiencies, but we could improve them. For example, instead of testing the optimization feature with one case per function: https://github.com/kernc/backtesting.py/blob/73e1534428fc2db9e681030862133b5067508520/backtesting/test/_test.py#L505-L508

We could parametrize it so that all of these cases are covered by one test method, passing the args and expected values for each:

[screenshot: the optimization test cases referenced above]
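
Something along these lines (just a sketch: the SmaCross strategy is adapted from the project README, and the parameter ranges and final smoke assertion are placeholders for the actual cases and expected values):

import pytest
from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import SMA, GOOG

class SmaCross(Strategy):
    fast = 10
    slow = 30

    def init(self):
        self.sma1 = self.I(SMA, self.data.Close, self.fast)
        self.sma2 = self.I(SMA, self.data.Close, self.slow)

    def next(self):
        if crossover(self.sma1, self.sma2):
            self.buy()
        elif crossover(self.sma2, self.sma1):
            self.sell()

@pytest.mark.parametrize(
    'opt_kwargs',
    [
        dict(fast=range(2, 5, 2), slow=[2, 5, 7, 9]),
        dict(fast=range(2, 5, 2), slow=[2, 5, 7, 9], maximize='SQN'),
        dict(fast=range(2, 5, 2), slow=[2, 5, 7, 9],
             constraint=lambda p: p.fast < p.slow),
    ],
)
def test_optimize_variants(opt_kwargs):
    bt = Backtest(GOOG, SmaCross)
    stats = bt.optimize(**opt_kwargs)
    # Smoke check only; the real expected values would go here
    assert stats['# Trades'] >= 0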

Since you've already made up your mind about pytest, I'm going to close this.

kernc commented 3 years ago

I just don't think the small extra utility justifies switching over the existing tests, which work nicely.

Parametrization invites coupling, and coupled code is hard to modify. Particularly with the test cases you reference, I don't see how in the world you'd factor them onto a common denominator and still retain anywhere near the current clarity. :flushed:

Instead of filling the whole hyperspace of possibilities (and having the tests run for a good fifty minutes), I'm content with small, self-contained, functional tests. So far that hasn't proved a tragedy.

As an example of reasonable parametrization, have you seen the coroutine-based test class? It works quite well with strategies that do just a one-off or a particular sequence of actions, and the boilerplate is pretty thin. https://github.com/kernc/backtesting.py/blob/73e1534428fc2db9e681030862133b5067508520/backtesting/test/_test.py#L411-L502
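
Roughly, the idea is to wrap a test-specific generator into a throwaway Strategy whose next() just advances the generator. A rough sketch of that shape against the public API (not the actual helper from _test.py; names here are made up):

from backtesting import Backtest, Strategy
from backtesting.test import GOOG

def run_coroutine_strategy(coroutine_func, **kwargs):
    # Wrap a generator function in a throwaway Strategy; the generator gets
    # the strategy instance and yields once per bar, so a test can express
    # a one-off sequence of actions very compactly
    class _CoroutineStrategy(Strategy):
        def init(self):
            self._steps = coroutine_func(self)

        def next(self):
            try:
                next(self._steps)
            except StopIteration:
                pass

    return Backtest(GOOG, _CoroutineStrategy, **kwargs).run()

def test_buy_then_close():
    def coroutine(strategy):
        strategy.buy()
        yield  # wait a bar for the order to fill
        strategy.position.close()
        yield

    stats = run_coroutine_strategy(coroutine)
    assert stats['# Trades'] == 1  # one round-trip trade expected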