abbass2 / pyqstrat

A fast, extensible, transparent python library for backtesting quantitative strategies.
BSD 3-Clause "New" or "Revised" License

Does a backtest require a Signal? #23

Closed BlackArbsCEO closed 2 years ago

BlackArbsCEO commented 2 years ago

Greetings,

The reason I ask this is because my strategy uses a machine learning model to make a prediction. I'm using an entry rule for this because it is path dependent. My understanding is that the entry rule would be called on each timestep and then the model makes a prediction. Based on the prediction, a trade is entered.

However, all the notebook examples demonstrate the use of a signal, which seems to be a vectorized array covering all the timesteps. I want to confirm whether I can leave the signal component as None or whether I need to aggregate the model predictions into an array outside of pyqstrat. Thanks.

abbass2 commented 2 years ago

Hi Brian,

pyqstrat needs a set of timestamps to run the strategy. The signal is an optimization so pyqstrat calls your trading rule function only on those timestamps when the signal is True. If you want to run it for every timestamp, just create a signal that returns a numpy array containing True for all timestamps.

Best,

Sal
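
As a minimal sketch of the suggestion above: a signal function that marks every timestamp as True. The function signature is copied from the signal functions posted later in this thread; the name always_true_signal and the stand-in timestamps are made up for illustration.

```python
import numpy as np


def always_true_signal(
    contract_group, timestamps, indicators, parent_signals, strategy_context
):
    """Return True for every timestamp so the rule is evaluated on each one."""
    return np.ones(len(timestamps), dtype=bool)


# Stand-in minute timestamps, for illustration only
timestamps = np.arange(
    "2022-01-10T09:30", "2022-01-10T09:35", dtype="datetime64[m]"
)
signal = always_true_signal(None, timestamps, None, None, None)
print(signal)
```

A function like this would presumably be registered with strategy.add_signal and referenced by name from strategy.add_rule, in the same way as the code further down, with True included in sig_true_values.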

BlackArbsCEO commented 2 years ago

Hi @abbass2. I adjusted it, but I noticed my trading rule is being called on every timestamp. I'm not sure why.

Below is my signal rule.

def ml_timestamp_signal(
    contract_group, timestamps, indicators, parent_signals, strategy_context
):
    """
    create an array that matches the timestamps to run the strategy with a true designation

    NOTE: lookback is in days but the data frequency is minutes
    """
    # every 30 minutes after lookback period
    lookback = strategy_context.lookback_period
    first_timestamp = strategy_context.first_timestamp

    signal = pd.Series(timestamps).to_frame().rename(columns={0: "timestamps"})
    hours = pd.Index([6, 7, 8, 12])
    signal["signal"] = np.where(
        (signal.timestamps.dt.hour.isin(hours)) & (signal.timestamps.dt.minute == 30),
        1,
        0,
    )
    signal.loc[signal.timestamps < first_timestamp, "signal"] = 0
    return signal.signal.values

here is my build strategy function

def ml_build_strategy(contract_group, strategy_context):
    """
    custom build strategy
    """
    strategy = pq.Strategy(
        timestamps,
        [contract_group],
        get_price,
        trade_lag=1,
        strategy_context=strategy_context,
    )

    strategy.add_indicator("o", feat_df.o.values)
    strategy.add_indicator("c", feat_df.c.values)
    strategy.add_indicator("h", feat_df.h.values)
    strategy.add_indicator("l", feat_df.l.values)

    strategy.add_signal(
        "ml_timestamp_signal",
        ml_timestamp_signal,
        depends_on_indicators=None,
    )

    # ask pqstrat to call our trading rule when the signal has one of the values [-2, -1, 1, 2]
    strategy.add_rule(
        "ml_trading_rule",
        model_predict_trading_rule,
        signal_name="ml_timestamp_signal",
        sig_true_values=[0, 1],
    )

    strategy.add_market_sim(ml_market_simulator)

    return strategy

Inside the trading rule I put a print(timestamp) while debugging, and that's when I saw that it was iterating over every timestamp.

Here is the link to my timestamps and dataset on Box: https://app.box.com/s/llc4ueq83wix74blvim9vxwi7ku46lc7

abbass2 commented 2 years ago

I would print the output of your signal function and see if it looks reasonable. If it does, could you create a simple standalone example that replicates this?

Best,

Sal
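
One way to do the check suggested above, sketched outside pyqstrat: rebuild the signal on a small set of synthetic minute timestamps and count how often each value occurs. The timestamps are stand-ins, not the poster's data, and the first_timestamp cutoff from the original function is left out for brevity.

```python
import numpy as np
import pandas as pd

# Synthetic minute bars for part of one day; a stand-in for the real data
timestamps = pd.date_range(
    "2022-01-10 06:00", "2022-01-10 12:59", freq="min"
).values

ts = pd.Series(timestamps)
hours = [6, 7, 8, 12]
# Same condition as in ml_timestamp_signal: minute 30 of the selected hours
signal = np.where(ts.dt.hour.isin(hours) & (ts.dt.minute == 30), 1, 0)

# Count how many timestamps carry each signal value
values, counts = np.unique(signal, return_counts=True)
summary = dict(zip(values.tolist(), counts.tolist()))
print(summary)

# List the timestamps where the signal is set
print(ts[signal == 1].tolist())
```

If the counts look reasonable (only a handful of 1s), the signal itself is probably fine and the problem is more likely in how the rule is wired to it.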

BlackArbsCEO commented 2 years ago

sure.

def example_signal(
    contract_group, timestamps, indicators, parent_signals, strategy_context
):
    """
    create an array that matches the timestamps to run the strategy with a true designation
    """
    # every 30 minutes after lookback period
    lookback = strategy_context.lookback_period
    first_timestamp = strategy_context.first_timestamp

    signal = pd.Series(timestamps).to_frame().rename(columns={0: "timestamps"})
    hours = pd.Index([6, 7, 12])
    signal["signal"] = np.where(
        (signal.timestamps.dt.hour.isin(hours)) & (signal.timestamps.dt.minute == 30),
        1,
        0,
    )
    signal.loc[signal.timestamps < first_timestamp, "signal"] = 0
    return signal.signal.values

def example_trading_rule(
    contract_group, i, timestamps, indicators, signal, account, strategy_context
):
    timestamp = timestamps[i]
    curr_pos = account.position(contract_group, timestamp)
    signal_value = signal[i]
    risk_percent = 0.1
    # here have to use indicators to grab feature dataframe
    close_price = indicators.c[i]

    contract = contract_group.get_contract(symbol)
    if contract is None:
        contract = pq.Contract.create(symbol=symbol, contract_group=contract_group)

    # check to see if model needs to be fit/refit

    feat_df = strategy_context.feature_dataframe

    lookback = strategy_context.lookback_period
    first_timestamp = strategy_context.first_timestamp

    prediction = feat_df.target.iloc[i]
    print(f"{timestamp}:: inside trading rule")

    # fit once per period after initialization
    if timestamp in strategy_context.refit_dates:
        # dummy refit
        print(f"{timestamp}:: refitting")
        prediction = feat_df.target.iloc[i]

    else:
        prediction = feat_df.target.iloc[i]

    order = None
    # if we don't already have a position, check if we should enter a trade

    if math.isclose(curr_pos, 0):
        if signal_value == 1 and prediction == 1:
            curr_equity = account.equity(timestamp)
            order_qty = np.round(curr_equity * risk_percent / close_price)  # long only
            trigger_price = close_price
            reason_code = (
                pq.ReasonCode.ENTER_LONG if order_qty > 0 else pq.ReasonCode.ENTER_SHORT
            )
            order = pq.StopLimitOrder(
                contract,
                timestamp,
                order_qty,
                trigger_price,
                reason_code=reason_code,
            )
            # holding period
            future_exit_date = pd.to_datetime(timestamp) + pd.Timedelta(
                strategy_context.holding_period
            )
            print("order timestamp: ", pd.to_datetime(timestamp))
            print("future exit date: ", future_exit_date)
            strategy_context.symbols_data[symbol].future_exit_date = future_exit_date

    elif curr_pos > 0:  # We have a current position, so check if we should exit
        result = strategy_context.symbols_data[symbol].is_time_to_liquidate(timestamp)
        if result:
            print(
                f"is it time to liquidate: {pd.to_datetime(timestamp)} | {result}",
            )
            order_qty = -curr_pos
            reason_code = (
                pq.ReasonCode.EXIT_LONG if order_qty < 0 else pq.ReasonCode.EXIT_SHORT
            )
            order = pq.MarketOrder(
                contract, timestamp, order_qty, reason_code=reason_code
            )
            strategy_context.symbols_data[symbol].future_exit_date = None

    if order is not None:
        if order_qty > 0:
            print(f'Enter: {timestamp.astype("M8[m]")} {order}')
        else:
            print(f'Exit: {timestamp.astype("M8[m]")} {order}')

    return [order] if order is not None else []

#########################
# simulator

def example_market_simulator(
    orders, i, timestamps, indicators, signals, strategy_context
):
    """
    custom market simulator
    """
    trades = []
    timestamp = timestamps[i]

    for order in orders:
        cgroup = order.contract.contract_group
        ind = indicators[cgroup]
        o, h, l, c = ind.o[i], ind.h[i], ind.l[i], ind.c[i]

        trade_price = np.nan

        if isinstance(order, pq.MarketOrder):
            trade_price = 0.5 * (o + h) if order.qty > 0 else 0.5 * (o + l)
        elif isinstance(order, pq.StopLimitOrder):
            if (order.qty > 0 and h > order.trigger_price) or (
                order.qty < 0 and l < order.trigger_price
            ):  # A stop order
                trade_price = (
                    0.5 * (order.trigger_price + h)
                    if order.qty > 0
                    else 0.5 * (order.trigger_price + l)
                )
        else:
            raise Exception(f"unexpected order type: {order}")

        if np.isnan(trade_price):
            continue

        trade = pq.Trade(
            order.contract,
            order,
            timestamp,
            order.qty,
            trade_price,
            commission=order.qty * 0.05,
            fee=1,
        )
        print(f'Trade: {timestamp.astype("M8[m]")} {trade}')
        order.status = "filled"

        trades.append(trade)

    return trades

# build strategy and helper funcs 

class SymbolData:
    def __init__(self, symbol):
        self.symbol = symbol
        self.future_exit_date = None
        self.model = None

    def is_time_to_liquidate(self, timestamp):
        if pd.to_datetime(timestamp) >= self.future_exit_date:
            return True
        return False

def get_price(contract, timestamps, i, strategy_context):
    return feat_df.c[i]

def example_build_strategy(contract_group, strategy_context):
    """
    custom build strategy
    """
    strategy = pq.Strategy(
        timestamps,
        [contract_group],
        get_price,
        trade_lag=1,
        strategy_context=strategy_context,
    )

    strategy.add_indicator("o", feat_df.o.values)
    strategy.add_indicator("c", feat_df.c.values)
    strategy.add_indicator("h", feat_df.h.values)
    strategy.add_indicator("l", feat_df.l.values)

    strategy.add_signal(
        "example_signal",
        example_signal,
        depends_on_indicators=None,
    )

    # ask pqstrat to call our trading rule when the signal has one of the values [-2, -1, 1, 2]
    strategy.add_rule(
        "example_trading_rule",
        example_trading_rule,
        signal_name="example_signal",
        sig_true_values=[0, 1],
    )

    strategy.add_market_sim(example_market_simulator)

    return strategy

############    
# run strategy 

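import math  # used by math.isclose in example_trading_rule
from types import SimpleNamespace  # used for strategy_context below

import numpy as np
import pandas as pd
import pyqstrat as pq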

feat_df = (
    pd.read_csv("data_to_share.csv")
    .iloc[:, 1:]
    .assign(date=lambda df: pd.to_datetime(df.date))
)
feat_df.info(), feat_df.head(2)

# Clear global state so we can rerun without restarting python
pq.ContractGroup.clear()
pq.Contract.clear()

# parameters
cutoff = pd.Timestamp("2020")
feat_df = feat_df.copy().query("date.dt.date < @cutoff")
timestamps = feat_df["date"].values
lookback_period = 30
holding_period = "2 hours"
refit_dates = pd.date_range(feat_df.date.min(), feat_df.date.max(), freq="M")
# get_first_available_timestamp is my own helper (not shown here)
first_timestamp, _ = get_first_available_timestamp(timestamps, lookback_period)

# context
strategy_context = SimpleNamespace(
    lookback_period=lookback_period,
    symbols_data={},
    feature_dataframe=feat_df,
    holding_period=holding_period,
)
strategy_context.refit_dates = refit_dates
strategy_context.first_timestamp = first_timestamp

symbol = "SPY"
contract_group = pq.ContractGroup.create(symbol)
contract = contract_group.get_contract(symbol)

symbol_class = SymbolData(symbol)
strategy_context.symbols_data[symbol] = symbol_class

strategy = example_build_strategy(contract_group, strategy_context)
strategy.run()

abbass2 commented 2 years ago

Change the parameter sig_true_values in this line to sig_true_values=[1]. As written, you are telling the strategy that both 0 and 1 are true values for the signal, so the rule is called on every timestamp.

strategy.add_rule(
    "example_trading_rule",
    example_trading_rule,
    signal_name="example_signal",
    sig_true_values=[0, 1],
)
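
To make the effect concrete, here is a small sketch of the behavior described above, assuming the rule is called wherever the signal value is a member of sig_true_values. The helper rule_indices is hypothetical and is not pyqstrat's actual implementation.

```python
import numpy as np

# A toy signal: 1 at the timestamps of interest, 0 everywhere else
signal = np.array([0, 0, 1, 0, 1, 0])


def rule_indices(signal, sig_true_values):
    """Indices where a rule would be called, assuming membership-based filtering."""
    return np.flatnonzero(np.isin(signal, sig_true_values))


# With [0, 1] every value counts as true, so every index is selected
called_with_both = rule_indices(signal, [0, 1])
# With [1] only the indices where the signal is 1 are selected
called_with_one = rule_indices(signal, [1])

print(called_with_both)
print(called_with_one)
```

Under that assumption, [0, 1] selects all six indices while [1] selects only the two where the signal is set, which matches the symptom reported in this issue.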

BlackArbsCEO commented 2 years ago

thanks.