dkharrat commented 7 months ago

Bug Report

I have a strategy that subscribes to 5-min bars, and I feed both 1-min bar and 5-min bar data. Providing the more granular 1-min bar data results in more accurate fills. However, I'm noticing that Nautilus is sometimes filling the orders incorrectly.

Here's a screenshot of how Nautilus executed the trade:

Sell order was submitted at 9:45am with a limit of 18530 and TP of 18505 and SL of 18545.
At 9:53am, order was correctly filled at 18530 (notice that Nautilus used the 1-min bar data for entry)
At 9:55am, TP order incorrectly executed at 18505 (which is impossible based on the 1-min bar data)

Expected Behavior

Based on 1-min bar data, TP order at 18505 should have never executed (see chart above). Instead, the SL should be executed.

Actual Behavior

The TP order is executed at 18505.

Steps to Reproduce the Problem

Here's a MRE that reproduces the issue:

Bar data

nq_1min.csv nq_5min.csv

`debug_backtest.py`

import pytz
from datetime import date

import pandas as pd

from nautilus_trader.backtest.engine import BacktestEngine
from nautilus_trader.backtest.engine import BacktestEngineConfig
from nautilus_trader.backtest.models import FillModel
from nautilus_trader.model.objects import Price, Quantity
from nautilus_trader.config import LoggingConfig
from nautilus_trader.config import RiskEngineConfig
from nautilus_trader.model.currencies import USD
from nautilus_trader.model.data import BarType
from nautilus_trader.model.enums import AccountType
from nautilus_trader.model.enums import OmsType
from nautilus_trader.model.identifiers import TraderId
from nautilus_trader.model.identifiers import Venue
from nautilus_trader.model.objects import Money
from nautilus_trader.persistence.wranglers import BarDataWrangler
from nautilus_trader.model.instruments import FuturesContract
from nautilus_trader.model.identifiers import InstrumentId
from nautilus_trader.model.identifiers import Symbol
from nautilus_trader.model.enums import AssetClass

from debug_strategy import DebugStrategy, DebugStrategyConfig, ts_to_datetime

def nq_futures() -> FuturesContract:
    return FuturesContract(
        instrument_id=InstrumentId(symbol=Symbol("NQ00"), venue=Venue("CME")),
        raw_symbol=Symbol("NQ00"),
        underlying="NQ",
        asset_class=AssetClass.INDEX,
        currency=USD,
        price_precision=2,
        price_increment=Price.from_str("0.25"),
        multiplier=Quantity.from_int(20),
        lot_size=Quantity.from_int(1),
        activation_ns=pd.Timestamp(date(2023, 12, 30), tz=pytz.utc).value,
        expiration_ns=pd.Timestamp(date(2024, 12, 30), tz=pytz.utc).value,
        ts_event=0,
        ts_init=0,
    )

def read_bars(filepath: str) -> pd.DataFrame:
    df = pd.read_csv(filepath)
    df["Timestamp"] = pd.to_datetime(df["Timestamp"], utc=True, format="ISO8601")
    df = df.set_index("Timestamp")
    df.index = df.index.tz_convert("America/New_York")
    return df

def get_positions(engine: BacktestEngine) -> pd.DataFrame:
    positions = []
    for p in engine.kernel.cache.positions():
        positions.append(
            {
                "Entry Side": p.entry.name,
                "Open Time": ts_to_datetime(p.ts_opened),
                "Close Time": ts_to_datetime(p.ts_closed),
                "Open Price": p.avg_px_open,
                "Close Price": p.avg_px_close,
                "PnL": p.realized_pnl,
            }
        )

    return pd.DataFrame(positions)

if __name__ == "__main__":
    config = BacktestEngineConfig(
        trader_id=TraderId("BACKTESTER-001"),
        logging=LoggingConfig(log_level="ERROR"),
        risk_engine=RiskEngineConfig(
            bypass=True,
        ),
    )

    engine = BacktestEngine(config=config)

    fill_model = FillModel(
        prob_fill_on_limit=1,
        prob_fill_on_stop=1,
        prob_slippage=0,
        random_seed=42,
    )

    CME = Venue("CME")
    engine.add_venue(
        venue=CME,
        oms_type=OmsType.NETTING,
        account_type=AccountType.MARGIN,
        base_currency=USD,
        starting_balances=[Money(100_000, USD)],
        fill_model=fill_model,
        bar_execution=True,  # If bar data should move the market (True by default)
        reject_stop_orders=False,  # don't reject stop order if it's already in the market
    )

    NQ_FUTURES = nq_futures()
    engine.add_instrument(NQ_FUTURES)

    ONE_MIN_BAR_TYPE = BarType.from_str("NQ00.CME-1-MINUTE-LAST-EXTERNAL")
    FIVE_MIN_BAR_TYPE = BarType.from_str("NQ00.CME-5-MINUTE-LAST-EXTERNAL")

    one_min_bars = BarDataWrangler(
        bar_type=ONE_MIN_BAR_TYPE,
        instrument=NQ_FUTURES,
    ).process(data=read_bars("nq_1min.csv"))

    five_min_bars = BarDataWrangler(
        bar_type=FIVE_MIN_BAR_TYPE,
        instrument=NQ_FUTURES,
    ).process(data=read_bars("nq_5min.csv"))

    engine.add_data(one_min_bars)
    engine.add_data(five_min_bars)

    strategy = DebugStrategy(
        config=DebugStrategyConfig(
            instrument_id=NQ_FUTURES.id,
            bar_type=FIVE_MIN_BAR_TYPE,
        )
    )
    engine.add_strategy(strategy=strategy)

    engine.run()

    total_positions = len(
        engine.kernel.cache.positions() + engine.kernel.cache.position_snapshots()
    )
    print(f"Total positions:  {total_positions:>10}")

    pnl_stats = engine.portfolio.analyzer.get_performance_stats_pnls()
    print(f"    Pnl (total):  {pnl_stats['PnL (total)']:>10,.2f}")

    print("Positions:")
    print(get_positions(engine).to_markdown(index=False))

    engine.reset()
    engine.dispose()

`debug_strategy.py`

from datetime import timedelta
from datetime import datetime

from nautilus_trader.config import StrategyConfig
from nautilus_trader.model.data import Bar
from nautilus_trader.model.data import BarType
from nautilus_trader.model.enums import OrderSide
from nautilus_trader.model.enums import OrderType
from nautilus_trader.model.enums import TimeInForce
from nautilus_trader.model.identifiers import InstrumentId
from nautilus_trader.model.instruments import Instrument
from nautilus_trader.model.orders.list import OrderList
from nautilus_trader.trading.strategy import Strategy
import pytz

class DebugStrategyConfig(StrategyConfig, frozen=True):
    instrument_id: InstrumentId
    bar_type: BarType

class DebugStrategy(Strategy):
    def __init__(self, config: DebugStrategyConfig) -> None:
        super().__init__(config)

        self.instrument_id = config.instrument_id
        self.bar_type = config.bar_type

        self.instrument: Instrument | None = None  # Initialized in on_start

    def on_start(self) -> None:
        self.instrument = self.cache.instrument(self.instrument_id)
        if self.instrument is None:
            self.log.error(f"Could not find instrument for {self.instrument_id}")
            self.stop()
            return

        self.request_bars(self.bar_type)
        self.subscribe_bars(self.bar_type)

    def on_bar(self, bar: Bar) -> None:
        bar_time = ts_to_datetime(bar.ts_event)
        print(bar)

        if bar_time.hour == 9 and bar_time.minute == 45:
            self.sell(bar, entry_price=18530.00)

    def buy(self, last_bar: Bar, entry_price: float) -> None:
        order_list: OrderList = self.order_factory.bracket(
            instrument_id=self.instrument_id,
            order_side=OrderSide.BUY,
            quantity=self.instrument.make_qty(1),
            entry_order_type=OrderType.LIMIT,
            entry_price=self.instrument.make_price(entry_price),
            tp_price=self.instrument.make_price(entry_price + 25),
            sl_trigger_price=self.instrument.make_price(entry_price - 25),
            time_in_force=TimeInForce.GTD,
            expire_time=self.clock.utc_now() + timedelta(hours=8),
        )

        self.submit_order_list(order_list)

    def sell(self, last_bar: Bar, entry_price: float) -> None:
        order_list: OrderList = self.order_factory.bracket(
            instrument_id=self.instrument_id,
            order_side=OrderSide.SELL,
            quantity=self.instrument.make_qty(1),
            entry_order_type=OrderType.LIMIT,
            entry_price=self.instrument.make_price(entry_price),
            tp_price=self.instrument.make_price(entry_price - 25),
            sl_trigger_price=self.instrument.make_price(entry_price + 25),
            time_in_force=TimeInForce.GTD,
            expire_time=self.clock.utc_now() + timedelta(hours=8),
        )

        self.submit_order_list(order_list)

    def on_reset(self) -> None:
        pass

    def on_stop(self) -> None:
        self.cancel_all_orders(self.instrument_id)
        self.close_all_positions(self.instrument_id)

        self.unsubscribe_bars(self.bar_type)
        self.unsubscribe_quote_ticks(self.instrument_id)

def ts_to_datetime(ts: int) -> datetime:
    return datetime.fromtimestamp(ts / 1e9, tz=pytz.timezone("America/New_York"))

When running the above code, you should get this output:

Total positions:           1
    Pnl (total):      500.00

Positions:
| Entry Side   | Open Time                 | Close Time                |   Open Price |   Close Price |   PnL |
|:-------------|:--------------------------|:--------------------------|-------------:|--------------:|------:|
| SELL         | 2024-03-22 09:53:00-04:00 | 2024-03-22 09:55:00-04:00 |        18530 |         18505 |   500 |

Specifications

OS platform: Ubuntu
Python version: 3.11
nautilus_trader version: 1.90.0

cjdsellers commented 7 months ago

Hey @dkharrat

Thanks for the detail report and MRE. I looked briefly at the data and I see

5 minute bars:

2024-03-22 09:50:00-04:00,18508.5,18518.25,18500.0,18503.75,8664
2024-03-22 09:55:00-04:00,18504.0,18540.5,18504.0,18538.25,7980

1 minute bars:

2024-03-22 09:50:00-04:00,18509.0,18514.75,18503.25,18503.75,1351
2024-03-22 09:51:00-04:00,18504.0,18515.75,18504.0,18511.25,1320
2024-03-22 09:52:00-04:00,18511.5,18515.25,18507.5,18512.25,849
2024-03-22 09:53:00-04:00,18512.0,18531.0,18511.5,18528.75,2128
2024-03-22 09:54:00-04:00,18529.25,18537.5,18529.0,18532.75,2168
2024-03-22 09:55:00-04:00,18532.5,18540.5,18532.25,18538.25,1515

For 5 minute bars, at 09:55:00 the open and low of the interval were both 18504.0, so the TP BUY order executes at 18505.0.

Bar execution is quite simplistic so as each bar hits the matching engine, the OHLC will be converted to ticks and iterated - which is why the above happens.

At the same time this comes back to a previous conversation we've had on bar execution, how do we know we didn't open at 09:50:00 @ 18509.0, and then traded to a high of 18540.5, low of 18504.0, and close of 18538.25 at around 09:54:59.999999999?

With individual bar time frames we don't know, only with the additional granularity of the 1 minute bars we know this wasn't the case. But the logic of keeping track of this has not been added to the platform, which I wouldn't actually consider a bug as such - its working to spec, but just not supporting multiple time frame bar execution for the same instrument as accurately as you're expecting.

Do you require the 5 minute bars for signals, and the 1 minute bars you're using for finer grained execution? otherwise an immediate work around is to stick to a single bar time frame per instrument.

We could keep track of the finest granularity bar for each instrument and ignore the rest for execution, but are you aware of another event driven trading platform which accounts for this for bar execution? If so, it will be interesting to see what the implementation is.

cjdsellers commented 7 months ago

For multiple time frame bars, I'm aware of another trader who sets bar_execution to false, and also provides trade ticks for the backtest (with internal bar aggregation - but this would also work for external aggregation too).

You could also use the same method and pre-process the bar data to trade ticks.

https://github.com/nautechsystems/nautilus_trader/blob/develop/nautilus_trader/backtest/engine.pyx#L373

cjdsellers commented 7 months ago

Hey @dkharrat

I've added an implementation to keep track of the lowest time frame bar per instrument for execution, pushed to develop branch. There is an outstanding edge case that the very initial bars may still exhibit the behaviour you reported until all bar types are seen.

Let me know how this goes for you when you can.

dkharrat commented 7 months ago

At the same time this comes back to a previous conversation we've had on bar execution, how do we know we didn't open at 09:50:00 @ 18509.0, and then traded to a high of 18540.5, low of 18504.0, and close of 18538.25 at around 09:54:59.999999999?

That's why I'm providing the 1-min bar data. And the main reason why I'm also providing the 5-min bar data is because Nautilus doesn't seem to automatically aggregate. However, even if we only had 5-min bar data, I think Nautilus could use reasonable heuristics to determine if the fill makes sense. For example, for a large green bar, it's statistically more likely that the low occurred before the close/high (though, this is probably a separate topic for this particular issue).

Do you require the 5 minute bars for signals, and the 1 minute bars you're using for finer grained execution?

Yes, my strategy uses 5-min bar data to calculate indicators and signals. I suppose I could manually aggregate the 1-min bar into 5-mins within the strategy and manually update the indicators, but it would be cleaner and more ideal if Nautilus handled this natively.

I think one potential solution to this would be to feed Nautilus the highest-granularity bar data (e.g. 1-min bars) only and Nautilus would internally perform the higher-timeframe aggregation based on whatever timeframe the strategy subscribes to. This way, the strategy can use whatever timeframe it requires, while Nautilus can use the higher-granularity data for execution. I believe that's what most backtesting platforms do to address this issue.

For multiple time frame bars, I'm aware of another trader who sets bar_execution to false, and also provides trade ticks for the backtest (with internal bar aggregation - but this would also work for external aggregation too).

I'm not sure if I completely follow. Could you please provide some example or code pointers?

dkharrat commented 7 months ago

I've added an implementation to keep track of the lowest time frame bar per instrument for execution

Thanks for the quick fix! I'll give it a try and let you know how it goes. I believe the edge case you mentioned could be handled by making sure I add the 1-min bars before the 5-min bars, correct?

dkharrat commented 7 months ago

@cjdsellers I just tried the same test strategy I pasted above and it didn't seem to make a difference. I get the same output.

cjdsellers commented 7 months ago

I updated the logic, please give it another try when you get a chance.

cjdsellers commented 7 months ago

And the main reason why I'm also providing the 5-min bar data is because Nautilus doesn't seem to automatically aggregate

What behavior would you expect for this?

Nautilus does aggregate "any" bar type from trade or quote ticks, the DataEngine will arrange the necessary subscriptions - also see the aggregation module. It would be a further enhancement to aggregate higher time frame bars from lower time frame bars, could be done but not part of the spec currently (we've generally catered more for order book / top-of-book type data).

I'm not sure if I completely follow. Could you please provide some example or code pointers?

The general idea here is:

Use the wranglers or your own function to pre-process your bar data into either trade or quote ticks
Add these ticks to the backtest engine
Specify your subscription(s) bar type as INTERNAL aggregation, and LAST price type for trades

The benefit of this is that you can set bar_execution to False so that bar data points aren't involved with execution, and your strategy logic will also work between backtest and live trading too (where real-world, you would either subscribe for bars aggregated by the venue EXTERNAL, or per this example let Nautilus handle the aggregation INTERNAL).

[edit] Having said all of that, I'm hoping the recent additions to OrderMatchingEngine.process_bar will have things working as you were expecting anyway (for backtesting).

dkharrat commented 7 months ago

Use the wranglers or your own function to pre-process your bar data into either trade or quote ticks

That's an interesting idea. This is also a good way to customize how bars are interpreted and used for execution. A few questions:

To convert a bar to ticks, do I just generate 4 ticks for each bar (one for each of open, high, low, close)? What about side? And I assume for volume, I would divide it by 4?
To convert a bar to quotes, how do I choose the bid vs ask? Do I just set both bid/ask to the same value?
Which one do you recommend? Since both are just simulating ticks, is there an advantage of choosing to convert to ticks vs quotes?

The benefit of this is that you can set bar_execution to False so that bar data points aren't involved with execution, and your strategy logic will also work between backtest and live trading too (where real-world, you would either subscribe for bars aggregated by the venue EXTERNAL, or per this example let Nautilus handle the aggregation INTERNAL).

Nice! I was not aware of the EXTERNAL vs INTERNAL distinction and that Nautilus will automatically aggregate INTERNAL data. This seems like a reasonable approach.

dkharrat commented 7 months ago

Having said all of that, I'm hoping the recent additions to OrderMatchingEngine.process_bar will have things working as you were expecting anyway (for backtesting).

I just tried the latest commit in the develop branch using the debug_strategy.py above, but it still didn't work for me. I got the same output as before.

cjdsellers commented 7 months ago

This appears to be working for me on latest develop. Double checking you're recompiling when you pull the latest?

The logic is quite simple, note the additional log I've added temporarily and changed the log levels to warning just so that it was easier to debug:

        if execution_bar_type is None:
            execution_bar_type = bar_type
            self._execution_bar_types[instrument_id] = bar_type
            self._execution_bar_deltas[bar_type] = bar_type.spec.timedelta

        if execution_bar_type != bar_type:
            bar_type_timedelta = self._execution_bar_deltas.get(bar_type)
            if bar_type_timedelta is None:
                bar_type_timedelta = bar_type.spec.timedelta
                self._execution_bar_deltas[bar_type] = bar_type_timedelta
            if self._execution_bar_deltas[execution_bar_type] >= bar_type_timedelta:
                self._execution_bar_types[instrument_id] = bar_type
            else:
                self._log.warning(f"Not regarding {bar_type}")  # TODO!
                return

        if is_logging_initialized():
            self._log.warning(f"Processing {repr(bar)}")

You can see the 1-MINUTE bar is not used for execution, only 1-SECOND bars are processed:

2023-01-05T19:20:50.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.14,126.15,126.14,126.15,321,1672946449000000000)
2023-01-05T19:20:53.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.14,126.14,126.14,126.14,200,1672946452000000000)
2023-01-05T19:20:55.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.14,126.14,126.12,126.12,1148,1672946454000000000)
2023-01-05T19:20:56.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.12,126.12,126.12,126.12,100,1672946455000000000)
2023-01-05T19:20:58.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.13,126.13,126.13,126.13,500,1672946457000000000)
2023-01-05T19:21:00.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Not regarding AAPL.XNAS-1-MINUTE-LAST-EXTERNAL
2023-01-05T19:21:00.000000000Z [INFO] BACKTESTER-001.EMACrossLongOnly: Bar(AAPL.XNAS-1-MINUTE-LAST-EXTERNAL,126.19,126.21,126.12,126.13,12627,1672946400000000000)

I understand it would be better for you if I used your MRE but I happened to be working on this bars example so used this instead.

Also of note, the bar at 59 is missing because Databento bars are only "printed" when there is a trade within the interval.

cjdsellers commented 7 months ago

To convert a bar to ticks, do I just generate 4 ticks for each bar (one for each of open, high, low, close)? What about side? And I assume for volume, I would divide it by 4?

Yes, essentially that's what the wranglers and the matching engine is doing. It looks like aggressor side is just alternating between buyer and seller - there could probably be a more accurate heuristic created for that.

To convert a bar to quotes, how do I choose the bid vs ask? Do I just set both bid/ask to the same value?

Your bar data is probably based on trades, unless you specifically have bid and ask side bars (there are examples of combining bid and ask side bars into quote ticks in the repo - see the examples and wranglers).

Which one do you recommend? Since both are just simulating ticks, is there an advantage of choosing to convert to ticks vs quotes?

We refer to both quotes and trades as "tick" data i.e. QuoteTick and TradeTick, although this terminology can be a little loose as to what a tick actually is. I'd just stick with trades for this use case.

nautechsystems / nautilus_trader

Incorrect order fills with bar data #1580

Bug Report

Expected Behavior

Actual Behavior

Steps to Reproduce the Problem

Bar data

`debug_backtest.py`

`debug_strategy.py`

Specifications