nautechsystems / nautilus_trader

A high-performance algorithmic trading platform and event-driven backtester
https://nautilustrader.io
GNU Lesser General Public License v3.0

Add indicator values to streamed feather files #1356

Open benjaminsingleton opened 9 months ago

benjaminsingleton commented 9 months ago

Feature Request

After running a backtest, I'd like to be able to review and inspect all changes to indicator values. For example, I'd like to plot the values from moving average indicators alongside prices and trades to get a holistic view of the strategy. Using the default StreamingConfig settings and implementing a basic crossover strategy, like the EMA strategy in the project examples, the streamed feather files don't appear to include indicator values.

Separately, I noticed that the .feather files saved to the backtest catalog directory cannot be read using pd.read_feather or feather.read_feather, as I would have expected. I get the following error: ArrowInvalid: Not a Feather V1 or Arrow IPC file.

To read one of these feather files into a pandas DataFrame, I had to do the following:

import fsspec
import pyarrow as pa

fs = fsspec.filesystem("file")

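# pa.ipc.open_stream reads the Arrow IPC *stream* format, which is what worked for these files
with fs.open("catalog/backtest/8eb8c808-2daa-4c06-bc24-4acfea038e55/position_opened.feather") as f:
    reader = pa.ipc.open_stream(f)
    # Read every record batch in the stream into a single DataFrame
    df = reader.read_pandas()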

Is that expected? If so, I think it would be helpful to either convert the feather files so that they can be read in a more conventional way or create a nautilus_trader.persistence.read_feather convenience function.
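For illustration, here is a rough sketch of what such a convenience reader might look like. It assumes the streamed files are written in the Arrow IPC stream format, which would explain the error: the Arrow IPC file format (Feather V2) begins with the magic bytes ARROW1 and Feather V1 begins with FEA1, while a stream has neither. The names classify_arrow_header and read_streamed_feather are hypothetical and are not part of nautilus_trader.

```python
from pathlib import Path

ARROW_FILE_MAGIC = b"ARROW1"  # Arrow IPC *file* format (Feather V2)
FEATHER_V1_MAGIC = b"FEA1"    # legacy Feather V1


def classify_arrow_header(head: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if head.startswith(ARROW_FILE_MAGIC):
        return "ipc_file"
    if head.startswith(FEATHER_V1_MAGIC):
        return "feather_v1"
    # No recognised magic: assume an IPC stream and let the reader validate it
    return "stream_or_unknown"


def read_streamed_feather(path):
    """Read a .feather file into a DataFrame, whichever container it uses.

    Hypothetical convenience function; requires pyarrow and pandas.
    """
    import pandas as pd  # third-party, imported lazily
    import pyarrow as pa  # third-party, imported lazily

    path = Path(path)
    with path.open("rb") as f:
        kind = classify_arrow_header(f.read(8))

    if kind == "stream_or_unknown":
        # Same approach as the snippet above: open as an IPC stream
        with path.open("rb") as f:
            return pa.ipc.open_stream(f).read_pandas()

    # Files with a recognised magic can go through the conventional reader
    return pd.read_feather(path)
```

With something like this, read_streamed_feather("catalog/backtest/<run id>/position_opened.feather") would return the same DataFrame as the fsspec snippet, assuming the file really is an IPC stream.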

cniqvtcfgt commented 8 months ago

@cjdsellers I'm also interested in this enhancement.

If I were to create a custom data type for indicator data, and publish that data to the MessageBus using publish_data, would that also result in .feather files containing the indicator data after a backtest run? Or are additional steps required?

Thanks!

benjaminsingleton commented 7 months ago

@cjdsellers Revisiting this as I'm finding it important to stream / persist indicator values, so I can visually validate strategies. How would you think about implementing this? Is @cniqvtcfgt on the right track?

cjdsellers commented 7 months ago

Hi @cniqvtcfgt

So the message bus and external publishing configured through MessageBusConfig are separate from the feather writing configured through StreamingConfig.

If you're looking at the code, then check out StreamingFeatherWriter and also generate_signal_class and SignalData. Either this enhancement is already available just by leveraging SignalData, or the pattern could be used as inspiration for adding indicator value persistence.

benjaminsingleton commented 7 months ago

This turned out to be simpler than I initially thought, especially if you're not looking to track too many indicator values. All I had to do was use self.publish_signal in my strategy and specify the indicator value I was interested in.

For others' benefit, if you look at the ema_cross.py example in nautilus_trader/examples/strategies, you could add something like self.publish_signal("fast_ema", self.fast_ema.value, bar.ts_event) within the self.on_bar method. As long as you've set up a StreamingConfig, a new feather file named custom_signal_fast_ema.feather will be created, capturing the fast exponential moving average at each bar event.

The main limitation of this method is that each published signal carries only three fields: ts_event, ts_init, and a single value. Given that many indicators come with a variety of attributes (for instance, the Swings indicator includes ~10 attributes), publishing each one separately would be cumbersome.

To address this, I think @cniqvtcfgt's approach is correct. First, you create a custom data type, e.g. SwingsIndicatorData. Then you need to register the data type for serialization using register_arrow following the same approach used to define and register Betfair custom data types. You then instantiate a SwingsIndicatorData object whenever the indicator updates and use publish_data to publish it. If I'm not mistaken, once a data type is registered with register_arrow, any published objects will be automatically streamed to feather files.
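As a lighter-weight stopgap before defining a custom data type, the per-attribute publishing could be wrapped in a small loop. This is only a sketch: publish_indicator_attrs is a hypothetical helper, the attribute names on the stand-in indicator are made up for illustration, and fake_publish_signal merely records calls so the example runs without nautilus_trader. It assumes publish_signal is called positionally as (name, value, ts_event), as in the one-liner above.

```python
from types import SimpleNamespace


def publish_indicator_attrs(publish_signal, prefix, indicator, attrs, ts_event):
    """Publish one signal per indicator attribute (hypothetical helper).

    Returns the list of signal names that were actually published.
    """
    names = []
    for attr in attrs:
        value = getattr(indicator, attr)
        if value is None:
            continue  # skip attributes that have no value yet
        name = f"{prefix}_{attr}"
        publish_signal(name, value, ts_event)
        names.append(name)
    return names


# Stand-ins so the sketch runs on its own; in a strategy you would pass
# self.publish_signal and a real indicator instance instead.
recorded = []


def fake_publish_signal(name, value, ts_event):
    recorded.append((name, value, ts_event))


# Attribute names here are illustrative, not the real Swings attributes.
swings = SimpleNamespace(direction=1, high_price=101.5, low_price=None)

published = publish_indicator_attrs(
    fake_publish_signal,
    "swings",
    swings,
    ["direction", "high_price", "low_price"],
    ts_event=1_700_000_000_000_000_000,
)
```

Inside on_bar this would become something like publish_indicator_attrs(self.publish_signal, "swings", self.swings, attrs, bar.ts_event). Going by the naming described above, each published name would presumably end up in its own custom_signal_<name>.feather file, so this trades one wide table for many narrow ones.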

cjdsellers commented 7 months ago

Potentially you could just pass a dict for value if you wanted multiple values, but that would need testing.

Otherwise what you describe sounds correct.