How to convert back to actual total return on test dataset after model is run?

tszumowski commented 3 years ago

I understand how to generate and plot metrics from running the actual network (or benchmarks) given a dataset (see code below). However, my understanding is those metrics include aggregations over the defined horizon (e.g. MeanReturns or CumulativeReturn).

Is there a support function to, given a benchmark or network that specifies weights, determine what the actual returns would be?

If not, I believe I can just get the weights using generate_weights_table and then manually calculate what the actual returns would be on a given day using those weights. I can then compound them to get a final total return for, say, the test period. Is that correct?
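The manual calculation described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the weights are a single allocation vector (for example one row taken from the output of generate_weights_table), the returns are simple per-step returns, and the portfolio is bought once and never rebalanced. The function name `buy_and_hold_return` and the sample numbers are hypothetical and not part of deepdow.

```python
import numpy as np


def buy_and_hold_return(weights, asset_returns):
    """Total simple return of a portfolio bought once and never rebalanced.

    weights: shape (n_assets,), assumed to sum to 1.
    asset_returns: shape (n_steps, n_assets), simple per-step returns.
    """
    weights = np.asarray(weights, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    # Growth factor of each asset over the whole period.
    asset_growth = np.prod(1.0 + asset_returns, axis=0)
    # The initial capital is split by the weights; each slice grows with its asset.
    return float(weights @ asset_growth - 1.0)


# Made-up two-asset example (not market data).
returns = np.array([[0.10, 0.00],
                    [0.00, 0.20]])
total = buy_and_hold_return([0.5, 0.5], returns)
print(total)
```

Compounding is done per asset before weighting because, without rebalancing, the effective weights drift as prices move.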

Snippet from examples for how to generate metrics:

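from deepdow.benchmarks import OneOverN, Random
from deepdow.losses import CumulativeReturn, MeanReturns
from deepdow.visualize import generate_metrics_table, plot_metrics

# `network` (trained model) and `dataloader_test` are assumed to be defined earlier
network = network.eval()
benchmarks = {
    "1overN": OneOverN(),  # each asset has weight 1 / n_assets
    "random": Random(),  # random allocation that nonetheless stays close to 1overN
    "network": network,
}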

metrics = {
    "MeanReturn": MeanReturns(output_type="simple"),
    "CumReturn": CumulativeReturn(),
}

metrics_table = generate_metrics_table(benchmarks, dataloader_test, metrics)

plot_metrics(metrics_table)

jankrepl commented 3 years ago

Hey there and thanks for your interest!

I am not 100% sure what you mean by "actual return".

The generate_metrics_table should always give you the target metric assuming you invested at a given point x and held the portfolio for horizon time steps.

I guess you want to compute the return over some longer period than the horizon, right? If that is the case then I have 2 ideas:

1. The losses (e.g. MeanReturns or CumulativeReturn) should work for any y. That means that you can make the horizon dimension as long as you want. So if you manage to create y that spans over your entire period of interest then this should work.
2. As you said, you can just extract the predicted portfolio weights at any point and then just do any additional computations outside of deepdow.

tszumowski commented 3 years ago

@jankrepl thank you for the clarification and tips!

Desired Returns Continuation

After thinking about it, I believe I'm interested in two types of total returns:

1. The one you mentioned above:

   "the target metric assuming you invested at a given point x and held the portfolio for horizon time steps"
2. The target metric assuming you invest at a given point x and update the portfolio every horizon timesteps
   - Per the docs, this would be following this statement:

     "Ideally, this network should propose the best portfolio w to be held for horizon number of time steps given what happened in the market up until now x."

Notebook Reference

For both of these, I created an example notebook that contains only two assets: AMZN and IBM. (Credits to the examples here and PR #100 for code guidance!) I attached the notebook here for reference for those interested:

deepdow_linearnet_two_realworld_stocks.ipynb.zip

Experiments with different return types

First, starting with (1) above, i.e. holding the portfolio

I implemented your two suggestions in the notebook and achieved parity which was comforting.

For the CumulativeReturn loss approach (your first suggestion)
- see notebook heading Calculate Total Cumulative Return With New Full-Horizon Test-Period Dataset
For the "additional computations outside DeepDow" approach (your second suggestion)
- see notebook heading Calculate Benchmark Total Returns Manually

Both matched, and weren't too challenging to compute. They both assume a locked portfolio at the start of a long horizon.

Next type, (2), changing the portfolio every horizon

I wasn't sure what's the best way of calculating the final total return over the duration of the test period, where the portfolio is updated every horizon.

Consider a hypothetical example:

lookback: 3 months
test period: 1 year
horizon: 1 month
portfolio is updated every month, where it uses the last 3 months to create X and calculates returns based on 1 month of y.

The CumulativeReturn loss when called with generate_metrics_table() provides a (negated) cumulative return scalar for the defined horizon (1 month in the above example). But it's a sliding window of 1 time-period, with a gap involved.

I was able to get something working (though not pretty) in my notebook under the heading Calculate Total Return by Accumulating Returns Across Smaller Horizons from Metrics. In the notebook, lookback=50 and horizon=5.

In the notebook, the individual asset total returns as well as the OneOverN() all match with the above method. This is because the portfolios don't change over each horizon for them. So I assumed it's in the right direction. The network output differs, as expected, since the portfolio gets updated.
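One possible way to express the periodic-update calculation outside of deepdow is to chain the buy-and-hold return of each horizon. The sketch below is not the notebook's code. It assumes the test period is cut into consecutive, non-overlapping windows of `horizon` steps, that one weight vector is chosen at the start of each window, and that transaction costs are ignored. The function name `rebalanced_total_return` and the sample numbers are hypothetical.

```python
import numpy as np


def rebalanced_total_return(weights_per_period, asset_returns, horizon):
    """Chain buy-and-hold returns over consecutive, non-overlapping horizons.

    weights_per_period: shape (n_periods, n_assets), weights set at the start
        of each period (each row assumed to sum to 1).
    asset_returns: shape (n_periods * horizon, n_assets), simple per-step returns.
    Transaction costs are ignored.
    """
    weights_per_period = np.asarray(weights_per_period, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    n_periods, n_assets = weights_per_period.shape
    if asset_returns.shape != (n_periods * horizon, n_assets):
        raise ValueError("asset_returns must have n_periods * horizon rows")
    # One block of `horizon` steps per period.
    blocks = asset_returns.reshape(n_periods, horizon, n_assets)
    # Growth factor of each asset within each period.
    asset_growth = np.prod(1.0 + blocks, axis=1)
    # Portfolio growth factor within each period (held without rebalancing).
    period_growth = np.sum(weights_per_period * asset_growth, axis=1)
    # Reinvest the full portfolio value at the start of every period.
    return float(np.prod(period_growth) - 1.0)


# Made-up example: 2 periods of 2 steps each, 2 assets.
returns = np.array([[0.10, 0.00],
                    [0.00, 0.00],
                    [0.00, 0.20],
                    [0.00, 0.00]])
weights = np.array([[1.0, 0.0],   # period 1: everything in asset 0
                    [0.0, 1.0]])  # period 2: everything in asset 1
total = rebalanced_total_return(weights, returns, horizon=2)
print(total)
```

If the per-sample metrics come from a sliding window, only every `horizon`-th sample would feed such a chain, so that the windows do not overlap.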

Do you have suggestions for a better/cleaner way of calculating total return over a finite time-period where the portfolio is periodically updated within it?

Thanks again!

jankrepl commented 3 years ago

Thank you for the clear explanation!

IMO you are stepping outside of what deepdow was built for. There is nothing wrong with that, and I totally agree that this periodic reinvestment is a natural strategy one could come up with. The main problem is that as soon as you introduce reinvesting and updating, you need to take transaction costs into account and therefore worry about the turnover.
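To make the turnover concern concrete, here is a minimal sketch of how a proportional transaction cost could be applied at each portfolio update. The cost model (a flat rate charged on the absolute change in weights at the start of the period) is an assumption chosen for illustration, not something deepdow provides, and the function names are hypothetical.

```python
import numpy as np


def turnover(prev_weights, new_weights):
    """One-way turnover: half the sum of absolute weight changes."""
    prev_weights = np.asarray(prev_weights, dtype=float)
    new_weights = np.asarray(new_weights, dtype=float)
    return 0.5 * float(np.abs(new_weights - prev_weights).sum())


def net_period_return(gross_return, prev_weights, new_weights, cost_rate):
    """Period return after a proportional cost on the traded fraction.

    Assumes the cost is paid out of the portfolio at the start of the period
    and is proportional to buys plus sells (twice the one-way turnover).
    `prev_weights` should ideally be the drifted weights just before the update.
    """
    traded_fraction = 2.0 * turnover(prev_weights, new_weights)
    return (1.0 - cost_rate * traded_fraction) * (1.0 + gross_return) - 1.0


# Made-up example: moving from 50/50 to 80/20 with a 0.1% cost rate.
t = turnover([0.5, 0.5], [0.8, 0.2])
net = net_period_return(0.02, [0.5, 0.5], [0.8, 0.2], cost_rate=0.001)
print(t, net)
```

With frequent updates, these per-period costs compound, which is why a strategy that looks good on gross returns can underperform once turnover is priced in.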

There was a similar issue on this topic #69.

tszumowski commented 3 years ago

@jankrepl Thank you for the clarification on deepdow's scope, as well as providing that very informative issue #69 reference! It helps clear up what one can expect "off-the-shelf" with deepdow.

Is my understanding below correct?

Using deepdow, one is able to derive a portfolio allocation given asset behaviors in a given lookback window
That portfolio allocation is optimized for the provided horizon used during training
One may assess how a derived portfolio performed for given horizon data in a test set, as shown in this example (metrics plot)
One may also observe how the portfolio's weights can change if inference is run on the network every day.
- In fact, assuming the lookback data is different by day, it is possible the weights can change every day ... depending on the network architecture, losses, etc.
deepdow doesn't attempt to provide rebalancing strategies for how and when to enact those assigned portfolio weights.
- This requires non-trivial work around rebalancing that incorporates fees and turnover conditions
- Weight graphs like those shown here are helpful to understand the portfolio dynamics, but should be considered purely informational
deepdow does provide the portfolio information (weights by timestep, etc) to develop a rebalancing strategy outside of the deepdow package, if needed.
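As an illustration of the last point, a rebalancing rule can be layered on top of the per-timestep weights outside of deepdow. The sketch below assumes the proposed weights are available as an array with one row per timestep, and uses a simple threshold rule: trade only when the proposal differs enough from what is currently held. The rule, the threshold value, and the function name `threshold_rebalance` are hypothetical, and weight drift between trades is ignored for brevity.

```python
import numpy as np


def threshold_rebalance(proposed_weights, threshold):
    """Hold the current weights until the proposal moves far enough away.

    proposed_weights: shape (n_steps, n_assets), e.g. daily network outputs.
    threshold: minimum sum of absolute weight differences that triggers a trade.
    Returns the held weights per step and the list of steps with a trade.
    """
    proposed = np.asarray(proposed_weights, dtype=float)
    held = np.empty_like(proposed)
    current = proposed[0]
    held[0] = current
    rebalance_steps = [0]  # the initial purchase counts as a trade
    for t in range(1, len(proposed)):
        if np.abs(proposed[t] - current).sum() > threshold:
            current = proposed[t]
            rebalance_steps.append(t)
        held[t] = current
    return held, rebalance_steps


# Made-up weights for four timesteps and two assets.
proposed = np.array([[0.50, 0.50],
                     [0.52, 0.48],
                     [0.70, 0.30],
                     [0.69, 0.31]])
held, rebalance_steps = threshold_rebalance(proposed, threshold=0.1)
print(rebalance_steps)
```

A rule like this trades tracking accuracy against turnover: a higher threshold means fewer trades but a portfolio that lags the network's proposals.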

jankrepl commented 3 years ago

That is actually a brilliant description of deepdow. With that being said, it is an open source project and I am always willing to accept any suggestions for new features and extend the scope! I did not really like the idea of adding too many features from the very beginning. Everything can be done incrementally IMO.

As shown in #69 it is actually not that hard to extend deepdow to make conditional predictions (weight allocation) at time t based on predictions at time t-1.

tszumowski commented 3 years ago

@jankrepl thank you again. Regarding extending scope, one thing I've really enjoyed in my short time with deepdow is how powerful and flexible it is as a foundation, with #69 being a great example. I should have added above:

With how deepdow is configured, it can easily be extended to include use cases like managing turnover in rebalancing (e.g. #69)

I appreciate your decision to start with a tight focus and extend incrementally. I'm only getting my feet wet so I don't think I'll have any suggestions for core scope additions at this point.

Thank you again for brainstorming and the great dialogue! I'll close this since I think the original question is covered well at this point.

jankrepl / deepdow