jankrepl / deepdow

Portfolio optimization with deep learning.
https://deepdow.readthedocs.io
Apache License 2.0
875 stars 136 forks

An example of working with real stocks data, where the neural network would show good results #100

Closed vgmakeev closed 3 years ago

vgmakeev commented 3 years ago

There are two simple examples in attached notebook, please take a look:

  1. SPY, TLT, GLD for 10 years: it seems like the network doesn't learn. It's a little bit better than random.
  2. A set of 13 tickers: the results are halfway between holding SPY and the random strategy. I do not notice any learning progress between epochs.

Or maybe everything is fine? Do I just need to add some financial indicators, do some feature engineering, and so on?

Anyway it would be great to have a working example with stocks.

Baseline-example-for-jankrepl.ipynb.zip
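For context, the "better than random" comparison can be made concrete with baselines. Below is a minimal plain-Python sketch (not deepdow code; the return matrix is synthetic toy data, and all names are hypothetical) comparing an equal-weight allocation against random weights on the same returns:

```python
import random

# Toy return matrix: daily_returns[t][i] is the day-t return of asset i.
# Synthetic numbers purely for illustration.
random.seed(42)
n_days, n_assets = 250, 3
daily_returns = [[random.gauss(0.0004, 0.01) for _ in range(n_assets)]
                 for _ in range(n_days)]

def cumulative_return(weights_per_day):
    # Compound the portfolio return day by day under the given weights.
    total = 1.0
    for day, w in zip(daily_returns, weights_per_day):
        total *= 1.0 + sum(wi * ri for wi, ri in zip(w, day))
    return total - 1.0

# Baseline 1: equal weights (1/N) every day.
equal = [[1 / n_assets] * n_assets for _ in range(n_days)]

# Baseline 2: fresh random weights every day, normalized to sum to 1.
def random_weights():
    raw = [random.random() for _ in range(n_assets)]
    s = sum(raw)
    return [r / s for r in raw]

rand = [random_weights() for _ in range(n_days)]

print(cumulative_return(equal))
print(cumulative_return(rand))
```

A trained network's allocations should clear both of these baselines before its performance is worth interpreting.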

jankrepl commented 3 years ago

First of all, sorry for the late response!

Second of all, thank you for making the notebook runnable out of the box. It makes my life way easier:)

Let me start off by saying that this issue is open-ended and I do not have the time to find the best setup for your data. That being said, I can definitely share all my impressions and tips.


seems like a network doesn't learn

If you mean it in the machine learning sense, then I would disagree. The network is being trained; one can see this from the decreasing training loss. Note that to get a clear overview of the training loss at epoch ends, you need to also provide the training dataloader via val_dataloaders. Additionally, feel free to check out the TensorBoardCallback, which lets you inspect activations over training batches.

run = Run(network,
          loss,
          dataloader,
          val_dataloaders={'val': val_dataloader,
          'train': dataloader},  # <--- pass the train set as an extra "validation" loader
          metrics=metrics,
          benchmarks=benchmarks
         )

It's a little bit better than random.

I guess here you are referring to the validation loss. This is where one needs to do a lot of experimentation to see which models and losses work best. However, the golden advice is to:

  1. Make sure your model can "overfit" the training set, i.e. perform significantly better than the random strategy on it. I believe that is the case here if one increases n_epochs.
  2. Once 1) is satisfied, start regularizing / simplifying your architecture until your validation loss starts to beat the benchmarks.
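The first step above is a standard sanity check: if even a trivial model cannot drive its training loss toward zero on a small sample, something is misconfigured. A minimal plain-Python sketch (not deepdow; hypothetical toy regression data fit with plain gradient descent):

```python
# Toy data generated from y = 2x + 1; a correctly wired training loop
# should be able to "overfit" it, i.e. push the loss toward zero.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b, lr = 0.0, 0.0, 0.05

def mse(w, b):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

initial_loss = mse(w, b)
for _ in range(2000):  # "increase n_epochs" until the train set is overfit
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * gw
    b -= lr * gb

final_loss = mse(w, b)
print(initial_loss, final_loss)  # final loss collapses toward 0
```

Only once this check passes does it make sense to start simplifying and regularizing in pursuit of validation performance.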

That being said, slightly better than random guessing is not bad in finance! :)

-0.157: is it the Sharpe ratio? The value seems too low; how should I interpret it?

In deepdow, all losses follow "the lower the better" logic, even when the name of the loss suggests otherwise. The SharpeRatio loss actually computes the negative of the real Sharpe ratio, so a loss of -0.157 corresponds to a Sharpe ratio of +0.157. See the docs.
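The sign convention can be illustrated in plain Python (this is a simplified, non-annualized Sharpe ratio on toy return series, not the actual deepdow implementation):

```python
import statistics

def sharpe_ratio(returns):
    # Plain (non-annualized) Sharpe ratio: mean return over its std dev.
    return statistics.mean(returns) / statistics.stdev(returns)

def sharpe_loss(returns):
    # "Lower is better" convention: the loss is the *negative* Sharpe ratio.
    return -sharpe_ratio(returns)

good = [0.01, 0.02, 0.015, 0.01]   # steady positive returns
bad = [0.05, -0.04, 0.03, -0.05]   # volatile, roughly zero mean

print(sharpe_loss(good))  # strongly negative -> good portfolio
print(sharpe_loss(bad))   # near zero or positive -> worse portfolio
```

So a negative SharpeRatio loss value is a good sign: the more negative the loss, the higher the underlying Sharpe ratio.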


See below for a couple of general tips.

jankrepl commented 3 years ago

Closing due to inactivity. If you have any further comments I can always reopen.