ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem"(https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

Bad performing example for NNAgent #51

Open WojciechMigda opened 6 years ago

WojciechMigda commented 6 years ago

Hi,

thank you for making your framework publicly available. I was playing with it recently, and while testing it on different timeframes and coin setups I stumbled upon a set of input data where NNAgent performs worse than the UBAH, CRP, and BEST algorithms. I thought you, or people visiting your repo, might be interested in examining it. It might even be interesting to include it (or some other input sets you have) in the next version of your paper to demonstrate where it underperforms.

This is the json config file I used:

{
  "layers":
  [
    {"filter_shape": [1, 2], "filter_number": 3, "type": "ConvLayer"},
    {"filter_number":10, "type": "EIIE_Dense", "regularizer": "L2", "weight_decay": 5e-9},
    {"type": "EIIE_Output_WithW","regularizer": "L2", "weight_decay": 5e-8}
  ],
  "training":{
    "steps":80000,
    "learning_rate":0.00028,
    "batch_size":109,
    "buffer_biased":5e-5,
    "snap_shot":false,
    "fast_train":true,
    "training_method":"Adam",
    "loss_function":"loss_function6"
  },

  "input":{
    "window_size":31,
    "coin_number":4,
    "global_period":1800,
    "feature_number":3,
    "test_portion":0.08,
    "online":false,
    "start_date":"2017/03/17",
    "end_date":"2018/02/02",
    "volume_average_days":30,
    "coins": ["ETH", "LTC", "reversed_USDT", "BCH"]
  },

  "trading":{
    "trading_consumption":0.0025,
    "rolling_training_steps":85,
    "learning_rate":0.00028,
    "buffer_biased":5e-5
  }
}
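Since a hand-edited config like the one above is easy to get wrong, a small sanity check can catch inconsistencies before training. This is a hypothetical helper, not part of PGPortfolio; it only validates the `input` section fields shown above:

```python
import json

# A trimmed copy of the "input" section from the config above.
config_text = """
{"input": {"coin_number": 4,
           "coins": ["ETH", "LTC", "reversed_USDT", "BCH"],
           "test_portion": 0.08}}
"""

def check_input_section(cfg):
    """Basic consistency checks on the 'input' section of a net_config."""
    inp = cfg["input"]
    # With an explicit coin list, coin_number should match its length.
    assert inp["coin_number"] == len(inp["coins"]), "coin_number must match coins list"
    assert 0.0 < inp["test_portion"] < 1.0, "test_portion must be a fraction"
    return True

print(check_input_section(json.loads(config_text)))  # True
```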

The difference is in the period times and in my enforcing an explicit set of assets to trade (a custom change I made in the code). This is the chart I get: (chart attached)

By the way - your implementation of UBAH uses a hardcoded number of assets.
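For reference, UBAH (uniform buy and hold) is easy to express for an arbitrary asset count. A minimal sketch, not the repository's implementation, working on per-period price relatives:

```python
import numpy as np

def ubah(price_relatives):
    """Uniform Buy And Hold: split capital equally at t=0, then never rebalance.

    price_relatives: array of shape (T, n_assets), each entry close_t / close_{t-1}.
    Returns the final portfolio value, starting from 1.0.
    """
    price_relatives = np.asarray(price_relatives, dtype=float)
    growth = price_relatives.prod(axis=0)  # cumulative growth of each asset
    return growth.mean()                   # equal initial weights 1/n, any n

# Two assets: one doubles, one halves -> UBAH ends at (2 + 0.5) / 2 = 1.25
print(ubah([[2.0, 0.5]]))  # 1.25
```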

WojciechMigda commented 6 years ago

Another chart for the same period, when the traded assets were automatically selected by the framework (['XRP', 'reversed_USDT', 'ETH', 'STR']): (chart attached)

WojciechMigda commented 6 years ago

Performance improves after loss_function8 is selected. For the ["ETH", "LTC", "reversed_USDT", "BCH"] selection, NNAgent now performs worse only than BEST: (chart attached)

ZhengyaoJiang commented 6 years ago

Greetings, and thanks for your effort. Yes, the agent will perform badly under particular hyper-parameter settings. It's also good that you run further tests rather than blindly believing the results we posted. However, in practice, no one would deliberately pick parameters that make the performance worse.

Our whole workflow of the testing is:

  1. Select the best hyperparameters on the cross-validation set. For example, on "2017/05/01-2017/07/01".
  2. Use the selected hyperparameters on the test set (adjacent to the cross-validation set). For example, on "2017/07/01-2017/09/01".
  3. Report the test results.
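The steps above amount to a simple walk-forward scheme. A minimal sketch, where the candidate settings and the scoring function are purely illustrative stand-ins (PGPortfolio does not expose this API):

```python
from datetime import date

# Candidate hyperparameter settings to compare (values are illustrative).
candidates = [
    {"learning_rate": 2.8e-4, "window_size": 31},
    {"learning_rate": 1.0e-3, "window_size": 50},
]

def backtest_score(params, start, end):
    """Placeholder scoring function: in practice this would train the agent on
    data before `start` and return e.g. accumulated portfolio value over
    [start, end). Here it is a toy rule that prefers the first setting."""
    return -abs(params["learning_rate"] - 2.8e-4)

# 1. Select the best hyperparameters on the cross-validation window.
cv = (date(2017, 5, 1), date(2017, 7, 1))
best = max(candidates, key=lambda p: backtest_score(p, *cv))

# 2./3. Evaluate the selected setting on the adjacent test window and report.
test = (date(2017, 7, 1), date(2017, 9, 1))
final_score = backtest_score(best, *test)
print(best["window_size"])  # 31
```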

The hyper-parameters you reported may cause the network to overfit: in an EIIE network, the experience gathered on different assets is shared, so reducing the number of assets is roughly equivalent to removing 5/6 of the training data.
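The sharing Zhengyao describes can be seen in miniature: an EIIE-style network applies the same convolution filters to every asset independently, so each asset's history is one more training example for a single shared set of weights. A toy NumPy sketch (shapes follow the config above, but everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, window, n_features = 4, 31, 3

# One shared 1x2 filter bank with 3 output channels, as in the first
# ConvLayer of the config above: shape (out_channels, in_channels, width).
filters = rng.standard_normal((3, n_features, 2))

def eiie_conv(asset_history):
    """Apply the shared filters to one asset's (n_features, window) history."""
    out = np.empty((filters.shape[0], asset_history.shape[1] - 1))
    for c in range(filters.shape[0]):
        for t in range(out.shape[1]):
            out[c, t] = np.sum(filters[c] * asset_history[:, t:t + 2])
    return out

prices = rng.standard_normal((n_assets, n_features, window))
# The *same* filters score every asset, so 4 assets give the shared weights
# 4 histories to learn from; fewer assets means less effective training data.
feature_maps = [eiie_conv(prices[i]) for i in range(n_assets)]
print(len(feature_maps), feature_maps[0].shape)  # 4 (3, 30)
```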

Regards Zhengyao

WojciechMigda commented 6 years ago

Hi, thank you for your response. Yes, you are right, the training and testing ranges are smaller. I ran these experiments to see, first, how NNAgent would perform under particularly bad market conditions, and second, how it would compare to the benchmark portfolio algorithms in such circumstances. I was surprised to find that, as the first plot demonstrates, it was lagging behind the others. Even though it could have fallen back to USDT, it didn't.

From the further experiments I made, it becomes apparent that the number of traded assets plays a significant role. With the parameters as above but a richer selection of assets, NNAgent does noticeably better.

Regardless of the above, I think that if you submitted your paper for peer review, one of the first points raised would be the lack of an analysis of its weak points, e.g. parameter sets or market conditions under which it does not perform well.

With best, Wojciech

lytkarinskiy commented 6 years ago

Thanks for your analysis, it's a good critical point! I also agree that the main disadvantage is that cash is used only as a buffer for moving value between other assets, not as a backup asset for bad market conditions. But the original article says that the current loss function is not ideal and should be improved or replaced in the future. Also, in ticket #33 ZhengyaoJiang wrote that the next article is in preparation. It is also clear why more assets in the portfolio is better: almost all assets are correlated with each other, so if one is falling the others likely are too; choosing more assets gives a better chance of catching uncorrelated behaviour in the market and profiting from it. But here we hit another problem: not all assets have enough trade volume, so by increasing the number of assets we automatically decrease the trade volume achievable in the real world. Anyway, I wish the team every success in future improvements of the framework!
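The diversification point can be illustrated with a toy calculation: for an equally weighted portfolio of n assets with identical per-period volatility, portfolio volatility shrinks like 1/sqrt(n) when the assets are uncorrelated, but barely shrinks when they are highly correlated (the numbers below are illustrative, not fitted to any market data):

```python
import numpy as np

sigma = 0.05  # per-period volatility of each asset (illustrative)

def equal_weight_vol(n, rho):
    """Volatility of a 1/n-weighted portfolio of n assets, each with
    volatility sigma and pairwise correlation rho (standard formula)."""
    var = (sigma**2 / n) + ((n - 1) / n) * rho * sigma**2
    return np.sqrt(var)

# Uncorrelated assets: volatility falls as 1/sqrt(n)...
print(round(equal_weight_vol(4, 0.0), 4))  # 0.025
# ...but highly correlated assets (as crypto often is) barely diversify.
print(round(equal_weight_vol(4, 0.9), 4))
```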

ghego commented 6 years ago

@ZhengyaoJiang, could you help me understand how to specify train, validation and test intervals? It's not clear to me how you can specify the validation set and the test set in the config file. For example, I understand this config to mean:

"input":{
    "window_size":31,
    "coin_number":4,
    "global_period":1800,
    "feature_number":3,
    "test_portion":0.08,
    "online":false,
    "start_date":"2017/03/17",
    "end_date":"2018/02/02",
    "volume_average_days":30,
    "coins": ["ETH", "LTC", "reversed_USDT", "BCH"]
  },

that we are specifying the whole range to be March 17, 2017 to February 2, 2018, and that of this range we are using the last 8% as validation data.

Is that so? If so, how do you specify the test set?

If not so, and the 8% is the actual test set, how do you specify the validation set? Or is it automatically defined?

In other words, I don't see three intervals (train, validation, test) in the config, but only two. Can you clarify? Thanks.

dexhunter commented 6 years ago

how you can specify the validation set and the test set in the config file

In other words, I don't see three intervals (train, validation, test) in the config, but only two. Can you clarify? Thanks.

@ghego Hi! The start_date and end_date cover training, validation, and test together. Actually, validation is handled by the hyper-parameter optimization in our private framework (it is not part of the back-test); in other words, the back-test only has train and test. So to locate the test set, you need to calculate it manually from test_portion.

For example, in the current net_config.json the start is 2015/07/01, the end is 2017/07/01, and the test portion is 0.08. If you plot the result as a table, you will see the back-test runs from 2017-05-03 12:00:00 to 2017-07-01 00:00:00 (the last 0.08 of the whole range).

Or you can modify the code to specify.
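The arithmetic above can be reproduced approximately (the exact boundary also depends on how the range is discretized into 1800-second periods, which is why the result below is about half an hour off the reported table value):

```python
from datetime import datetime

start = datetime(2015, 7, 1)
end = datetime(2017, 7, 1)
test_portion = 0.08

# The test window is the final `test_portion` fraction of the whole range.
test_start = end - (end - start) * test_portion
print(test_start)  # 2017-05-03 12:28:48, close to the reported 2017-05-03 12:00:00
```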

ghego commented 6 years ago

@DexHunter thanks for the answer. So if I understand correctly, the open-source version only does train and test, with no hyperparameter optimization on a validation set. Correct?

dexhunter commented 6 years ago

@ghego Yes; however, the provided net_config.json was tuned with the internal version.

ghego commented 6 years ago

@DexHunter OK, but the results one obtains with the provided net_config.json do not seem to match those of the article, nor those of the README.

Also, do you expect that same net_config.json to provide optimal performance on other windows of time (past, future) or is it important to tune hyperparameters with a validation set each time the period is changed?

dexhunter commented 6 years ago

@DexHunter OK, but the results one obtains with the provided net_config.json do not seem to match those of the article, nor those of the README.

You can check the statement about this in the README.

Also, do you expect that same net_config.json to provide optimal performance on other windows of time (past, future) or is it important to tune hyperparameters with a validation set each time the period is changed?

From my understanding, for different time ranges, different hyperparameters need to be adopted.