samre12 / deep-trading-agent

Deep Reinforcement Learning based Trading Agent for Bitcoin
MIT License

Nan in summary histogram for: q_network/avg_q_summary/q/2 #13

Open cTatu opened 6 years ago

cTatu commented 6 years ago

Hi, I tried using other datasets, one with a forex pair and another with the BTC/USD pair, but I get this error with every one of them. With the original dataset (btc.csv) the error doesn't happen, and the other datasets have exactly the same column names and data types. The only thing I changed was the config file, to point it at the new dataset.

Here's a look at my custom BTC dataset.

DateTime_UTC         Timestamp   price_high  price_low  price_close  price_open  volume
2017-08-05 00:00:00  1501891200  1601.0      1554.0     1592.0       1555.0      304.0
2017-08-05 00:01:00  1501891260  1592.0      1591.0     1591.0       1592.0      248.0

Also, people on the internet say that I should decrease the learning rate, so I did, but the problem persists. I also tried changing the parameters in model/baseagent.py, with the same result. I'm asking here because I have tried everything I could think of, including changing the order of the columns in the dataset.

Thank you!

0%|                   | 49900/100000000 [00:15<8:40:07, 3202.78it/s]
Traceback (most recent call last):
  File "main.py", line 38, in <module>
    main(vars(args)['file_path'])
  File "main.py", line 29, in main
    agent.train()
  File "/deep-trading-agent/code/model/agent.py", line 71, in train
    self.observe(screen, reward, action, terminal, trade_rem)
  File "/deep-trading-agent/code/model/agent.py", line 170, in observe
    self.q_learning_mini_batch()
  File "/deep-trading-agent/code/model/agent.py", line 204, in q_learning_mini_batch
    self.learning_rate_step: self.step
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Nan in summary histogram for: q_network/avg_q_summary/q/2
         [[Node: q_network/avg_q_summary/q/2 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](q_network/avg_q_summary/q/2/tag, q_network/avg_q_summary/strided_slice_2)]]

Caused by op u'q_network/avg_q_summary/q/2', defined at:
  File "main.py", line 38, in <module>
    main(vars(args)['file_path'])
  File "main.py", line 28, in main
    agent = Agent(sess, logger, config, env)
  File "/deep-trading-agent/code/model/agent.py", line 44, in __init__
    self.build_dqn(params)
  File "/deep-trading-agent/code/model/agent.py", line 237, in build_dqn
    self.q.build_model((self.s_t, self.trade_rem_t))
  File "/deep-trading-agent/code/model/deepsense.py", line 205, in build_model
    self._avg_q_summary.append(tf.summary.histogram('q/{}'.format(idx), avg_q[idx]))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/summary/summary.py", line 209, in histogram
    tag=scope.rstrip('/'), values=values, name=scope)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_logging_ops.py", line 139, in _histogram_summary
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Nan in summary histogram for: q_network/avg_q_summary/q/2
         [[Node: q_network/avg_q_summary/q/2 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](q_network/avg_q_summary/q/2/tag, q_network/avg_q_summary/strided_slice_2)]]

samre12 commented 6 years ago

I believe you are using a scale of 10000. I have noticed that this error occurs only while the agent is not yet learning, that is, while it is taking random actions to fill the replay memory. Once the agent starts learning, the problem never occurs. Moreover, the episodes are selected randomly from the dataset, so I was not able to reproduce the issue consistently, which suggests that the cause is more likely in the data used to generate the episodes. I have been actively working on the issue and will commit the corrected code as soon as I find the bug.
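Since the suspicion here falls on the data behind particular episodes, one way to narrow it down is to scan the dataset for values that commonly produce NaN in preprocessing. The sketch below is only a diagnostic aid: the column names come from the dataset sample in this thread, the function `find_suspicious_rows` is hypothetical, and nothing in the thread establishes that any of these values actually triggers the error in this repository.

```python
import math

# Column names as shown in the dataset sample in this thread.
NUMERIC_COLUMNS = ["price_high", "price_low", "price_close", "price_open", "volume"]


def find_suspicious_rows(rows, columns=NUMERIC_COLUMNS):
    """Return a list of (row_index, column, reason) for values that often lead to NaN.

    `rows` is an iterable of dicts, for example the rows of a csv.DictReader.
    This is a hypothetical helper, not part of the repository.
    """
    problems = []
    for i, row in enumerate(rows):
        for col in columns:
            raw = row.get(col, "")
            try:
                value = float(raw)
            except (TypeError, ValueError):
                problems.append((i, col, "not a number: %r" % (raw,)))
                continue
            if math.isnan(value) or math.isinf(value):
                problems.append((i, col, "non-finite value"))
            elif value <= 0:
                # Zero or negative values break log- or ratio-based preprocessing.
                problems.append((i, col, "non-positive value"))
    return problems
```

The rows can be read with `csv.DictReader` using whatever delimiter the dataset file uses; an empty result only rules out these particular patterns.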

cTatu commented 6 years ago

@samre12 I found the problem. It's the dataset: every value has to have a non-zero decimal part. The only thing I did was add a random number between 0 and 1 (both excluded) to every price and volume entry, and now it works. Example:

DateTime_UTC         Timestamp   price_high  price_low  price_close  price_open  volume
2017-08-05 00:00:00  1501891200  1601.0      1554.0     1592.0       1555.0      304.0    (original)
2017-08-05 00:00:00  1501891200  1601.5      1554.2     1592.4       1555.8      304.7    (with random offset)
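The workaround described above could be sketched roughly as follows. This is a minimal illustration and not code from the repository: `add_jitter` is a hypothetical helper, and the column names are taken from the dataset sample in this thread. Note that later in the thread the same workaround is reported not to help on an EUR/USD dataset, so it should not be read as a confirmed fix.

```python
import random

# Column names as shown in the dataset sample in this thread.
NUMERIC_COLUMNS = ["price_high", "price_low", "price_close", "price_open", "volume"]


def add_jitter(row, columns=NUMERIC_COLUMNS, rng=random):
    """Return a copy of `row` with a random offset in (0, 1) added to each column.

    `row` is a dict such as one produced by csv.DictReader; other keys are
    copied unchanged. Hypothetical helper for illustration only.
    """
    jittered = dict(row)
    for col in columns:
        offset = rng.random()
        while offset == 0.0:  # random() may return 0.0, which the workaround excludes
            offset = rng.random()
        jittered[col] = float(row[col]) + offset
    return jittered
```

Passing a seeded `random.Random` instance as `rng` makes the modified dataset reproducible between runs.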
samre12 commented 6 years ago

@cTatu I don't see how a zero decimal part would affect the model. Did you try running the model on different data points with zero decimal parts?

samre12 commented 6 years ago

Actually, I often get the same error when running the code on my own dataset. I do not always get it, though, because the episodes are selected randomly. @cTatu do you have any idea what the probable cause could be?

cTatu commented 6 years ago

I also tried another dataset (EUR/USD), and it gives the same error even with the workaround I mentioned earlier. In this dataset all the prices fluctuate between 1.2 and 1.8, so my only idea is that it may be due to the small differences in values between episodes. Maybe that causes a very small change, which is interpreted as 'not learning'. I didn't have enough time to go through all of the model's code, so this is only a very basic, general idea; I still have no idea what is going on. It's the first time I have seen a model fail to learn because of the dataset.

The only thing that seems odd to me (a little off topic) is that the agent chooses random episodes from the dataset. The price of an asset is a time series, and every entry in the dataset is influenced by past prices due to fractal laws and Elliott Wave Theory. So if the episodes are chosen randomly, how would the model be able to sustain a real-time trading system? Furthermore, I think that if the agent's actions are determined by its historical 'experience' and all those episodes are randomly mixed, then it would only have a very restricted view of the whole market or timeline. All the market prediction/trading models I have seen used an LSTM, which is the most suitable for time-series problems. This is a new approach for me; I will look for more details on Deep Q-Trading.

samre12 commented 6 years ago

@cTatu I am sorry for the initially vague description. The aim is to train the agent to look at a limited number of historical prices and base its trading decisions on them. By random episode, I mean a random starting point in the time series along with the historical prices prior to that starting point. Also, as you correctly pointed out, I am in fact using an LSTM to consume the historical prices and generate a suitable state representation for the agent.
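The idea of a random episode, a random starting point plus the history just before it, can be illustrated with a short sketch. The function `sample_episode` and its parameters `history_length` and `episode_length` are hypothetical names chosen for this example and are not taken from the repository's code.

```python
import random


def sample_episode(series, history_length, episode_length, rng=random):
    """Pick a random start index and return (start, history, episode).

    history: the `history_length` points immediately before the start
    episode: the `episode_length` points beginning at the start
    Hypothetical helper for illustration only.
    """
    if len(series) < history_length + episode_length:
        raise ValueError("series is too short for the requested window sizes")
    # The start must leave room for the history before it and the episode after it.
    start = rng.randint(history_length, len(series) - episode_length)
    history = series[start - history_length:start]
    episode = series[start:start + episode_length]
    return start, history, episode
```

Within each sampled window the points stay in chronological order; only the position of the window in the full series is random.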

SdxHex commented 6 years ago

Same bug... I tried changing the scale to 10000, 1000, 5000, and 30000; same thing. This is using the dev version, with Python 2.7 and TF 1.7.

samre12 commented 6 years ago

@SdxHex thanks for pointing that out, but I am aware of the bug in the dev branch as well, and changing the scale won't avoid it. It is primarily caused by the dataset, since I don't always run into this error because of the random episode selection.

SdxHex commented 6 years ago

Thanks. I made a couple of changes to the way the data is pulled, and it seems to be working great! I think I read that you are not accepting pull requests? Thanks for putting this together.

samre12 commented 6 years ago

@SdxHex I am very open to accepting pull requests 😄! Where did you read that I am not accepting them 😛? It would be highly appreciated if you raised one, as I am currently spending more time on how to incorporate more technical indicators into the input and normalise them before feeding them into the network. I am really sorry, but I could not understand your comment above: have you made changes that remove the NaN in the summary histograms?