No action at all (zero output) when evaluating a training result.

applebull commented 6 years ago

It is an interesting project, and I tried to run it on my computer following your README. This is what I did:

mkdir models
python train.py ^GSPC 20 100
python evaluate.py ^GSPC_2011 model_ep100

And I got the following output from the evaluation:

/Users/username/Library/Python/2.7/lib/python/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from 'float' to 'np.floating' is deprecated. In future, it will be treated as 'np.float64 == np.dtype(float).type'.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-01-17 07:29:54.049861: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA

^GSPC_2011 Total Profit: $0.00

The agent did not do anything on the test data set. I know 100 training episodes are not enough to produce a meaningful result, but I would expect insufficient training to yield a bad strategy that loses money rather than no action at all.

My OS is macOS High Sierra. Do you think this is a problem with my Python environment, or just too little training? Have you seen this problem before? Thanks!

Fishtang01 commented 6 years ago

Hi applebull, did you run the exact code, or did you have to modify a few things?

Thanks.

edwardhdlu commented 6 years ago

I've found that the model isn't very stable. You can try evaluating different saved instances of the model, such as model_ep90 or model_ep80; there's some luck involved.

I think the "no action" occurs when the model determines that selling starts at and remains the optimal strategy, but it isn't able to sell because nothing has been bought. I'll see if I can find another way to implement this constraint.
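The failure mode described above can be reproduced with a toy evaluation loop. This is only a minimal sketch, not the repository's actual evaluate.py: the action encoding (0 = hold, 1 = buy, 2 = sell), the `run_episode` helper, and the price list are all assumptions made for illustration.

```python
def run_episode(prices, choose_action):
    """Toy evaluation loop (hypothetical helper, not the repo's code).

    Assumed action encoding: 0 = hold, 1 = buy, 2 = sell.
    A sell is silently ignored when the inventory is empty.
    """
    inventory = []
    total_profit = 0.0
    for price in prices:
        action = choose_action(price)
        if action == 1:
            inventory.append(price)
        elif action == 2 and inventory:
            # Sell the oldest holding and book the difference as profit.
            total_profit += price - inventory.pop(0)
    return total_profit


# Made-up prices, and a degenerate policy that always picks "sell".
prices = [100.0, 101.5, 99.0, 103.0]
always_sell_profit = run_episode(prices, lambda price: 2)
print(f"Total Profit: ${always_sell_profit:.2f}")  # prints "Total Profit: $0.00"
```

Because nothing is ever bought, every sell is a no-op and the loop reports exactly zero profit, which matches the symptom in this issue.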

You can also try switching the experience replay to random sampling by replacing the first four lines of expReplay with mini_batch = random.sample(self.memory, batch_size).
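As a rough illustration of that suggestion, the sketch below draws a uniformly random mini-batch from a replay memory. The `sample_mini_batch` helper, the tuple layout of the stored transitions, and the guard for a memory smaller than the batch size are assumptions for this example, not code from the repository.

```python
import random
from collections import deque


def sample_mini_batch(memory, batch_size):
    """Return a uniformly random mini-batch from the replay memory.

    Hypothetical helper: the size is capped at len(memory) because
    random.sample raises ValueError when asked for more items than exist.
    """
    return random.sample(list(memory), min(batch_size, len(memory)))


# Made-up transitions in an assumed (state, action, reward, next_state, done) layout.
memory = deque(maxlen=1000)
for step in range(50):
    memory.append((step, 0, 0.0, step + 1, False))

mini_batch = sample_mini_batch(memory, batch_size=32)
print(len(mini_batch))  # prints 32
```

Sampling at random, rather than always replaying the most recent transitions, reduces the correlation between consecutive training samples, which is the usual motivation for experience replay.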

lamhk commented 6 years ago

Hi Edward, I have the same problem. I tried the change suggested above, mini_batch = random.sample(self.memory, batch_size), but still got zero output. Any idea? Thanks in advance.

smileyung commented 6 years ago

Same problem here, with no result!

python evaluate.py ^GSPC_2011 model_ep1000
/home/smilewater/anaconda3/envs/tensorflow-gpu/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-07-22 21:54:45.515813: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-07-22 21:54:45.515834: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-07-22 21:54:45.515840: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-07-22 21:54:45.515844: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-07-22 21:54:45.515848: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-07-22 21:54:45.607514: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-07-22 21:54:45.607885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties:
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.683
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.75GiB
2018-07-22 21:54:45.607912: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0
2018-07-22 21:54:45.607917: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y
2018-07-22 21:54:45.607924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)

^GSPC_2011 Total Profit: $0.00

xtr33me commented 6 years ago

I actually found out about this repo via Siraj's video, which was the original I forked from. That said, I did issue a pull request to his. As stated above, you just need to issue a single buy, which I do upon entry only, and then let the model predict the rest of the way. I may look at making a purchase when a local minimum has been reached within some window, but currently I just use a bool for the first iteration. I have only trained up to 200 epochs, so I still have more testing to do, but it seems to be working decently. I will issue a pull request after some more testing is done, but should you wish to see the changes, my repo is here: https://github.com/xtr33me/Reinforcement_Learning_for_Stock_Prediction
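The "single buy upon entry" workaround might look roughly like the sketch below. It is not the code from the linked fork: the `run_episode_with_initial_buy` helper, the action encoding (0 = hold, 1 = buy, 2 = sell), and the prices are assumptions used only to show the idea of a first-iteration flag.

```python
def run_episode_with_initial_buy(prices, choose_action):
    """Toy evaluation loop that forces one buy on the first step.

    Hypothetical sketch. Assumed action encoding: 0 = hold, 1 = buy, 2 = sell.
    """
    inventory = []
    total_profit = 0.0
    first_iteration = True
    for price in prices:
        if first_iteration:
            # Override the model once so that a later sell has something to sell.
            action = 1
            first_iteration = False
        else:
            action = choose_action(price)
        if action == 1:
            inventory.append(price)
        elif action == 2 and inventory:
            total_profit += price - inventory.pop(0)
    return total_profit


# Made-up prices and a policy that always prefers "sell".
prices = [100.0, 101.5, 99.0, 103.0]
forced_buy_profit = run_episode_with_initial_buy(prices, lambda price: 2)
print(f"Total Profit: ${forced_buy_profit:.2f}")  # prints "Total Profit: $1.50"
```

With the forced buy at 100.0, the always-sell policy can close the position at 101.5, so the episode no longer ends with zero trades. Later sells are still ignored once the inventory is empty again.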

I also had to modify the sigmoid to handle larger numbers and avoid math.exp overflow issues. Unsure if this will help anyone, but it got me moving forward again.
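One common way to avoid the math.exp overflow is to pick the formula based on the sign of the input, so the exponent passed to math.exp is never positive. The `stable_sigmoid` function below is a generic sketch of that technique under that assumption; it is not necessarily the change made in the fork.

```python
import math


def stable_sigmoid(x):
    """Numerically stable logistic function (illustrative, hypothetical name).

    math.exp is only ever called with a non-positive argument, so it can
    underflow to 0.0 but cannot raise OverflowError.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


print(stable_sigmoid(0.0))      # prints 0.5
print(stable_sigmoid(1000.0))   # prints 1.0
print(stable_sigmoid(-1000.0))  # prints 0.0
```

A naive 1 / (1 + math.exp(-x)) raises OverflowError for x = -1000, because math.exp(1000) exceeds the range of a float.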

edwardhdlu / q-trader

No action at all (zero output) when evaluating a training result. #1