Closed · martin0 closed this issue 5 months ago
Looking a bit deeper (no pun intended): when creating the environment, I got a warning which is perhaps relevant.
Cell

```python
trading_environment = gym.make('trading-v0',
                               ticker='AAPL',
                               max_episode_steps=trading_days,
                               trading_days=trading_days,
                               trading_cost_bps=trading_cost_bps,
                               time_cost_bps=time_cost_bps)
trading_environment.seed(42)
```
Output
```
INFO:gymenvs.machine_learning_for_trading.trading_env:gymenvs.machine_learning_for_trading.trading_env logger started.
INFO:gymenvs.machine_learning_for_trading.trading_env:loading data for AAPL...
INFO:gymenvs.machine_learning_for_trading.trading_env:got data for AAPL...
INFO:gymenvs.machine_learning_for_trading.trading_env:None
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 9367 entries, (Timestamp('1981-01-30 00:00:00'), 'AAPL') to (Timestamp('2018-03-27 00:00:00'), 'AAPL')
Data columns (total 10 columns):
0  returns  9367 non-null  float64
1  ret_2    9367 non-null  float64
2  ret_5    9367 non-null  float64
3  ret_10   9367 non-null  float64
4  ret_21   9367 non-null  float64
5  rsi      9367 non-null  float64
6  macd     9367 non-null  float64
7  atr      9367 non-null  float64
8  stoch    9367 non-null  float64
9  ultosc   9367 non-null  float64
dtypes: float64(10)
memory usage: 1.5+ MB
/usr/local/lib/python3.10/dist-packages/gym/core.py:317: DeprecationWarning: WARN: Initializing wrapper in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
  deprecation(
/usr/local/lib/python3.10/dist-packages/gym/wrappers/step_api_compatibility.py:39: DeprecationWarning: WARN: Initializing environment in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
  deprecation(
[42]
```
piplist.txt — listing from the Google Colab command `!pip list`.
Some debug information (for some reason, I can't add a new comment without "Close with comment"??)
Some debug info just before the line `q_values[[self.idx, actions]] = targets`:

```
ipdb> len(rewards)
4096
ipdb> len(not_done)
4096
ipdb> self.gamma
(0.99,)
ipdb> target_q_values
<tf.Tensor: shape=(4096,), dtype=float32, numpy=array([ 0.05242079, -0.00775741, -0.23272878, ...,  0.01323538, -0.13362771, -0.28871116], dtype=float32)>
ipdb> states.shape
(4096, 10)
ipdb> q_values.shape
(4096, 3)
ipdb> actions
array([0, 0, 2, ..., 0, 1, 1])
ipdb> len(actions)
4096
ipdb> self.idx
<tf.Tensor: shape=(4096,), dtype=int32, numpy=array([   0,    1,    2, ..., 4093, 4094, 4095], dtype=int32)>
```
Thought I would try the preceding notebook, but it has this comment:

> See the notebook 04_q_learning_for_trading.ipynb for instructions on upgrading TensorFlow to version 2.2, required by the code below.

I see no information about upgrading to TensorFlow version 2.2. Is that the problem?
Here is a link to a GPT-4 conversation I had about the issue (I haven't tried the suggestion yet): https://chat.openai.com/share/5469961d-7b58-4f8e-a176-a937ae132a3f
(Bard said it was beyond its capabilities at present, even though Google demonstrated competitive coding with Gemini last month.)
Following ChatGPT suggestion 1 (see above), the following mods to experience_replay() seem to make the model runnable :-)
```python
# Reshape targets to a 2D column vector that broadcasts against q_values
targets_np = targets.numpy().reshape(-1, 1)
# Create a one-hot mask for the actions taken
mask = tf.one_hot(actions, self.num_actions)
# Update q_values only at the actions taken
q_values = q_values * (1 - mask) + targets_np * mask
```
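To sanity-check the mask-based update, here is a self-contained NumPy sketch (with `np.eye` standing in for `tf.one_hot`, and small hypothetical shapes) showing that it reproduces the intended per-row assignment:

```python
import numpy as np

def masked_update(q_values, actions, targets):
    """Set q_values[i, actions[i]] = targets[i] via a one-hot mask,
    avoiding item assignment (which tf.Tensors do not support)."""
    num_actions = q_values.shape[1]
    mask = np.eye(num_actions)[actions]   # one-hot rows, shape (n, num_actions)
    return q_values * (1 - mask) + targets.reshape(-1, 1) * mask

q = np.zeros((4, 3))
actions = np.array([0, 2, 1, 2])
targets = np.array([10.0, 20.0, 30.0, 40.0])

updated = masked_update(q, actions, targets)

# Direct indexed assignment, for comparison:
expected = q.copy()
expected[np.arange(4), actions] = targets
print(np.allclose(updated, expected))  # True
```

The multiply-and-add form leaves the other two action columns of each row untouched, which is exactly what the original indexed assignment intended.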
**Describe the bug**
04_q_learning_for_trading, "Train Agent": in `DDQNAgent.experience_replay()`, the line

```python
q_values[[self.idx, actions]] = targets
```

raises

```
ValueError: shape mismatch: value array of shape (4096,) could not be broadcast to indexing result of shape (2,4096,3)
```

Output after approximately 35 steps:

```
10 | 00:00:03 | Agent: -38.6% (-38.6%) | Market: 4.6% ( 4.6%) | Wins: 20.0% | eps: 0.960
```

and after approximately 77 steps:

```
ValueError                                Traceback (most recent call last)
in <cell line: 3>()
     13                              0.0 if done else 1.0)
     14         if ddqn.train:
---> 15             ddqn.experience_replay()
     16         if done:
     17             break

ValueError: shape mismatch: value array of shape (4096,) could not be broadcast to indexing result of shape (2,4096,3)
```
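For reference, the root cause reproduces in plain NumPy (small hypothetical shapes below): with double brackets, `[idx, actions]` is treated as a single `(2, n)` integer index applied to axis 0, so the indexing result has shape `(2, n, 3)` and the `(n,)`-shaped value array cannot be broadcast to it; with single brackets the two arrays are paired element-wise, selecting one scalar per row:

```python
import numpy as np

n, num_actions = 8, 3
q_values = np.zeros((n, num_actions))
idx = np.arange(n)
actions = np.array([0, 1, 2, 0, 1, 2, 0, 1])
targets = np.ones(n)

# Double brackets: [idx, actions] becomes one (2, n) index array for axis 0,
# so the indexing result has shape (2, n, 3) -> the reported shape mismatch.
err = None
try:
    q_values[[idx, actions]] = targets
except ValueError as e:
    err = e
    print('reproduced:', e)

# Single brackets: idx and actions are paired element-wise, selecting one
# scalar per row, so the (n,)-shaped targets assign cleanly.
q_values[idx, actions] = targets
print(q_values[2, 2])  # 1.0
```

This suggests the notebook's line may simply have one pair of brackets too many for the NumPy version in the Colab environment, though I haven't confirmed which version the author wrote it against.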
**To Reproduce**
Note the mod I had to make for observation_space:

```python
self.observation_space = spaces.Box(self.data_source.min_values.to_numpy(),
                                    self.data_source.max_values.to_numpy())
```
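As a side note, the Box bounds here are just the per-column min/max of the feature DataFrame. A minimal pandas-only sketch of deriving them (the `features` frame and its columns are hypothetical stand-ins for the data source's ten indicators; no gym dependency):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the data source's feature table (3 of the 10 columns).
features = pd.DataFrame(
    np.random.default_rng(42).normal(size=(100, 3)),
    columns=['returns', 'rsi', 'macd'],
)

# Per-column bounds; .to_numpy() turns the pandas Series into the plain
# float arrays passed to spaces.Box as its low/high arguments.
min_values = features.min().to_numpy()
max_values = features.max().to_numpy()

print(min_values.shape, max_values.shape)  # (3,) (3,)
```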
Running on Google Colab (where session restarts wipe out the installed talib package). I am attaching the modified notebook, the talib wheel, and the underlying talib libraries. I installed the talib package using instructions from another notebook, "Install Ta-lib on Google colab" (also attached, with modifications that save the built pieces so a full rebuild is not necessary each new session): Copy of Install Ta-lib on Google colab.zip

For the extra things, I have a python folder and a data folder in the root of my Google Drive: talib_wheel_and_lib_and_config.zip, 04_q_learning_for_trading.zip

The python folder also includes the gym environments, specifically the one related to this notebook; the directory python/gymenvs/machine_learning_for_trading should hold the contents of this zip file: trading_env.zip
Steps to reproduce the behavior:
1. Create the necessary data as per the instructions in chapter 2 of the book
2. Run the notebook
**Expected behavior**
Expect the 04_q_learning_for_trading notebook to run as designed by the author (please).
**Environment**
Not using the latest version of the Docker image: Google Colab using a T4 GPU.
**Additional context**
Given the mod I had to make for observation_space, here is a runtime value of the observation_space:

```
Box([ -0.5186916 -13.186786   -9.157841   -6.9791217  -5.2897873  -1.5290436  -5.4077215  -0.6155895  -2.762308   -3.9641087],
    [  0.3321519  11.431712   10.235379    9.135829    8.238228    1.4996951   5.7050333   5.4152718   2.7126348   2.7631414],
    (10,), float32)
```
I will continue to troubleshoot the problem and post any further updates I find. Happy to answer any questions to clarify this post.

Thanks, Martin