-
Not something I'd do for LZ right now, but an interesting idea to push things further:
Someone on the Leela Chess Zero list asked:
>Hello, I still can not understand the reason for using both Po…
-
I was trying to reproduce this work, but I ran into a "No module named 'env'" error. I checked the .gitignore file and found that it contains env/, so I suspect that is where the error comes from.
(…
-
### Describe the problem
The results for R2D2 are quite good: https://openreview.net/forum?id=r1lyTjAqYX
We should add this as a variant of Ape-X DQN that supports recurrent networks. The high-l…
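For anyone new to R2D2 (this is an illustrative sketch, not RLlib code): the "recurrent networks" part means replacing the feed-forward Q-network of Ape-X DQN with one that threads an LSTM hidden state through each replayed sequence. A minimal PyTorch version, with made-up names and sizes, might look like:

```python
# Hedged sketch of a recurrent Q-network of the kind R2D2 uses.
# All names, sizes, and the surrounding training loop are illustrative.
import torch
import torch.nn as nn

class RecurrentQNetwork(nn.Module):
    def __init__(self, obs_dim, num_actions, hidden_size=256):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.q_head = nn.Linear(hidden_size, num_actions)

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, time, obs_dim); hidden_state carries context
        # across the steps of a replayed sequence.
        x = torch.relu(self.encoder(obs_seq))
        x, hidden_state = self.lstm(x, hidden_state)
        q_values = self.q_head(x)  # (batch, time, num_actions)
        return q_values, hidden_state
```

R2D2 additionally stores the recurrent state alongside each sampled sequence and "burns in" the first part of the sequence to refresh the hidden state before computing the loss on the remainder.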
-
The reward plot shown above is decreasing rather than increasing over time. This could be due to the hyperparameters chosen, or to how the state features are preprocessed (a normalization sketch follows the ideas below).
Some ideas to try:
- Preproc…
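One concrete version of the preprocessing point, as my own illustration rather than anything from the issue: normalize each state feature with a running mean and standard deviation so that features on very different scales don't dominate the value estimates. The class and variable names below are hypothetical.

```python
import numpy as np

class RunningNormalizer:
    """Keeps a running mean/variance per state feature and scales observations.
    Illustrative sketch only; how it is wired into the agent is up to the user."""

    def __init__(self, obs_dim, eps=1e-8):
        self.count = eps
        self.mean = np.zeros(obs_dim)
        self.var = np.ones(obs_dim)
        self.eps = eps

    def update(self, obs):
        # Incremental (Welford-style) update of the per-feature statistics.
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.var += (delta * (obs - self.mean) - self.var) / self.count

    def normalize(self, obs):
        return (obs - self.mean) / np.sqrt(self.var + self.eps)

# Usage sketch: call update(state) on each observed state and feed
# normalize(state) to the Q-network instead of the raw features.
```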
-
Hi Denny,
Thanks for this wonderful resource. It's been hugely helpful. Can you say what your results are when training the DQN solution? I've been unable to reproduce the results of the DeepMind p…
-
Hi. Thank you for everything, this is great. I would like to ask whether, based on your work, I can swap the TD3 algorithm for other algorithms such as DQN, DDQN, and so on. I don't know much about this. I …
-
## Describe the bug
For an academic project, I wanted to compare a few versions of DQN:
- Vanilla DQN
- DQN with a target network
- Double DQN (therefore with a target network)
By looking into…
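To make the difference between the three variants concrete (this is my own sketch, not code from the project): the only change between them is how the bootstrap target is computed. The tiny networks and the dummy minibatch below are purely illustrative.

```python
import torch
import torch.nn as nn

# Illustrative setup: a tiny Q-network over 4-dim states and 2 actions.
q_net = nn.Linear(4, 2)
target_net = nn.Linear(4, 2)
target_net.load_state_dict(q_net.state_dict())  # periodically copied in practice

gamma = 0.99
next_states = torch.randn(32, 4)  # dummy sampled minibatch
rewards = torch.randn(32)
dones = torch.zeros(32)

with torch.no_grad():
    # Vanilla DQN: bootstrap with the online network itself (no target net).
    vanilla = rewards + gamma * (1 - dones) * q_net(next_states).max(dim=1).values

    # DQN with a target network: bootstrap with the frozen copy.
    with_target = rewards + gamma * (1 - dones) * target_net(next_states).max(dim=1).values

    # Double DQN: the online net selects the action, the target net evaluates it.
    next_actions = q_net(next_states).argmax(dim=1, keepdim=True)
    double_dqn = rewards + gamma * (1 - dones) * target_net(next_states).gather(1, next_actions).squeeze(1)
```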
-
**Describe the bug**
A brief description of the bug and in which notebook/script it lives.
04_q_learning_for_trading
Train Agent
DDQNAgent.experience_replay()
q_values[[self.idx, action…
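For context on the quoted line, here is my own minimal illustration of the indexing pattern it uses (idx, actions, and targets are made-up names standing in for self.idx, the sampled actions, and the TD targets): each target is written into the row of its sample and the column of the action actually taken, leaving the other action entries untouched before the network is fit.

```python
import numpy as np

batch_size, num_actions = 4, 3
q_values = np.zeros((batch_size, num_actions))   # predicted Q-values for a minibatch

idx = np.arange(batch_size)                      # row index per sample (cf. self.idx)
actions = np.array([2, 0, 1, 2])                 # action taken in each transition
targets = np.array([1.0, 0.5, -0.2, 0.8])        # computed TD targets

# Pairwise indexing: element (i, actions[i]) receives targets[i].
q_values[idx, actions] = targets
```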
-
I tried to run a test case (e.g. /workspaces/mpc-drl-tl/testcases/gym-environments/single-zone/test_action_v1/test_ddqn_tianshou.py) using the cpu3 image, but it gave a permission error message as fol…