-
Dear Owen,
Thanks for your RL_VQC implementation and the accompanying video; great work. The paper showed that RL_VQC works well on the MountainCar and Acrobot environments. I modified your code…
-
For learning purposes I am tuning a number of algorithms on the environment 'MountainCar-v0'. At the moment I am interested in PPO. I intend to share working tuned hyperparameters by putting them on your rep…
-
I have upgraded gym to its latest version.
When I run this code
```python
import gym
env = gym.make('MountainCar-v0')
print(env.action_space.shape)
```
I get the output
```
()
```
…
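For context: gym's `Discrete` space holds a single integer action, so its `shape` is the empty tuple; the number of actions lives in `.n` instead. Below is a minimal stand-in class (not gym's real implementation) illustrating that convention:

```python
# Minimal stand-in for gym's Discrete space (not the real class) to show
# why `shape` is the empty tuple: a discrete action is a single integer,
# i.e. a scalar, so it has no array dimensions.
class Discrete:
    def __init__(self, n):
        self.n = n       # number of available actions
        self.shape = ()  # scalar-valued, hence an empty shape

action_space = Discrete(3)  # MountainCar-v0: 0=push left, 1=no-op, 2=push right
print(action_space.shape)  # ()
print(action_space.n)      # 3
```

So for discrete environments like MountainCar-v0, use `env.action_space.n` rather than `env.action_space.shape` to size a policy's output layer.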
-
```python
import gym

env = gym.make("MountainCar-v0")
env.reset()
done = False
while not done:
    action = 2  # always go right!
    new_state, reward, done, _ = env.step(action)
    env.render()
```
…
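As a side note, always pushing right cannot climb the hill from a standstill; the classic hand-coded baseline instead pushes in the direction the car is already moving, building momentum. A small sketch, assuming the same action encoding as the snippet above:

```python
# Classic energy-pumping heuristic for MountainCar: push in the direction
# of the current velocity, building momentum like pumping a swing.
# Action encoding assumed: 0 = push left, 1 = no-op, 2 = push right.
def energy_pumping_action(velocity):
    return 2 if velocity >= 0 else 0

print(energy_pumping_action(0.01))   # 2 (moving right, keep pushing right)
print(energy_pumping_action(-0.03))  # 0 (moving left, push left)
```

Feeding `energy_pumping_action(new_state[1])` into `env.step` instead of a constant action reliably reaches the flag, which makes it a useful sanity check before training anything.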
-
### 📚 Documentation
A clear and concise description of what should be improved in the documentation:
I have accessed the documentation [here](https://stable-baselines3.readthedocs.io/_/downloads…
-
```
C:\Python\Python37\lib\site-packages\keras_rl-0.4.2-py3.7.egg\rl\agents\dqn.py in __init__(self, model, policy, test_policy, enable_double_dqn, enable_dueling_network, dueling_type, *args, **kwargs)
…
```
-
Thanks for sharing this implementation.
I have a question regarding reward update in Q-learning.
Why do you use a modified reward here:
https://github.com/Pechckin/MountainCar/blob/6754a33eba78ca…
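For what it's worth, a common way to modify MountainCar's sparse −1 reward is potential-based shaping. I don't know whether that repository uses exactly this, so treat the formula below as an illustrative sketch; the potential function (the car's speed) is my own choice, not taken from the linked code:

```python
GAMMA = 0.99

def potential(state):
    # Heuristic potential: the car's speed, a rough proxy for the
    # mechanical energy it needs to escape the valley.
    _position, velocity = state
    return abs(velocity)

def shaped_reward(env_reward, state, next_state, gamma=GAMMA):
    # Potential-based shaping F = gamma*phi(s') - phi(s); this form
    # provably preserves the optimal policy (Ng et al., 1999).
    return env_reward + gamma * potential(next_state) - potential(state)

# Env gives -1; the car sped up from 0.00 to 0.02, so shaping adds a bonus:
print(shaped_reward(-1.0, (-0.5, 0.00), (-0.48, 0.02)))  # ≈ -0.9802
```

Shaping like this makes the learning signal denser without changing which policy is optimal, which is usually the motivation for modifying the reward in tabular Q-learning on this environment.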
-
We should write a more detailed explanation of every environment, in particular, how the reward function is computed.
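As a concrete starting point for such documentation: MountainCar-v0 returns a reward of −1 on every timestep until the episode ends, so the total return is simply the negative episode length and the only incentive is to reach the flag quickly. A tiny illustration (the −110 "solved" threshold mentioned in the comment follows the classic gym convention):

```python
# MountainCar-v0's per-step reward is a constant -1 until the episode
# terminates, so total (undiscounted) return is just -(episode length).
def mountain_car_step_reward():
    return -1.0

episode_length = 110
total_return = sum(mountain_car_step_reward() for _ in range(episode_length))
print(total_return)  # -110.0  (a 110-step solution; shorter is better)
```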
-
Solve & take position
-
This isn't a bug or anything like that, but I wonder if anyone could point me in the right direction.
One can do this:
```
python -m train.py --algo ppo2 --env MountainCar-v0 -n 50000 -optimize --n…
```
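For orientation, the `-optimize` flag in the zoo runs a hyperparameter search (it uses Optuna under the hood). Conceptually, it samples candidate settings, evaluates each, and keeps the best. Here is a toy, dependency-free sketch of that loop with a made-up objective; all names, ranges, and values are illustrative, not tuned results:

```python
import random

def dummy_objective(lr, gamma):
    # Stand-in for "mean episode reward after a short training run";
    # peaks at lr=3e-4, gamma=0.99 purely by construction.
    return -((lr - 3e-4) ** 2) * 1e6 - (gamma - 0.99) ** 2

random.seed(0)
candidates = [
    {"lr": random.uniform(1e-5, 1e-3), "gamma": random.uniform(0.9, 0.9999)}
    for _ in range(50)
]
best = max(candidates, key=lambda p: dummy_objective(**p))
print(best)
```

The real tool is smarter than this random search (Optuna prunes bad trials early and samples adaptively), but the overall shape of the loop is the same.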