dennybritz / reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

http://www.wildml.com/2016/10/learning-reinforcement-learning/

MIT License

20.23k stars, 6k forks

issues

MC Control with Epsilon-Greedy Policies ---Epsilon Value and Best Action prob error

#252 hardik-kansal opened 6 months ago
1
create new file

#251 iw4p closed 11 months ago
0
demystifying-deep-reinforcement-learning link is broken

#250 kiankyars opened 1 year ago
0
Create index.html

#249 dkavargy closed 1 year ago
1
Update README.md

#248 pajjaecat opened 1 year ago
0
please provide requirements.txt or mention the exact version of packages used.

#247 Nahdus opened 1 year ago
0
Issue in: reinforcement-learning/MC/MC Prediction Solution.ipynb

#246 Almujtaba-Yaseen opened 1 year ago
0
Fixed compatibility with current version of OpenAI gym without DiscreteEnv class

#245 arielsboiardi closed 1 year ago
1
Typo in: "Model-Free Prediction & Control with Monte Carlo (MC)" section -> "Blackjack Playground.ipynb" file:

#244 Almujtaba-Yaseen opened 1 year ago
0
A small correction in "MDPs and Bellman Equations" section

#243 Almujtaba-Yaseen opened 1 year ago
0
Modify "v (list) : state value function" to "V"

#242 hslyu opened 2 years ago
0
Hello

#241 simplephi opened 2 years ago
0
Update README.md

#240 hardlyhuman opened 2 years ago
0
Minor Link fix

#239 gitDawn opened 2 years ago
0
Reinforcement learning policy

#238 Comp-Engr18 opened 2 years ago
1
Error 'show() takes 1 positional argument but 2 were given' fixed in plotting.py

#237 Dolores2333 opened 3 years ago
0
DQN Testing Rewards on Atari Games

#236 willtop closed 3 years ago
1
Clarification on DQN testing rewards on Atari games

#235 willtop opened 3 years ago
0
Minor fixes

#234 rafardenas opened 3 years ago
0
update slides

#233 harsh306 opened 3 years ago
1
Lecture Slides need an update

#232 harsh306 opened 3 years ago
0
Monte Carlo AssertionError: defaultdict(<function mc_control_importance_sampling.<locals>.<lambda> at 0x7f31699ffe18>, {}) (<class 'collections.defaultdict'>)

#231 NC25 opened 3 years ago
0
Update DP exercise policy evaluation solution

#230 ugrkm closed 3 years ago
0
Policy Evaluation Exercise Solution Is Wrong

#229 ugrkm closed 3 years ago
1
Delete __init__.py

#228 dtlics closed 3 years ago
1
DQL size error

#227 johan606303 opened 4 years ago
0
added: Double DQN Proportional Prioritized Experience Replay Solution

#226 makaveli10 opened 4 years ago
2
added: DoubleDQN Proportional Prioritized Replay solution

#225 makaveli10 closed 4 years ago
0
Some question in MC Control with Epsilon-Greedy Policies Solution.ipynb

#224 josephbak closed 4 years ago
2
Gambler's Problem: 0 Stake Allowed?

#223 mparigi opened 4 years ago
0
why DQN use kernel size 8 ?

#222 opentld opened 4 years ago
0
Why is Chapter 11 excluded?

#221 BedirT opened 4 years ago
2
Is a line missing in 'MC Control with Epsilon-Greedy Policies Solution.ipynb'?

#220 Ritz111 opened 4 years ago
1
Why CliffWalkingEnv returns 'is_done=True' when reaching cliff?

#219 wakamori closed 4 years ago
2
Can an agent learn valid actions offline, being able to choose only actions that were already taken (e.g. from historical data) ? [question]

#218 VieVaWaldi opened 4 years ago
6
Deep Q Learning, neither works with tensorflow 1.x nor with tensorflow 2.x

#217 azharsalman opened 4 years ago
1
Update README.md

#216 roshray closed 4 years ago
1
Mdp branch

#215 csxiang18 closed 4 years ago
0
a test pull req (corrected few typos)

#214 nsydn closed 4 years ago
1
Could anyone show me reason why use 4 same grayscale frames when training DQN?

#213 roachsinai closed 4 years ago
1
Policy iteration solution only show 1 optimal solution

#212 duongnhatthang opened 4 years ago
2
log

#208 Mahsa-Bastankhah opened 4 years ago
0
Exercise notebooks with no outputs.

#207 avullo opened 4 years ago
0
Add Links to Deepnote

#206 jirkalhotka opened 4 years ago
0
Test the policy in "Value Iteration" exercise

#205 link2xt opened 5 years ago
1
Provided policy_improvement() solution initializes values to zero for each iteration

#204 link2xt opened 5 years ago
2
Provided policy_improvement() solution is not guaranteed to terminate

#203 link2xt opened 5 years ago
1
policy_improvement() should be renamed to policy_iteration()

#202 link2xt opened 5 years ago
0
Update CliffWalk REINFORCE with Baseline Solution.ipynb

#201 guotong1988 closed 5 years ago
1
Vanilla REINFORCE implementation

#200 alek5k opened 5 years ago
2