dennybritz/reinforcement-learning
Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
http://www.wildml.com/2016/10/learning-reinforcement-learning/
MIT License
20.23k stars, 6k forks
issues
MC Control with Epsilon-Greedy Policies: Epsilon Value and Best Action prob error
#252
hardik-kansal
opened
6 months ago
1
create new file
#251
iw4p
closed
11 months ago
0
demystifying-deep-reinforcement-learning link is broken
#250
kiankyars
opened
1 year ago
0
Create index.html
#249
dkavargy
closed
1 year ago
1
Update README.md
#248
pajjaecat
opened
1 year ago
0
please provide requirements.txt or mention the exact version of packages used.
#247
Nahdus
opened
1 year ago
0
Issue in: reinforcement-learning/MC/MC Prediction Solution.ipynb
#246
Almujtaba-Yaseen
opened
1 year ago
0
Fixed compatibility with current version of OpenAI gym without DiscreteEnv class
#245
arielsboiardi
closed
1 year ago
1
Typo in "Model-Free Prediction & Control with Monte Carlo (MC)" section, "Blackjack Playground.ipynb" file
#244
Almujtaba-Yaseen
opened
1 year ago
0
A small correction in "MDPs and Bellman Equations" section
#243
Almujtaba-Yaseen
opened
1 year ago
0
Modify "v (list) : state value function" to "V"
#242
hslyu
opened
2 years ago
0
Hello
#241
simplephi
opened
2 years ago
0
Update README.md
#240
hardlyhuman
opened
2 years ago
0
Minor Link fix
#239
gitDawn
opened
2 years ago
0
Reinforcement learning policy
#238
Comp-Engr18
opened
2 years ago
1
Error 'show() takes 1 positional argument but 2 were given' fixed in plotting.py
#237
Dolores2333
opened
3 years ago
0
DQN Testing Rewards on Atari Games
#236
willtop
closed
3 years ago
1
Clarification on DQN testing rewards on Atari games
#235
willtop
opened
3 years ago
0
Minor fixes
#234
rafardenas
opened
3 years ago
0
update slides
#233
harsh306
opened
3 years ago
1
Lecture Slides need an update
#232
harsh306
opened
3 years ago
0
Monte Carlo AssertionError: defaultdict(<function mc_control_importance_sampling.<locals>.<lambda> at 0x7f31699ffe18>, {}) (<class 'collections.defaultdict'>)
#231
NC25
opened
3 years ago
0
Update DP exercise policy evaluation solution
#230
ugrkm
closed
3 years ago
0
Policy Evaluation Exercise Solution Is Wrong
#229
ugrkm
closed
3 years ago
1
Delete __init__.py
#228
dtlics
closed
3 years ago
1
DQL size error
#227
johan606303
opened
4 years ago
0
added: Double DQN Proportional Prioritized Experience Replay Solution
#226
makaveli10
opened
4 years ago
2
added: DoubleDQN Proportional Prioritized Replay solution
#225
makaveli10
closed
4 years ago
0
Some question in MC Control with Epsilon-Greedy Policies Solution.ipynb
#224
josephbak
closed
4 years ago
2
Gambler's Problem: 0 Stake Allowed?
#223
mparigi
opened
4 years ago
0
Why does DQN use kernel size 8?
#222
opentld
opened
4 years ago
0
Why is Chapter 11 excluded?
#221
BedirT
opened
4 years ago
2
Is a line missing in 'MC Control with Epsilon-Greedy Policies Solution.ipynb'?
#220
Ritz111
opened
4 years ago
1
Why CliffWalkingEnv returns 'is_done=True' when reaching cliff?
#219
wakamori
closed
4 years ago
2
Can an agent learn valid actions offline, being able to choose only actions that were already taken (e.g. from historical data)? [question]
#218
VieVaWaldi
opened
4 years ago
6
Deep Q Learning works with neither TensorFlow 1.x nor TensorFlow 2.x
#217
azharsalman
opened
4 years ago
1
Update README.md
#216
roshray
closed
4 years ago
1
Mdp branch
#215
csxiang18
closed
4 years ago
0
a test pull req (corrected few typos)
#214
nsydn
closed
4 years ago
1
Could anyone explain why 4 identical grayscale frames are used when training DQN?
#213
roachsinai
closed
4 years ago
1
Policy iteration solution only shows 1 optimal solution
#212
duongnhatthang
opened
4 years ago
2
log
#208
Mahsa-Bastankhah
opened
4 years ago
0
Exercise notebooks with no outputs.
#207
avullo
opened
4 years ago
0
Add Links to Deepnote
#206
jirkalhotka
opened
4 years ago
0
Test the policy in "Value Iteration" exercise
#205
link2xt
opened
5 years ago
1
Provided policy_improvement() solution initializes values to zero for each iteration
#204
link2xt
opened
5 years ago
2
Provided policy_improvement() solution is not guaranteed to terminate
#203
link2xt
opened
5 years ago
1
policy_improvement() should be renamed to policy_iteration()
#202
link2xt
opened
5 years ago
0
Update CliffWalk REINFORCE with Baseline Solution.ipynb
#201
guotong1988
closed
5 years ago
1
Vanilla REINFORCE implementation
#200
alek5k
opened
5 years ago
2