ShangtongZhang reinforcement-learning-an-introduction issues

ShangtongZhang / reinforcement-learning-an-introduction

Python Implementation of Reinforcement Learning: An Introduction

MIT License

13.45k stars 4.81k forks

issues

Fix usable_ace_player bug, fix indentation error, set POLICY_PLAYER dty…

#115 goal closed 5 years ago
1
How to formulate a problem where the state is a combination of multiple factors?

#114 MJeremy2017 closed 5 years ago
1
Chapter 4: seems to be missing self. before TRUNCATE

#113 ZiqiChai closed 5 years ago
1
Chapter 2: reset time

#112 sursu closed 5 years ago
2
Pythonic edits

#111 billtubbs closed 5 years ago
1
Choosing the best action when identical

#110 sursu closed 5 years ago
0
modification for chap04

#109 wlbksy closed 5 years ago
1
simplify update equations with respect to the book

#108 wlbksy closed 5 years ago
1
pythonic for chap01

#107 wlbksy closed 5 years ago
1
Fixed a few minor issues in chapter 1 tic_tac_toe

#106 ainilaha closed 5 years ago
1
Fixed epsilon value for exploration

#105 abhinavsagar closed 5 years ago
2
epsilon not initialized

#104 abhinavsagar closed 5 years ago
1
Maybe a little bug in chapter5 blackjack.py function 'play' line 81-85

#103 Huixxi closed 5 years ago
1
Policy evaluation with backed up value function.

#102 tahsinkose closed 5 years ago
1
Chapter 09: Random Walk 100

#101 xenomeno closed 5 years ago
1
Question about batch_updating function in chapter06/random_walk.py

#100 hitblackjack closed 5 years ago
1
Would it be OK to publish solutions to the programming exercises alongside mainly the algorithms I intend to implement from the book?

#99 brancoliticus closed 5 years ago
1
Missing parameter description for true_reward

#98 michaelshiyu closed 5 years ago
2
Made the epsilon-greedy bandit algorithm break ties at random.

#97 michaelshiyu closed 5 years ago
2
Just a Thank you note

#96 wassimseif closed 5 years ago
0
Chapter01 - Fix lint messages, add parameter to reduce frequency of logging

#95 VVKot closed 5 years ago
1
Why not use true online Sarsa(λ) in Figure 12.11?

#94 xingE650 closed 5 years ago
1
Chapter 4 jacks car rental

#93 HareshKarnan closed 5 years ago
2
_

#92 hitblackjack closed 5 years ago
1
Problem with how the TD and MC methods update the last state value in an MRP

#91 xingE650 closed 5 years ago
1
Fix the Blackjack dynamics to correctly handle receiving an ace while having a usable ace already.

#90 kevindoran closed 5 years ago
1
chapter2_content.tex exercise 2.3 problem

#89 RocStone closed 5 years ago
1
action index should offset by one

#88 barcahead closed 5 years ago
1
Some revision suggestions in Maximization_bias's Problem

#87 LBAWMY closed 5 years ago
1
Some revision suggestions in Maximization_bias's Problem

#86 LBAWMY closed 5 years ago
1
Add docker files to configure runtime environment

#85 YangyangFu closed 5 years ago
2
Q-learning Example Has No @expected

#84 LinaeSostra closed 6 years ago
1
break ties in Gambler's Problem

#83 hansweytjens closed 6 years ago
1
Question about gradient in differential semi-gradient Sarsa

#82 HusseinAlmulla closed 6 years ago
5
Chapter 04: CarRental.py - suggestions for realRentalFirst/SecondLoc fix

#81 ychong closed 6 years ago
2
Chapter 8: Backup updates for Prioritized Sweeping vs Dyna-Q

#80 xenomeno closed 6 years ago
4
Chapter 1, TicTacToe.py: Purpose of reshape function?

#79 pk97 closed 6 years ago
1
Chapter 3: GridWorld

#78 ychong closed 6 years ago
2
Chapter 5: Monte Carlo ES initial policy

#77 jerome-white closed 6 years ago
5
Chapter 13, REINFORCE

#76 sergii-bond closed 6 years ago
1
Add in place version of Chapter 4

#75 JustinNie closed 6 years ago
1
Chapter4 - Suggestion

#74 JustinNie closed 6 years ago
1
add example 13.1

#73 sergii-bond closed 6 years ago
1
Chapter 6: Random Walk --> Infinite loop

#72 xenomeno closed 6 years ago
1
One bug on the MountainCar.py in the folder Chapter12

#71 MathematicalModels closed 6 years ago
1
chapter5

#70 tinglo closed 6 years ago
5
question about implementation of dealer's part in blackjack.py

#69 shining-spring closed 6 years ago
1
Policy evaluation for GridWorld issue #67

#68 cbrom closed 6 years ago
0
Policy evaluation for GridWorld

#67 cbrom closed 6 years ago
4
chapter13

#66 liiiiiiiiil closed 6 years ago
3