ShangtongZhang / reinforcement-learning-an-introduction

Python Implementation of Reinforcement Learning: An Introduction
MIT License

chapter4 gamblers_problem, showing multiple best actions #158

Open itschenxi opened 1 year ago

itschenxi commented 1 year ago

I suggest you make the following changes to chapter4 gamblers_problem to show multiple best actions:

    x_axis = []
    y_axis = []
    ......

line 63

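    # Round to 5 decimals so near-equal returns count as ties; skip action 0 (stake nothing).
    rounded = np.round(action_returns[1:], 5)
    # Indices into `actions` of every action that attains the maximum return.
    max_values = np.where(rounded == np.amax(rounded))[0] + 1
    # Record one (state, action) point per tied best action.
    x_axis.extend([state] * len(max_values))
    y_axis.extend(actions[i] for i in max_values)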

    ......
    plt.scatter(x_axis, y_axis)
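
For anyone who wants to try the tie-detection step on its own, here is a minimal standalone sketch. The helper name `best_actions`, its `decimals` parameter, and the toy return values are assumptions made up for illustration; they are not part of the repository's `gamblers_problem` code. It assumes at least one non-zero action is available, which holds for every non-terminal state in the gambler's problem.

```python
import numpy as np


def best_actions(actions, action_returns, decimals=5):
    """Return every action (excluding action 0) whose rounded return ties for the maximum.

    Hypothetical helper for illustration; assumes len(action_returns) >= 2.
    """
    # Drop action 0 and round so that floating-point noise does not break ties.
    rounded = np.round(np.asarray(action_returns[1:], dtype=float), decimals)
    # Positions of all maxima, shifted by 1 to index back into `actions`.
    best = np.flatnonzero(rounded == rounded.max()) + 1
    return np.asarray(actions)[best]


# Toy data (made up): actions 1 and 3 tie once rounded to 5 decimals.
actions = np.arange(4)
action_returns = [0.0, 0.5, 0.2, 0.5000001]
ties = best_actions(actions, action_returns)
```

Calling this once per state and appending `[state] * len(ties)` to `x_axis` and `ties` to `y_axis` should give the same points that the suggested change passes to `plt.scatter`.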