I suggest making the following changes to chapter4 gamblers_problem so that the plot shows every best action for a state when several actions tie:
x_axis = []
y_axis = []
......
line 63
max_values = np.where(np.round(action_returns[1:], 5)==np.amax(np.round(action_returns[1:], 5)))[0]+1
x_axis.extend([state]*len(max_values))
for i in max_values:
    y_axis.extend([actions[i]])
......
plt.scatter(x_axis, y_axis)
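
To see the tie-finding step on its own, here is a minimal sketch outside the repository code. The helper name `best_actions` and the sample numbers are hypothetical, made up only for illustration; the rounding to 5 decimals and the `[1:]` slice (skipping index 0) follow the snippet above.

```python
import numpy as np


def best_actions(actions, action_returns, decimals=5):
    """Return every action whose rounded return ties for the maximum.

    Index 0 is skipped, mirroring the `[1:]` slice in the suggested change.
    """
    rounded = np.round(np.asarray(action_returns)[1:], decimals)
    # Positions of all ties within the slice; +1 undoes the slice offset.
    tied = np.flatnonzero(rounded == rounded.max()) + 1
    return [actions[i] for i in tied]


# Hypothetical values for a single state, chosen only for illustration.
actions = np.arange(6)
action_returns = [0.0, 0.4, 0.25, 0.4, 0.1, 0.4000001]
result = best_actions(actions, action_returns)
print(result)
```

Rounding before comparing means returns that differ only by floating-point noise count as ties, so a state can contribute several points to the scatter plot instead of a single one.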