ai4pharma closed this issue 4 years ago
There could be a tie.
As shown below, Line 74 always chooses the first element via [0]
when there is a tie, so the input to np.random.choice
is an int instead of an array.
np.random.choice(np.where(self.q_estimation == q_best)[0])
You can set a breakpoint and run the code to see what happens.
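For reference, here is a minimal sketch of what such a check would likely show. The `q_estimation` values are hypothetical and not taken from the repository. The names `where_result` and `tied_indices` are introduced only for illustration.

```python
import numpy as np

# Hypothetical estimates; actions 1 and 3 tie for the best value.
q_estimation = np.array([0.2, 0.9, 0.5, 0.9])
q_best = np.max(q_estimation)

where_result = np.where(q_estimation == q_best)
# Called with only a condition, np.where returns a tuple holding
# one index array per dimension of the input.
print(type(where_result).__name__, len(where_result))  # tuple 1

tied_indices = where_result[0]
# [0] unpacks the index array for the only dimension;
# it does not pick the first tied action.
print(tied_indices)  # [1 3]
```

Under these assumptions, np.random.choice receives an array of all tied indices, so it picks uniformly among them.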
Why is np.random.choice still used in Lines 66 and 74? argmax should already return a single definite index. Thanks.
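On the argmax question, the sketch below compares the two approaches when several actions share the best estimate. It assumes hypothetical all-zero estimates for four actions, and it uses a seeded `np.random.default_rng` generator in place of the `np.random.choice` call from the issue, purely so the illustration is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, used only for this illustration
q_estimation = np.zeros(4)      # hypothetical: all four actions tied

# np.argmax returns the first index of the maximum,
# so every tie resolves to action 0.
argmax_picks = [int(np.argmax(q_estimation)) for _ in range(1000)]

# Random tie-breaking draws from every index that attains the maximum.
tied = np.where(q_estimation == np.max(q_estimation))[0]
random_picks = [int(rng.choice(tied)) for _ in range(1000)]

print(set(argmax_picks))          # {0}
print(sorted(set(random_picks)))  # which tied actions were ever picked
```

So argmax does give a definite number, but a tie always goes to the lowest index. Choosing at random among the tied indices avoids that bias toward the first action.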