kkshyu closed 8 years ago
I agree that `chooseAction` may be more appropriate, since it avoids confusion with the method in `Environment`. However, I don't think `chooseBestAction` would be appropriate, because the policies have to deal with the exploration/exploitation dilemma, which is done within that method.
Did you mean that `chooseBestAction` already deals with the exploration/exploitation dilemma? I thought the dilemma should be handled in `act` (`chooseAction`).
In my opinion, the method `bestAction` in policies is similar to `chooseBestAction` in the q-network. Actually, I don't understand the purpose of this method.
Ok, I think I misunderstood your first message. Either we rename `act` ---> `chooseAction` and `bestAction` ---> `chooseBestAction`, or we keep `bestAction` and change `act` ---> `action`. (Of course, `chooseBestAction` or `bestAction` has already dealt with exploration/exploitation.)
And indeed, `bestAction` in policies currently just returns `chooseBestAction` from the q-network. It's an additional layer of encapsulation, but if it's not required in the future, the best option is probably to remove it.
Thanks for the feedback! Either you can do a PR or I'll make the changes.
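To make the naming discussion concrete, here is a minimal sketch of the second option (keep `bestAction`, rename `act` to `action`). It is not the project's actual code: the `QNetwork` stub, the `EpsilonGreedyPolicy` class, its constructor parameters, and the epsilon-greedy rule are all assumptions made for illustration. It places exploration in `action` and has `bestAction` simply forward to the q-network's `chooseBestAction`, which is the extra encapsulation mentioned above.

```python
import random


class QNetwork:
    """Hypothetical stand-in for a learned Q-function (not the real class)."""

    def __init__(self, q_values):
        # q_values maps each state to a list of Q-values, one per action.
        self._q_values = q_values

    def chooseBestAction(self, state):
        # Purely greedy: index of the highest Q-value for this state.
        values = self._q_values[state]
        return max(range(len(values)), key=values.__getitem__)


class EpsilonGreedyPolicy:
    """Hypothetical policy illustrating the proposed method names."""

    def __init__(self, q_network, n_actions, epsilon, rng=None):
        self._q_network = q_network
        self._n_actions = n_actions
        self._epsilon = epsilon
        self._rng = rng if rng is not None else random.Random()

    def bestAction(self, state):
        # Thin wrapper: forwards directly to the q-network.
        return self._q_network.chooseBestAction(state)

    def action(self, state):
        # Exploration/exploitation trade-off (assumed to live here):
        # explore with probability epsilon, otherwise exploit.
        if self._rng.random() < self._epsilon:
            return self._rng.randrange(self._n_actions)
        return self.bestAction(state)
```

With `epsilon=0.0` the policy always returns the greedy action, so `action` and `bestAction` agree; with a larger `epsilon` only `action` changes behavior, which is one argument for keeping the two names distinct.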
`Environment` and `Policy` both contain an `act` method, but they do quite different things. In my opinion, `act` is a verb meaning to perform something. Therefore, in the `Policy` abstract class, it should be the noun `action`, just like `bestAction`. However, `chooseAction` and `chooseBestAction` are good, too.