Closed malikatamne closed 5 years ago
The way I understand it, Policy Gradient is a reinforcement learning approach, but not Q-Learning, since it doesn't estimate a Q value. In this PR, the policy gradient code is inside the q_learning package. Consider making a copy of the q_learning directory that is called policy_gradient. That way we can keep them separate and they'll work independently. We could then, in a separate PR, create another package for shared code.
Alternatively, we could keep them in the same directory and rename q_learning to reinforcement_learning.
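To illustrate the distinction drawn above, here is a minimal sketch of the two update rules side by side. It is not the code from this PR: the function names, the tabular representation, and the default hyperparameters are all assumptions chosen for illustration. The Q-learning step maintains and bootstraps from a table of Q values, while the REINFORCE-style policy gradient step adjusts action preferences directly and never stores a Q value.

```python
import math


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Tabular Q-learning step (hypothetical helper, not from this PR).

    Moves Q(state, action) toward reward + gamma * max_a Q(next_state, a).
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_update(theta, state, action, episode_return, alpha=0.1):
    """REINFORCE-style step for a softmax policy (hypothetical helper).

    theta[state][a] are action preferences; no Q table is estimated.
    For a softmax policy, d log pi(action | state) / d theta[state][a]
    equals 1{a == action} - pi(a | state).
    """
    probs = softmax(theta[state])
    for a in range(len(probs)):
        grad_log = (1.0 if a == action else 0.0) - probs[a]
        theta[state][a] += alpha * episode_return * grad_log
```

Both functions would fit under a package named reinforcement_learning, but only the first one is Q-learning, which is the naming concern raised here.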
Since separating the two packages would require a fair amount of work, I would suggest that we rename q_learning to reinforcement_learning, as @marian42 suggests. What do you think, @malikatamne?
Yes, you are right @marian42, I will rename the directory q_learning to reinforcement_learning.
Created a new launch file for the policy gradient approach. Formatted the code.