jsphon closed this issue 7 years ago
The current generate experience function uses a greedy, reward-following policy.
Update the class so that it can also use an epsilon-greedy policy.
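
For illustration, here is a minimal sketch of epsilon-greedy action selection. It is not taken from the project: the function name `epsilon_greedy_action`, its parameters, and the assumption that actions are indexed by position in a sequence of estimated rewards are all hypothetical.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy_action(
    action_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Return an action index chosen by an epsilon-greedy rule.

    Hypothetical helper: `action_values` holds one estimated reward per action.
    """
    if len(action_values) == 0:
        raise ValueError("action_values must not be empty")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    # Fall back to the module-level generator when no seeded one is given.
    rng = rng or random
    # Explore: with probability epsilon, pick any action uniformly at random.
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    # Exploit: otherwise pick the action with the highest estimated reward
    # (ties go to the lowest index).
    return max(range(len(action_values)), key=lambda i: action_values[i])
```

With `epsilon=0.0` this reduces to the current greedy behaviour, so a class could expose `epsilon` as an optional parameter defaulting to zero without changing existing results.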