Status: Open. IverYangg opened 2 years ago
The code can train many policies, but how do I choose the best one? In other words, what criterion can be used to judge a trained policy? Thanks!