Closed — kaixindelele closed this issue 3 years ago
I'm not so sure. The code was written a year ago and I did test them then.
Your question is a little bit opaque. Which environment do you test on? And what's the performance of the agent with per and a uniform replay, respectively?
btw, there is no guarantee that PER is better than a uniform replay. It can perform worse than the uniform one on some environments.
I tested on HalfCheetah-v2, and the performance is shown at https://blog.csdn.net/hehedadaq/article/details/111600080#_240 PER really can perform worse than uniform replay on some environments or with some RL algorithms.
If you refer to the original paper, PER indeed makes things worse in some cases. Overall, it's better than a uniform replay. Moreover, PER introduces several new hyperparameters, which may require further fine-tuning for new environments.
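To make the extra hyperparameters concrete, here is a minimal sketch of proportional PER (not the code from this repo): `alpha` controls how strongly priorities skew sampling, `beta` controls the importance-sampling correction, and `eps` keeps priorities nonzero. All names here are illustrative assumptions.

```python
import numpy as np

class SimplePER:
    """Minimal proportional prioritized replay buffer (illustrative sketch).

    Extra hyperparameters PER adds over a uniform replay:
      alpha: priority exponent (0 = uniform sampling)
      beta:  importance-sampling correction (usually annealed toward 1)
      eps:   small constant so no transition gets zero priority
    """
    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity = capacity
        self.alpha, self.beta, self.eps = alpha, beta, eps
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current max priority so they are
        # sampled at least once before their TD error is known.
        max_p = self.priorities[:len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        p = self.priorities[:len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias introduced
        # by non-uniform sampling; normalized by the max for stability.
        weights = (len(self.data) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        # Priorities are typically set to |TD error| + eps after each update.
        self.priorities[idx] = np.abs(td_errors) + self.eps
```

In practice a sum-tree is used so sampling and priority updates run in O(log n) rather than O(n), but the hyperparameters above are what you would need to tune per environment.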
Yes, that's unfortunate for my use case.
Hi, Merry Christmas! Thank you for sharing the model-free RL library. Recently, I've been interested in using PER with continuous-control RL algorithms. However, I found that the performance of TD3+PER and SAC+PER is not good. I don't know whether the problem is in my code or whether these two algorithms genuinely don't work well with PER. So I'd like to ask for your help regarding the performance of your code with PER. Have you ever evaluated it? Thank you!