This paper develops a novel rating-based reinforcement learning approach that
uses human ratings to obtain human guidance in reinforcement learning.
Different from the existing preference-based and ranking-based reinforcement
learning paradigms, based on human relative preferences over sample pairs, the
proposed rating-based reinforcement learning approach is based on human
evaluation of individual trajectories without relative comparisons between
sample pairs. The rating-based reinforcement learning approach builds on a new
prediction model for human ratings and a novel multi-class loss function. We
conduct several experimental studies based on synthetic ratings and real human
ratings to evaluate the effectiveness and benefits of the new rating-based
reinforcement learning approach.
Rating-based Reinforcement Learning. (arXiv:2307.16348v2 [cs.LG] UPDATED)
https://ift.tt/bQApWYH
This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Different from the existing preference-based and ranking-based reinforcement learning paradigms, based on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach is based on human evaluation of individual trajectories without relative comparisons between sample pairs. The rating-based reinforcement learning approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies based on synthetic ratings and real human ratings to evaluate the effectiveness and benefits of the new rating-based reinforcement learning approach.
via cs.RO updates on arXiv.org http://arxiv.org/