Information

Authors: Yikun Xian, Zuohui Fu, S. Muthukrishnan, Gerard de Melo, Yongfeng Zhang
Date: July 2019
Published By: SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

Link

Uses reinforcement learning to search the knowledge graph for the best items to recommend, together with the paths that lead to each recommended item.
Beam search-based algorithm: efficiently samples diverse reasoning paths and candidate item sets for recommendation. <- how effective is it?
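
To make the beam search note concrete, here is a minimal sketch of path sampling over a toy graph. Everything in it is an assumption for illustration: the graph, the relation names, the `toy_score` function (a stand-in for the learned policy), and the choice to expand each partial path with its top-k actions per hop. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

# Toy knowledge graph: entity -> list of (relation, neighbor).
# All entity and relation names are hypothetical.
Graph = Dict[str, List[Tuple[str, str]]]
Path = List[Tuple[str, str]]  # [(relation, entity), ...], first step is ("self", user)


def beam_search(
    graph: Graph,
    user: str,
    score: Callable[[Path, str, str], float],
    beam_sizes: List[int],
) -> List[Tuple[Path, float]]:
    """Expand paths hop by hop, keeping the top-k actions of each partial path.

    beam_sizes[t] is the number of actions kept per path at hop t.
    Returns (path, accumulated score) pairs that completed every hop.
    """
    paths: List[Tuple[Path, float]] = [([("self", user)], 0.0)]
    for k in beam_sizes:
        candidates: List[Tuple[Path, float]] = []
        for path, total in paths:
            current = path[-1][1]
            visited = {entity for _, entity in path}
            # Valid actions: outgoing edges to entities not yet on this path.
            actions = [(r, e) for r, e in graph.get(current, []) if e not in visited]
            # Rank actions by the (assumed) scoring function, best first.
            actions.sort(key=lambda a: score(path, a[0], a[1]), reverse=True)
            for r, e in actions[:k]:
                candidates.append((path + [(r, e)], total + score(path, r, e)))
        paths = candidates
    return paths


# Hypothetical toy data and scorer, only for trying the function out.
toy_graph: Graph = {
    "user_a": [("purchase", "item_1"), ("mention", "word_x")],
    "item_1": [("purchased_by", "user_b"), ("described_by", "word_x")],
    "word_x": [("describes", "item_2"), ("describes", "item_1")],
    "user_b": [("purchase", "item_2")],
}


def toy_score(path: Path, relation: str, entity: str) -> float:
    # Stand-in for a learned policy: prefer stepping onto item entities.
    return 1.0 if entity.startswith("item") else 0.5


if __name__ == "__main__":
    for path, total in beam_search(toy_graph, "user_a", toy_score, [2, 1, 1]):
        print(" -> ".join(entity for _, entity in path), total)
```

Paths that reach a dead end before the last hop are dropped, and paths that end on a non-item entity would still need to be filtered out before building the candidate item set.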

Agent: Discovers suitable items and the paths to them. It starts from a user and conducts explicit multi-step path reasoning over the graph.
Environment: knowledge graph
Rewards: Only already-purchased items appear in the KG as interactions, and there is no pre-known target item, so a binary reward indicating whether the agent has reached a target is infeasible. Instead, rewards are computed by a separate (multi-hop) scoring function f(.) that generates soft rewards: 0 if the current entity is not an item; otherwise dot<user embed, target item embed> divided by a scaling term, max over items of dot<user embed + purchased item embed, item embed>.
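
The soft reward can be sketched as below. This is a sketch under stated assumptions, not the paper's exact formula: it assumes the same score, dot<user embed + purchase embed, item embed>, is used in both the numerator and the scaling term, it clamps the result at 0, and it guards against a non-positive maximum. The function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def soft_reward(
    user_vec: Sequence[float],
    purchase_vec: Sequence[float],
    item_vecs: Sequence[Sequence[float]],
    current_item: Optional[int],
) -> float:
    """Soft reward for the entity the agent currently stands on.

    current_item is the index of the item in item_vecs, or None when the
    current entity is not an item (reward 0 in that case).
    """
    if current_item is None:
        return 0.0
    # Assumption: the query is the user embedding shifted by the purchase embedding.
    query = [u + p for u, p in zip(user_vec, purchase_vec)]
    scores = [dot(query, item) for item in item_vecs]
    best = max(scores)
    if best <= 0:
        # Guard (assumption): no item scores positively, so give no reward.
        return 0.0
    # Scale by the best-scoring item and clamp at 0 (assumption) so the
    # reward stays in [0, 1].
    return max(0.0, scores[current_item] / best)
```

With this scaling, the best-scoring item for a user gets reward 1, and every other item gets a fraction of that, which gives the agent a graded signal even though no single target item is known.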