xwhan / DeepPath

code and docs for my EMNLP paper "DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning"

Retraining code is a little different from the algorithm description. #24

Open zdh2292390 opened 3 years ago

zdh2292390 commented 3 years ago

In policy_agent.py, why does the retraining code run BFS teacher-guided training after the agent fails? This is not the same as the algorithm description. Does this mean BFS is the upper bound of the RL agent?
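To make the question concrete, here is a rough sketch of the pattern I mean: the agent tries an episode on its own, and only if it fails is a BFS path used as a teacher demonstration. This is a toy illustration, not the code in policy_agent.py. The graph, the random-walk stand-in for the policy, and the names `bfs_path`, `rollout`, and `collect_episode` are all hypothetical.

```python
import random
from collections import deque

# Hypothetical toy graph: node -> list of successor nodes.
GRAPH = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"], "e": []}


def bfs_path(graph, start, goal):
    """Return a shortest path from start to goal as a list of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def rollout(graph, start, goal, max_steps, rng):
    """Stand-in for the learned policy (assumption): a uniform random walk.

    Returns the visited path on success, or None if the walk hits a dead end
    or runs out of steps before reaching the goal.
    """
    path = [start]
    for _ in range(max_steps):
        if path[-1] == goal:
            return path
        choices = graph[path[-1]]
        if not choices:
            return None
        path.append(rng.choice(choices))
    return path if path[-1] == goal else None


def collect_episode(graph, start, goal, max_steps, rng):
    """Try the agent first; fall back to a BFS teacher path only on failure.

    Returns (path, source) with source in {"agent", "teacher", "none"}.
    """
    path = rollout(graph, start, goal, max_steps, rng)
    if path is not None:
        return path, "agent"
    teacher = bfs_path(graph, start, goal)
    if teacher is not None:
        return teacher, "teacher"
    return None, "none"
```

In this sketch, a path tagged `"teacher"` would be the one fed to a supervised update, while a path tagged `"agent"` would get the usual reward-based update. Whether the repository does exactly this is what I am asking about.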