Jasonxu1225 closed this 1 year ago
I find that I was wrong. Sorry.
After further consideration, I think the notebook version may be the correct one, namely using K samples in get_action. I have modified the script version so that IQN calls get_action with K.
I have started a new PR. Please check it. Thank you very much!
According to the paper, K samples should be used to select the action. So I think the script version is wrong, because it uses N in get_action instead of K (although the paper says that IQN is not sensitive to the value of K). I have modified IQN so that get_action uses K, which corresponds to the paper.
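To illustrate the difference, here is a minimal sketch of action selection with K samples. It is not the code from this repository: `toy_quantile_values` is a hypothetical stand-in for the IQN network, the values of `N` and `K` are made up, and the `get_action` signature is an assumption. The idea is that K quantile fractions are drawn uniformly from [0, 1], the quantile values are averaged per action, and the greedy action is taken.

```python
import random

N = 8   # number of samples for the training loss (hypothetical value)
K = 32  # number of samples for action selection (hypothetical value)


def toy_quantile_values(state, tau):
    """Hypothetical stand-in for the IQN network.

    Returns one quantile value per action for the fraction tau:
    action 0 has returns uniform on [0, 1], action 1 on [0.5, 1.5].
    """
    return [tau, 0.5 + tau]


def get_action(state, quantile_fn, num_samples=K, rng=random):
    """Pick the greedy action from the mean of num_samples quantile values."""
    # Draw the quantile fractions tau_1..tau_K from U([0, 1]).
    taus = [rng.random() for _ in range(num_samples)]
    # One list of per-action quantile values for each sampled tau.
    per_tau = [quantile_fn(state, tau) for tau in taus]
    num_actions = len(per_tau[0])
    # Averaging over the sampled taus approximates Q(state, action).
    q_values = [
        sum(values[a] for values in per_tau) / num_samples
        for a in range(num_actions)
    ]
    # Greedy action with respect to the approximated Q-values.
    return max(range(num_actions), key=q_values.__getitem__)


action = get_action(state=None, quantile_fn=toy_quantile_values)
```

Passing `num_samples=N` instead would reproduce the behaviour described above for the script version; keeping the two counts as separate parameters makes it harder to mix them up.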