A question for Fraction Proposal Network in FQF

toshikwa / fqf-iqn-qrdqn.pytorch

PyTorch implementation of FQF, IQN and QR-DQN.

MIT License

A question for Fraction Proposal Network in FQF #2

Closed fmxFranky closed 4 years ago

fmxFranky commented 4 years ago

I have been studying FQF recently. Thanks for the repo, which has helped me learn the algorithm more efficiently! According to the paper (Algorithm 1), the input to the Fraction Proposal Network in FQF is (s, a). Your implementation, however, makes all actions share the same quantile fractions (taus) for a given state. Could you explain this discrepancy? Thank you very much!

toshikwa commented 4 years ago

Hi @fmxFranky

Indeed, according to theory we should calculate the gradients at all actions. However, we found that too costly for only insignificant improvements, so in our implementation the proposed fractions depend only on the state, i.e. they are calculated only at the action chosen by \pi(\cdot|s) and shared among all actions.

The above is from my personal correspondence with the author about the Fraction Proposal Network. I think it answers your question, right?
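
To make the difference concrete, here is a minimal, framework-free sketch of the two designs. It is not the code from this repository or from the paper. It assumes the proposal network is a single linear layer followed by a softmax and a cumulative sum, and every name and weight value in it (`propose_fractions`, `linear`, `W_state`, `W_per_action`) is hypothetical and chosen only for illustration.

```python
import math
from itertools import accumulate


def propose_fractions(logits):
    """Turn N unnormalized scores into fractions 0 = tau_0 < ... < tau_N = 1.

    Returns (taus, tau_hats), where tau_hats are the midpoints of
    consecutive fractions.
    """
    # Numerically stable softmax over the N scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The cumulative sum of the probabilities gives increasing fractions.
    taus = [0.0] + list(accumulate(probs))
    taus[-1] = 1.0  # guard against floating-point rounding
    tau_hats = [(lo + hi) / 2 for lo, hi in zip(taus[:-1], taus[1:])]
    return taus, tau_hats


def linear(weights, x):
    """Bias-free linear layer: one output score per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Toy state embedding and hypothetical weights (N = 4 fractions, 2 actions).
state_embedding = [0.5, -1.0, 2.0]
num_actions = 2
W_state = [
    [0.1, 0.2, 0.3],
    [0.0, -0.1, 0.2],
    [0.3, 0.1, -0.2],
    [0.2, 0.2, 0.2],
]

# State-only variant: one set of fractions per state, reused by every action.
shared_taus, shared_tau_hats = propose_fractions(linear(W_state, state_embedding))
taus_state_only = [shared_taus for _ in range(num_actions)]

# State-action variant: each action gets its own fractions. Here this is
# modelled with one hypothetical weight matrix per action.
W_per_action = [
    W_state,
    [
        [0.3, 0.0, -0.1],
        [0.2, 0.1, 0.0],
        [-0.1, 0.2, 0.1],
        [0.0, 0.0, 0.4],
    ],
]
taus_state_action = [
    propose_fractions(linear(W, state_embedding))[0] for W in W_per_action
]
```

In the state-only variant the proposal step runs once per state, while the state-action variant runs it once per action, which is one way to see the extra cost mentioned in the quoted reply.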

fmxFranky commented 4 years ago

Hi, @ku2482 Thanks for your reply~