Closed Reblexis closed 1 week ago
I think there is a mistake in the first step of the proof (where it's evaluated into the right form).
Yes, you are perfectly right -- with the previous formulation, the $\tau$ behaved in fact as $1-\tau$.
Using $\hat x - x$ seems more natural to me (the derivative of this term with respect to $\hat x$ is 1), but the QR-DQN and IQN papers consistently use the order $x - \hat x$ (as do the algorithms copy-pasted into the slides), so I changed the formulation to use the $x - \hat x$ order. For consistency, I also updated the order on the previous slides with the MSE and MAE errors, and fixed the definition of the quantile Huber loss accordingly.
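For reference, a short sketch of what the loss looks like in the $x - \hat x$ order and why its minimizer is the $\tau$-quantile. This is the standard quantile regression form, written under the assumption that $P$ is continuous; it is not copied from the updated slides, and the helper symbol $\rho_\tau$ is introduced here only for the derivation.

$$
\begin{aligned}
\rho_\tau(u) &= u \left( \tau - [u < 0] \right), \qquad u = x - \hat x, \\
L(\hat x) &= \mathbb{E}_{x \sim P} \left[ (x - \hat x) \left( \tau - [x < \hat x] \right) \right], \\
\frac{\partial L(\hat x)}{\partial \hat x} &= \mathbb{E}_{x \sim P} \left[ [x < \hat x] - \tau \right] = P(x < \hat x) - \tau .
\end{aligned}
$$

The derivative is zero exactly when $P(x < \hat x) = \tau$, so $\hat x$ is the $\tau$-quantile. Multiplying both factors by $-1$ gives $(\hat x - x)\left([x < \hat x] - \tau\right)$, which agrees with the formula proposed in the issue everywhere except at $x = \hat x$, where both expressions are zero.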
Sorry for the complications; you get 3 community work points for finding this and creating the issue (a typo earns 1 point, which is not comparable to this kind of error).
Hi,
I think there is a mistake in the quantile regression loss definition on slide 29 (lecture 5). The indicator should be the other way around: $L(\hat{x}) = \mathbb{E}_{x \sim P} \left[ (\hat{x} - x) \left( [x \le \hat{x}] - \tau \right) \right]$
I'm wondering whether the proof instead found the $\tau$ at which the incorrect formula is maximized.
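One way to sanity-check the proposed formula is to minimize its empirical version over a grid and compare the result with the empirical $\tau$-quantile. The snippet below is only a minimal sketch: the function names `quantile_loss` and `grid_minimizer`, the standard normal samples, and the grid are all assumptions made for illustration and do not come from the slides.

```python
import numpy as np


def quantile_loss(x_hat, samples, tau):
    """Empirical estimate of E[(x_hat - x) * ([x <= x_hat] - tau)]."""
    indicator = (samples <= x_hat).astype(float)
    return float(np.mean((x_hat - samples) * (indicator - tau)))


def grid_minimizer(samples, tau, grid):
    """Return the grid point with the smallest empirical loss."""
    losses = [quantile_loss(x_hat, samples, tau) for x_hat in grid]
    return float(grid[int(np.argmin(losses))])


rng = np.random.default_rng(0)
samples = rng.normal(size=20_000)   # illustrative choice of the distribution P
grid = np.linspace(-3.0, 3.0, 601)  # candidate values of x_hat, step 0.01

for tau in (0.1, 0.5, 0.9):
    # The two printed values should be close if the loss is minimized at the tau-quantile.
    print(tau, grid_minimizer(samples, tau, grid), float(np.quantile(samples, tau)))
```

If the indicator is written correctly, the grid minimizer should land near the $\tau$-quantile and not near the $(1-\tau)$-quantile.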