Closed · KID0031 closed this issue 10 months ago
@KID0031 hey thanks for your interest
yes, you are right that a linear projection would probably do fine there as well. i'm following the weight-tied output embedding technique from earlier transformer architectures (which in theory should allow the network to learn better embeddings), but that has been shown to be unnecessary
i'll make it an option to do it the way you describe (rough sketch of both paths below)
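for concreteness, here is a minimal sketch of the two readout paths, not the actual q-transformer code; the module name, the `dim` / `num_bins` arguments, and the `weight_tie_bin_embed` flag are illustrative assumptions rather than the library's API:

```python
# illustrative only - names and shapes are assumptions, not the library's API
import torch
from torch import nn

class QValueHeadSketch(nn.Module):
    def __init__(self, dim, num_bins, weight_tie_bin_embed = True):
        super().__init__()
        # embeddings used to encode previously selected action bins as input tokens
        self.action_bin_embeddings = nn.Parameter(torch.randn(num_bins, dim) * 0.02)

        self.weight_tie_bin_embed = weight_tie_bin_embed
        if not weight_tie_bin_embed:
            # untied alternative: a fresh linear projection to one Q-value per bin
            self.to_q_values = nn.Linear(dim, num_bins)

    def forward(self, attn_out):
        # attn_out: (batch, dim) attention output for a single action slot
        if self.weight_tie_bin_embed:
            # weight-tied readout: reuse the input bin embeddings as the output projection
            return torch.einsum('b d, n d -> b n', attn_out, self.action_bin_embeddings)

        # untied readout: independent linear layer
        return self.to_q_values(attn_out)

# usage
head = QValueHeadSketch(dim = 512, num_bins = 256, weight_tie_bin_embed = False)
q = head(torch.randn(2, 512))  # (2, 256) Q-values, one per bin
```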
@KID0031 try setting this to False
on reflection, i think i had a bug in the weight-tied action bin embeddings, so thanks for raising this
Hi @lucidrains, I'm a beginner trying to use Q-transformer and ran into a question while reading the code. In the `QHeadMultipleActions` class, I noticed that Q-transformer encodes each bin into an embedding using `self.action_bin_embeddings`. However, when obtaining the Q-values, it multiplies the attention output with `self.action_bin_embeddings` once again. Is there a specific reason for deriving the Q-values this way instead of applying a new MLP layer to the attention output? I've shared the relevant code below. Thank you!