Open fangleigit opened 9 months ago
Hi @fangleigit, good question! Since Criteo only supports one-hot encodings, we've implemented QR here this way. In this case, these two approaches are equivalent.
However, you are correct that ideally the implementation should perform the operation before the sum or mean pooling - this would require a more involved implementation that requires changing the underlying embedding bag implementation. We have not implemented this here.
When attempting to use the QR trick, I have noticed that the implementation differs from what is described in the original paper. In the paper, the embedding of a token is obtained by applying operations such as 'add', 'mult', or 'concat' to two separate embedding tables, after which sum or mean pooling is applied. However, the implementation https://github.com/facebookresearch/dlrm/blob/c848e837580cbcfe6b49149658e4c8a3c1576f48/tricks/qr_embedding_bag.py#L189 first applies pooling to the embeddings from the separate tables and then applies 'add', 'mult', or 'concat' to obtain the embedding feature. I am unsure whether this difference is by design or if the two methods are equivalent.