Does the attention_dim parameter in the implemented AutoInt model correspond to the number of hidden units d' in their paper? In Section 5.1.4 of the paper they set d' to 32, so I am wondering whether that corresponds to attention_dim = 8 in your code.
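One way the two numbers could be reconciled, assuming the implementation uses multi-head self-attention and attention_dim is the per-head size while the paper's d' is the total output size (the num_heads value below is a guess for illustration, not taken from the code):

```python
# Hypothetical dimension arithmetic for multi-head self-attention.
# Assumption: attention_dim is the per-head size and the paper's d'
# is the concatenated size across heads, i.e. d' = attention_dim * num_heads.
attention_dim = 8   # per-head size, as in the implementation
num_heads = 4       # hypothetical head count, not confirmed from the code
d_prime = attention_dim * num_heads
print(d_prime)      # 32
```

If that assumption holds, attention_dim = 8 with 4 heads would indeed match d' = 32 from Section 5.1.4; checking the num_heads default in the implementation would confirm or rule this out.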