Closed spirinamayya closed 1 week ago
Let's drop PreLNTransformerLayers from sasrec.py
To make the positional encoding common, let's drop the multiplication by timeline_mask from it. For sasrec, let's move that multiplication to SASRectransformerLayers, just before the input goes into the layers
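A rough sketch of the split I have in mind, written with plain Python lists so it runs without any framework. All function names here (`add_positional_encoding`, `apply_timeline_mask`, `sasrec_transformer_layers`) are hypothetical and only illustrate where the masking would move; the real code would operate on tensors and the actual layer classes.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]  # shape: (seq_len, dim)


def add_positional_encoding(seqs: Matrix, pos_emb: Matrix) -> Matrix:
    """Shared positional encoding: only adds position embeddings, no masking."""
    return [
        [x + p for x, p in zip(row, pos_row)]
        for row, pos_row in zip(seqs, pos_emb)
    ]


def apply_timeline_mask(seqs: Matrix, timeline_mask: Sequence[float]) -> Matrix:
    """Zero out padding positions (mask is 1.0 for real items, 0.0 for padding)."""
    return [[x * m for x in row] for row, m in zip(seqs, timeline_mask)]


def sasrec_transformer_layers(
    seqs: Matrix,
    timeline_mask: Sequence[float],
    layers: Sequence[Callable[[Matrix], Matrix]],
) -> Matrix:
    """SASRec-specific wrapper: mask first, then run the layers."""
    seqs = apply_timeline_mask(seqs, timeline_mask)
    for layer in layers:
        seqs = layer(seqs)
    return seqs


# Toy input: position 0 is padding, positions 1 and 2 are real items.
item_emb = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
pos_emb = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
timeline_mask = [0.0, 1.0, 1.0]

# The common positional encoding no longer knows about the mask ...
encoded = add_positional_encoding(item_emb, pos_emb)
# ... and the SASRec layers apply it just before the first layer.
# (An empty layer list keeps the example focused on the masking step.)
output = sasrec_transformer_layers(encoded, timeline_mask, layers=[])
```

With this split the padding row of `encoded` still carries the positional offset, and it is zeroed only inside the SASRec-specific wrapper, so other models can reuse the positional encoding unchanged.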
Let's add "TODO: init padding embedding with zeros" to the weight initialization code
Added bert4rec model