rixwew / pytorch-fm

Factorization Machine models in PyTorch
MIT License
1.04k stars 225 forks

Some questions about "self.offsets" #15

Open CallmeChenChen opened 4 years ago

CallmeChenChen commented 4 years ago

Dear DaLao: what is the function of "self.offsets"?

    self.offsets = np.array((0, *np.cumsum(field_dims)[:-1]), dtype=np.long)

    def forward(self, x):
        x = x + x.new_tensor(self.offsets).unsqueeze(0)

KwangKa commented 3 years ago

It holds the feature index offset of each field.
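
As a rough illustration of that answer, the sketch below recomputes the offsets and shifts a batch of per-field indices into global indices. It uses plain Python lists instead of NumPy or PyTorch so that it runs without dependencies. The helper names `compute_offsets` and `to_global_indices` are made up for this example and are not part of pytorch-fm.

```python
from itertools import accumulate


def compute_offsets(field_dims):
    """Start row of each field: 0 followed by the running sum of all but the last size."""
    # Pure-Python stand-in for (0, *np.cumsum(field_dims)[:-1])
    return [0, *accumulate(field_dims[:-1])]


def to_global_indices(batch, offsets):
    """Add each field's offset to its local index, like x + offsets broadcast over the batch."""
    return [[idx + off for idx, off in zip(row, offsets)] for row in batch]


field_dims = [2, 3, 4, 5]
offsets = compute_offsets(field_dims)

# Each row holds one local index per field, each within range(field_dims[i]).
batch = [[0, 0, 0, 0],
         [1, 2, 3, 4]]
global_batch = to_global_indices(batch, offsets)

print(offsets)
print(global_batch)
```

After the shift, every field's indices fall in their own block of rows, so a single embedding table can serve all fields.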

cpy18727 commented 3 years ago

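    # e.g. field_dims = [2, 3, 4, 5], offsets = [0, 2, 5, 9]
    # These are index offsets.
    # All features share a single Embedding table,
    # so in the actual table rows 0-1  belong to feature X0, i.e. field_dims[0]
    #                        rows 2-4  belong to feature X1, i.e. field_dims[1]
    #                        rows 5-8  belong to feature X2, i.e. field_dims[2]
    #                        rows 9-13 belong to feature X3, i.e. field_dims[3]
    # However, each value in the x passed to forward(self, x) only ranges over its own field's vocabulary.
    # For example, if X1 takes the value 0, its row in the Embedding table is offsets[1] + X1 = 2 + 0 = 2

tsWen0309 commented 2 years ago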

There is a small mistake in the details.
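
For anyone who wants to check the row ranges from the example, here is a small sketch that lists which rows of the shared table each field occupies. The function name `field_row_ranges` is hypothetical; only the `field_dims` values come from this thread. Because the table has `sum(field_dims)` = 14 rows indexed from 0, the last field should end at row 13.

```python
from itertools import accumulate


def field_row_ranges(field_dims):
    """Return (first_row, last_row), both inclusive, for each field in the shared table."""
    starts = [0, *accumulate(field_dims[:-1])]
    return [(start, start + dim - 1) for start, dim in zip(starts, field_dims)]


field_dims = [2, 3, 4, 5]
ranges = field_row_ranges(field_dims)
for i, (first, last) in enumerate(ranges):
    print(f"X{i}: rows {first}-{last}")

# The ranges should be disjoint and together cover every row exactly once.
total_rows = sum(field_dims)
covered = [row for first, last in ranges for row in range(first, last + 1)]
assert covered == list(range(total_rows))
```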