JiekeLi opened this issue 4 years ago
I was also wondering the same. Masking should only be applied to input tokens, because the loss function is `CrossEntropyLoss(ignore_index=0)`: it ignores index 0, so the loss is only computed for masked items or randomly replaced items.
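For context on why label 0 works as the "ignore" target, here is a minimal sketch (toy tensors, not this repo's code) showing that `CrossEntropyLoss(ignore_index=0)` drops every position whose label is 0 from the loss average:

```python
import torch
import torch.nn as nn

# Toy setup: 3 positions, vocabulary of 5 items (index 0 reserved as padding/ignore).
logits = torch.randn(3, 5)
labels = torch.tensor([2, 0, 4])  # position 1 has label 0 -> excluded from the loss

loss = nn.CrossEntropyLoss(ignore_index=0)(logits, labels)

# Equivalent: average the loss over only the non-zero-labeled positions.
manual = nn.CrossEntropyLoss()(logits[[0, 2]], labels[[0, 2]])
print(torch.allclose(loss, manual))  # True
```

So assigning label 0 to a position is exactly how the code tells the loss to skip it.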
But:

```python
if prob < 0.8:
    tokens.append(self.mask_token)
    labels.append(s)
elif prob < 0.9:
    tokens.append(self.rng.randint(1, self.num_items))
    labels.append(s)
else:
    tokens.append(s)
    labels.append(0)  # ? I changed it.
```
And why divide once more by 0.8 and 0.9 after `mask_prob`? Random item insertion with probability 0.15 * 0.1? I couldn't find this part in the paper.
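For what it's worth, the 0.8/0.9 thresholds look like BERT's standard masking recipe: a token is first selected with probability `mask_prob` (e.g. 0.15), and *conditional on being selected* it is replaced by the mask token 80% of the time, by a random item 10% of the time, and kept unchanged 10% of the time; hence the overall random-replacement rate of 0.15 * 0.1. A self-contained sketch (function name and parameters are illustrative, not this repo's API):

```python
import random

def mask_sequence(seq, num_items, mask_token, mask_prob=0.15, seed=0):
    """BERT-style masking sketch.

    Each token is selected with probability mask_prob. A selected token is:
      - replaced by mask_token with prob 0.8   (0.15 * 0.8 = 12% of all tokens)
      - replaced by a random item with prob 0.1 (0.15 * 0.1 = 1.5% of all tokens)
      - kept unchanged with prob 0.1            (0.15 * 0.1 = 1.5% of all tokens)
    Unselected tokens get label 0, which CrossEntropyLoss(ignore_index=0) skips.
    """
    rng = random.Random(seed)
    tokens, labels = [], []
    for s in seq:
        if rng.random() < mask_prob:
            prob = rng.random()
            if prob < 0.8:
                tokens.append(mask_token)
            elif prob < 0.9:
                tokens.append(rng.randint(1, num_items))
            else:
                tokens.append(s)
            labels.append(s)  # selected positions keep their true item as label
        else:
            tokens.append(s)
            labels.append(0)  # ignored by the loss
    return tokens, labels
```

The kept-unchanged 10% still gets a real label, so the model must predict the item even when it sees the true token, which discourages it from only attending to mask positions.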
I also want to ask: why do you assign label '0' to the non-masked tokens when generating an example here?
Thanks!