theeluwin / pytorch-sgns

Skipgram Negative Sampling implemented in PyTorch
MIT License

Use of discard probabilities #4

Closed zetyquickly closed 6 years ago

zetyquickly commented 6 years ago

As far as I can see, the 'ws' variable, which holds the discard probability of each word, is unused. Should it be applied when calculating the weights?

zetyquickly commented 6 years ago

OK, I found out that it is used in the init method of the dataloader class.

theeluwin commented 6 years ago

Yep. It's calculated at https://github.com/theeluwin/pytorch-sgns/blob/master/train.py#L57 and used at https://github.com/theeluwin/pytorch-sgns/blob/master/train.py#L36. It's the subsampling method from https://arxiv.org/pdf/1310.4546.pdf, section 2.3.
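
For readers who don't want to open the paper: section 2.3 discards each occurrence of a word w with probability P(w) = 1 - sqrt(t / f(w)), where f(w) is the word's relative frequency and t is a threshold (the paper suggests around 1e-5). Below is a minimal standalone sketch of that idea, not the code from train.py. The function names `discard_probabilities` and `subsample`, the toy counts, and the choice of t=1e-3 are all assumptions made for illustration, and clipping negative values to 0 is one straightforward way to treat rare words as "never discard".

```python
import math
import random


def discard_probabilities(word_counts, t=1e-5):
    """Map each word to max(0, 1 - sqrt(t / f(w))), f(w) = relative frequency.

    Hypothetical helper for illustration; `t` is the subsampling threshold.
    """
    total = sum(word_counts.values())
    probs = {}
    for word, count in word_counts.items():
        freq = count / total
        # For rare words the raw value is negative, so clip it to 0
        # (such words are never discarded).
        probs[word] = max(0.0, 1.0 - math.sqrt(t / freq))
    return probs


def subsample(pairs, discard, rng=random):
    """Keep a (center, context) pair with probability 1 - discard[center]."""
    return [(center, context) for center, context in pairs
            if rng.random() > discard[center]]


# Toy counts, made up for this example. A larger t than the paper's
# suggestion is used so the tiny corpus shows a visible effect.
counts = {"the": 9000, "cat": 900, "zygote": 1}
ws = discard_probabilities(counts, t=1e-3)

pairs = [("the", ["cat"]), ("cat", ["the"]), ("zygote", ["the"])]
kept = subsample(pairs, ws, random.Random(0))
```

Frequent words such as "the" get a discard probability close to 1, while very rare words get 0, so filtering the training pairs once at dataset construction time (as in the init method mentioned above) thins out the most common words before training starts.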