nzw0301 / pytorch_skipgram


GPU support #3

Closed · tatsuokun closed this issue 6 years ago

tatsuokun commented 6 years ago

This PR adds GPU support through a few practical modifications, following the PEP 8 style guide. Training the model takes approximately 15 minutes on a GTX 1080.

taoki@rat:~/my_repos/pytorch_skipgram$ time python -m pytorch_skipgram.explicit_main --input=data/text8 --epoch=1 --out=text8.vec --min-count=5 --sample=1e-5 --batch=100 --negative=10 --gpu-id 0
Loading training corpus
V:71290, #words:16718844
progress: 1.0000000, lr=0.0002500, loss=3.1974219

real    13m20.806s
user    12m56.856s
sys     0m29.180s
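
For reference, the usual PyTorch pattern behind a --gpu-id option looks roughly like the sketch below. The variable names and the embedding dimensionality are illustrative, not the exact code in this PR.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Pick the requested CUDA device when available, otherwise fall back to CPU.
gpu_id = 0
use_cuda = gpu_id >= 0 and torch.cuda.is_available()
device = torch.device(f"cuda:{gpu_id}" if use_cuda else "cpu")

# Toy embedding matrices standing in for the skip-gram input/output vectors.
vocab_size, dim = 71290, 100  # dim is illustrative
in_embed = nn.Embedding(vocab_size, dim).to(device)
out_embed = nn.Embedding(vocab_size, dim).to(device)

# Mini-batches must live on the same device as the model parameters.
center = torch.randint(vocab_size, (100,), device=device)
context = torch.randint(vocab_size, (100,), device=device)
score = (in_embed(center) * out_embed(context)).sum(dim=1)
loss = F.logsigmoid(score).neg().mean()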

>>> from gensim.models import KeyedVectors                                                                    
>>> word_vectors = KeyedVectors.load_word2vec_format('./text8.vec', binary=False)                             
>>> word_vectors.most_similar(positive=['woman', 'king'], negative=['man'])                                   
[('emperor', 0.940045177936554), ('emperors', 0.9369003772735596), ('reigned', 0.9173333644866943), ('iii', 0.9173274636268616), ('crowned', 0.9153728485107422), ('bohemia', 0.9113254547119141), ('elector', 0.9101901054382324), ('householder', 0.9099608659744263), ('habsburg', 0.9084455966949463), ('julian', 0.9075143337249756)]
>>> word_vectors.doesnt_match("breakfast cereal dinner lunch".split())                                        
'cereal'
>>> word_vectors.similarity('woman', 'man')                                                                   
0.8903952724650204
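
As a side note, the file written by --out=text8.vec is the plain-text word2vec format that load_word2vec_format(..., binary=False) expects: a header line with the vocabulary size and vector dimensionality, followed by one word and its space-separated vector per line. The dimensionality and values below are illustrative:

71290 100
the 0.0123 -0.0456 ... (100 floats per word)
of -0.0987 0.0654 ...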
nzw0301 commented 6 years ago

Awesome! Thank you for your contribution!