ChenBaiyang / MAUIL

The code and dataset for paper "MAUIL: Multi-level Attribute Embedding for Semi-supervised User Identity Linkage"
MIT License

dblp runs successfully, but running wd fails with the error below. I downloaded zhwiki-latest-pages-articles.xml.bz2, but the generated zhwiki_corpus contains no data. #7

Closed Simple-Kay closed 8 months ago

Simple-Kay commented 8 months ago

Tue Mar 5 19:47:57 2024 Character level attributes embedding...
Tue Mar 5 19:48:04 2024 Word level attributes embedding...
Traceback (most recent call last):
  File "D:\File\PyCharm Files\MAUIL-main\code\embed.py", line 375, in <module>
    embed_wd()
  File "D:\File\PyCharm Files\MAUIL-main\code\embed.py", line 341, in embed_wd
    ex_corpus=True, ex_corpus_fname=ex_corpus_fname, ex_corpus_xml=ex_corpus_xml)
  File "D:\File\PyCharm Files\MAUIL-main\code\embed.py", line 306, in word_embed_cn
    return word_embed(docs, excorpus=iter, lamb=lamb, dim=dim, ave_neighbors=ave_neighbors, g1=g1, g2=g2)
  File "D:\File\PyCharm Files\MAUIL-main\code\embed.py", line 113, in word_embed
    model = Word2Vec(sentences=ex_corpus, size=dim, workers=os.cpu_count() - 1)
  File "D:\Software\miniconda3\envs\MAUIL\lib\site-packages\gensim\models\word2vec.py", line 783, in __init__
Tue Mar 5 19:48:05 2024 Learning word vectors...
    fast_version=FAST_VERSION)
  File "D:\Software\miniconda3\envs\MAUIL\lib\site-packages\gensim\models\base_any2vec.py", line 763, in __init__
    end_alpha=self.min_alpha, compute_loss=compute_loss)
  File "D:\Software\miniconda3\envs\MAUIL\lib\site-packages\gensim\models\word2vec.py", line 910, in train
    queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
  File "D:\Software\miniconda3\envs\MAUIL\lib\site-packages\gensim\models\base_any2vec.py", line 1081, in train
    **kwargs)
  File "D:\Software\miniconda3\envs\MAUIL\lib\site-packages\gensim\models\base_any2vec.py", line 536, in train
    total_words=total_words, **kwargs)
  File "D:\Software\miniconda3\envs\MAUIL\lib\site-packages\gensim\models\base_any2vec.py", line 1187, in _check_training_sanity
    raise RuntimeError("you must first build vocabulary before training the model")
RuntimeError: you must first build vocabulary before training the model
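
The final RuntimeError is consistent with Word2Vec receiving no sentences: if the corpus yields nothing, the vocabulary stays empty and training refuses to start. Since the generated zhwiki_corpus is reported as empty, one way to fail earlier with a clearer message is to check the file before it reaches Word2Vec. The sketch below is not part of MAUIL. The helper names `count_sentences` and `check_corpus` are hypothetical, and it assumes the corpus is a UTF-8 text file with one sentence per line, which may not match what embed.py actually writes.

```python
import os


def count_sentences(corpus_path, limit=1000):
    """Count non-blank lines in a text corpus, stopping early at `limit`.

    Assumption: one sentence per line. A missing file counts as 0.
    """
    if not os.path.isfile(corpus_path):
        return 0
    n = 0
    with open(corpus_path, encoding="utf-8", errors="ignore") as f:
        for line in f:
            if line.strip():
                n += 1
                if n >= limit:
                    break
    return n


def check_corpus(corpus_path):
    """Raise a descriptive error if the corpus has no usable sentences."""
    n = count_sentences(corpus_path)
    if n == 0:
        raise ValueError(
            f"{corpus_path!r} is missing or empty; "
            "regenerate it before training Word2Vec"
        )
    return n
```

Calling `check_corpus(ex_corpus_fname)` just before the `Word2Vec(...)` call in `word_embed` would surface an empty corpus file directly. If the file turns out to be non-empty, the next thing to check is whether the sentence iterable passed to Word2Vec can be iterated more than once, since a one-shot generator is exhausted after the vocabulary pass.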