When I try to run the code, I get the following error:
Traceback (most recent call last):
File "train.py", line 60, in <module>
run()
File "train.py", line 35, in run
lambda:run_train_from_args(args, hyperdrive_hyperparameter_overrides), args.debug
File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
func()
File "train.py", line 35, in <lambda>
lambda:run_train_from_args(args, hyperdrive_hyperparameter_overrides), args.debug
File "/home/yang/Desktop/FUNDED_NISL/FUNDED/cli_utils/training_utils.py", line 249, in run_train_from_args
DataSplit.Preprocess(args.data_path)
File "/home/yang/Desktop/FUNDED_NISL/FUNDED/data/data/data_preprocess.py", line 531, in Preprocess
w2v(path, cwetype)
File "/home/yang/Desktop/FUNDED_NISL/FUNDED/data/data/data_preprocess.py", line 67, in w2v
negative=3, sample=0.001, hs=1, workers=4)
File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/word2vec.py", line 783, in __init__
fast_version=FAST_VERSION)
File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 763, in __init__
end_alpha=self.min_alpha, compute_loss=compute_loss)
File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/word2vec.py", line 910, in train
queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 1081, in train
**kwargs)
File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 536, in train
total_words=total_words, **kwargs)
File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 1187, in _check_training_sanity
raise RuntimeError("you must first build vocabulary before training the model")
RuntimeError: you must first build vocabulary before training the model
I traced the error to this line:
model = Word2Vec(words, min_count=1, size=100, sg=1, window=5, negative=3, sample=0.001, hs=1, workers=4)
But shouldn't the model automatically build a vocabulary during training?
Can someone help me with this problem?
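
One thing that may be worth checking, as a guess, since the post does not show how `words` is built: the constructor does build the vocabulary itself when it is given sentences, so this error usually suggests the vocabulary came out empty, for example because `words` is an empty list or contains only empty sentences. The sketch below is a small sanity check to run before the `Word2Vec(...)` call. `check_corpus` is a hypothetical helper written for this illustration, not part of gensim or the FUNDED code.

```python
from typing import Iterable, List, Tuple


def check_corpus(sentences: Iterable[Iterable[str]]) -> Tuple[List[List[str]], int]:
    """Materialize the corpus as a list of token lists and fail early if it has no tokens.

    Hypothetical helper for illustration only; it is not a gensim function.
    """
    # Copy into plain lists so the corpus can be scanned more than once.
    corpus = [list(sentence) for sentence in sentences]
    token_count = sum(len(sentence) for sentence in corpus)
    if token_count == 0:
        raise ValueError(
            "corpus contains no tokens; check that the input files were found and parsed"
        )
    return corpus, token_count


# Assumed usage inside w2v(), before the failing line:
#   corpus, n_tokens = check_corpus(words)
#   print(len(corpus), "sentences,", n_tokens, "tokens")
#   model = Word2Vec(corpus, min_count=1, size=100, sg=1, window=5,
#                    negative=3, sample=0.001, hs=1, workers=4)
```

If the check raises, the problem is probably upstream of gensim, for instance in how `args.data_path` is read during preprocessing, rather than in the `Word2Vec` arguments.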