HuantWang / FUNDED_NISL

FUNDED is a novel learning framework for building vulnerability detection models.

RuntimeError: you must first build vocabulary before training the model #22

Closed: sywu213 closed this issue 1 year ago

sywu213 commented 1 year ago

When I try to run the code, I get the following error:

Traceback (most recent call last):
  File "train.py", line 60, in <module>
    run()
  File "train.py", line 35, in run
    lambda:run_train_from_args(args, hyperdrive_hyperparameter_overrides), args.debug
  File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
    func()
  File "train.py", line 35, in <lambda>
    lambda:run_train_from_args(args, hyperdrive_hyperparameter_overrides), args.debug
  File "/home/yang/Desktop/FUNDED_NISL/FUNDED/cli_utils/training_utils.py", line 249, in run_train_from_args
    DataSplit.Preprocess(args.data_path)
  File "/home/yang/Desktop/FUNDED_NISL/FUNDED/data/data/data_preprocess.py", line 531, in Preprocess
    w2v(path, cwetype)
  File "/home/yang/Desktop/FUNDED_NISL/FUNDED/data/data/data_preprocess.py", line 67, in w2v
    negative=3, sample=0.001, hs=1, workers=4)
  File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/word2vec.py", line 783, in __init__
    fast_version=FAST_VERSION)
  File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 763, in __init__
    end_alpha=self.min_alpha, compute_loss=compute_loss)
  File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/word2vec.py", line 910, in train
    queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
  File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 1081, in train
    **kwargs)
  File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 536, in train
    total_words=total_words, **kwargs)
  File "/home/yang/Desktop/tensor/lib/python3.7/site-packages/gensim/models/base_any2vec.py", line 1187, in _check_training_sanity
    raise RuntimeError("you must first build vocabulary before training the model")
RuntimeError: you must first build vocabulary before training the model

I found the line of code that raises the error:

model = Word2Vec(words, min_count=1, size=100, sg=1, window=5,negative=3, sample=0.001, hs=1, workers=4)

But shouldn't the model automatically build a vocabulary during training?

Can someone help me with this problem?

Sillouevan commented 6 months ago

I also encountered the same problem. Could you please tell me how you solved it?

664730 commented 5 months ago

> I also encountered the same problem. Could you please tell me how you solved it?

Excuse me, have you found a solution since then?
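
One possible cause, offered as a guess rather than a confirmed diagnosis: in the gensim 3.x code path shown in the traceback, the Word2Vec constructor does build the vocabulary from the sentences it is given before it trains, but if `words` yields no tokens (for example an empty list, or a data path that matched no files) the vocabulary stays empty and the sanity check raises this RuntimeError. The sketch below checks the corpus before handing it to Word2Vec. The helper names `materialize_corpus` and `train_word2vec` are hypothetical and not part of FUNDED; the keyword arguments are copied from the line quoted in the issue and assume gensim 3.x (gensim 4.x renamed `size` to `vector_size`).

```python
from typing import Iterable, List


def materialize_corpus(sentences: Iterable[Iterable[str]]) -> List[List[str]]:
    """Turn any iterable of token sequences into a list of non-empty token lists.

    Raises ValueError when nothing usable is left, which is the situation in
    which no vocabulary can be built. (Hypothetical helper, not FUNDED code.)
    """
    # Drop empty tokens first, then drop sentences that became empty.
    corpus = [[tok for tok in sent if tok] for sent in sentences]
    corpus = [sent for sent in corpus if sent]
    if not corpus:
        raise ValueError(
            "corpus is empty: check the data path and the tokenization step"
        )
    return corpus


def train_word2vec(sentences: Iterable[Iterable[str]]):
    """Hypothetical wrapper around the call quoted in the issue (gensim 3.x keywords)."""
    # Imported lazily so the corpus check above can be used without gensim installed.
    from gensim.models import Word2Vec

    corpus = materialize_corpus(sentences)
    return Word2Vec(corpus, min_count=1, size=100, sg=1, window=5,
                    negative=3, sample=0.001, hs=1, workers=4)
```

If `materialize_corpus(words)` raises, the problem lies upstream of Word2Vec, most likely in how the dataset is located or tokenized, rather than in the training call itself.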