cjymz886 / text-cnn

CNN-based Chinese text classification with embedded Word2vec word vectors
MIT License

Error when running text_train.py #8

Open imshxz opened 5 years ago

imshxz commented 5 years ago

Hello, and many thanks for sharing this project. However, after switching to my own dataset and running train_word2vec.py to retrain the word vectors, text_train.py fails with:
ValueError: Too many elements provided. Needed at most 512000, but received 800000
I then changed vocab_size in text_model.py and build_vocab(filenames, vocab_dir, vocab_size=5000) in loader.py both to 5000, retrained the word vectors, and ran text_train.py again; it now fails with:
ValueError: Too many elements provided. Needed at most 320000, but received 500000
How can I resolve this?
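For context, this error text matches TensorFlow's check that fires when a constant (or constant initializer) is built from a value containing more elements than its declared shape. A minimal sketch, not the repo's actual code, assuming TensorFlow 1.x graph mode and illustrative variable names and shapes:

```python
# Minimal sketch (assumes TensorFlow 1.x and numpy); names and shapes are illustrative.
import numpy as np
import tensorflow as tf

vocab_size, embedding_dim = 8000, 64            # shape the model declares
pretrained = np.random.rand(vocab_size, 100)    # matrix produced with Word2Vec size=100

# Building a constant from a value with more elements than the declared shape
# triggers the check: 8000 * 64 = 512000 needed, 8000 * 100 = 800000 provided ->
# "ValueError: Too many elements provided. Needed at most 512000, but received 800000"
embedding = tf.constant(pretrained, dtype=tf.float32,
                        shape=[vocab_size, embedding_dim])
```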

imshxz commented 5 years ago

Found the cause: in train_word2vec.py, model = word2vec.Word2Vec(sentences, size=100, window=5, min_count=1, workers=6) sets size to 100, which is larger than the vocab_size=64 in text_model. Making the two values consistent, either by reducing the Word2Vec size or by raising the value in text_model, fixes the error.
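The element counts in both errors are the product of the vocabulary size and the per-word vector dimension (8000 × 64 = 512000 vs 8000 × 100 = 800000 above, then 5000 × 64 = 320000 vs 5000 × 100 = 500000), so the value that has to match Word2Vec's size is the per-word embedding dimension configured in text_model. A hedged sketch of the retraining call, using the gensim 3.x keyword size as in the comment above; the corpus path, output path, and the embedding_dim name are hypothetical:

```python
# Sketch only: keep the Word2Vec dimensionality equal to the embedding dimension
# configured in text_model.py. Paths and the embedding_dim value are illustrative.
from gensim.models import word2vec

embedding_dim = 100   # set this to the same per-word dimension text_model.py uses

sentences = word2vec.LineSentence('data/corpus_segmented.txt')      # hypothetical path
model = word2vec.Word2Vec(sentences,
                          size=embedding_dim,   # gensim 3.x; 'vector_size' in gensim 4.x
                          window=5, min_count=1, workers=6)
model.wv.save_word2vec_format('data/word2vec.txt', binary=False)    # hypothetical path
```

Equivalently, one can leave size=100 and raise the dimension in text_model to 100; the only requirement is that the two numbers agree.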

qingnuan commented 2 years ago

Found the cause: in train_word2vec.py, model = word2vec.Word2Vec(sentences, size=100, window=5, min_count=1, workers=6) sets size to 100, which is larger than the vocab_size=64 in text_model. Making the two values consistent, either by reducing the Word2Vec size or by raising the value in text_model, fixes the error.

Hello, you said earlier that you set vocab_size to 5000; did you end up changing it to 64? I'm hitting the same problem, except mine is ValueError: Too many elements provided. Needed at most 999900, but received 1000000, so I'd like to ask how you resolved it.
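Following the same arithmetic as above, if the embedding dimension is 100 on both sides, 999900 versus 1000000 corresponds to 9999 versus 10000 rows, i.e., the two vocabulary sizes disagree rather than the dimensions. A hypothetical diagnostic sketch, not part of the repo, that prints both shapes before the graph is built so the mismatched value is easy to spot:

```python
# Hypothetical diagnostic (not from the repo): compare the shape the model declares
# with the shape of the pre-trained embedding matrix.
import numpy as np

def report_embedding_shapes(pre_training, vocab_size, embedding_dim):
    """Print the two element counts that appear in the ValueError."""
    pre_training = np.asarray(pre_training)
    needed = vocab_size * embedding_dim
    received = pre_training.size
    print(f"model declares : {vocab_size} x {embedding_dim} = {needed} elements")
    print(f"pre-trained    : {pre_training.shape} = {received} elements")
    if needed != received:
        print("mismatch: align vocab_size / embedding_dim with the matrix shape")
```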