yanshao9798 / tagger

A Joint Chinese segmentation and POS tagger based on bidirectional GRU-CRF

could not convert string to float: HEAD #5

Closed GabrielLin closed 6 years ago

GabrielLin commented 6 years ago

('Encoding: ', 'utf-8')
Reading data......
Reading embeddings...
Traceback (most recent call last):
  File "tagger.py", line 119, in <module>
    toolbox.get_sample_embedding(path, args.embeddings, chars)
  File "/data1/myname/nlp/tagger/toolbox.py", line 112, in get_sample_embedding
    emb_dic[sets[0]] = np.asarray(sets[1:], dtype='float32')
  File "/opt/anaconda2/envs/tf1p3py27/lib/python2.7/site-packages/numpy/core/numeric.py", line 531, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: HEAD

This happens when I run:

python -u tagger.py train -p ud1 -t train.txt -d dev.txt -wv -cp -rd -gru -m model_ud1 -emb Embeddings/glove.txt

Could you please give me some suggestions? Thanks.

yanshao9798 commented 6 years ago

Hi, it looks like something is wrong with the embedding file. Are you using the original glove.txt? You can also omit the pre-trained embeddings and see whether training works with random embeddings, for example:

python -u tagger.py train -p ud1 -t train.txt -d dev.txt -wv -cp -rd -gru -m model_ud1
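One way to check the embedding file before training is a small script like the one below. It is only a sketch and not part of tagger: the function name `find_bad_embedding_lines` is hypothetical, and it assumes the whitespace-separated `token v1 v2 ...` layout that the traceback suggests `toolbox.get_sample_embedding` expects.

```python
import io


def find_bad_embedding_lines(path, encoding="utf-8"):
    """Return (line_number, text) pairs for lines that are not 'token v1 v2 ...'.

    Hypothetical helper for illustration; assumes one token followed by
    whitespace-separated float values per line.
    """
    bad = []
    with io.open(path, encoding=encoding) as f:
        for number, line in enumerate(f, start=1):
            parts = line.split()
            if not parts:
                continue  # ignore blank lines
            if len(parts) < 2:
                # A token with no vector values is suspicious as well.
                bad.append((number, line.rstrip("\n")))
                continue
            try:
                for value in parts[1:]:
                    float(value)
            except ValueError:
                bad.append((number, line.rstrip("\n")))
    return bad
```

Calling `find_bad_embedding_lines("Embeddings/glove.txt")` should list every line whose values cannot be parsed as floats, which is the condition that makes `np.asarray(..., dtype='float32')` raise the `ValueError` above.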

GabrielLin commented 6 years ago

Thanks for your reply. I am afraid that some Git merge-conflict markers were accidentally committed into the glove.txt file.

In the latest version of the repo, its first line is

<<<<<<< HEAD

and its last line is

>>>>>>> 9b45bd881c27df3e0d8bee0f071d7e77a8126f0b

When I remove those two lines, the original command can be run.
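The manual fix can also be scripted. The sketch below is an assumption-laden illustration, not something from the tagger repo: the name `strip_conflict_markers` and the marker pattern are hypothetical choices. It drops lines that look like Git conflict markers (`<<<<<<< ...`, `=======`, `>>>>>>> ...`) and copies everything else unchanged.

```python
import io
import re

# Assumed marker shapes: '<<<<<<< label', '>>>>>>> label', or a bare '=======' line.
CONFLICT_MARKER = re.compile(r"^(?:<{7} |>{7} |={7}\s*$)")


def strip_conflict_markers(src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path without conflict-marker lines.

    Returns the number of lines removed. Hypothetical helper for illustration.
    """
    removed = 0
    with io.open(src_path, encoding=encoding) as src, \
            io.open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            if CONFLICT_MARKER.match(line):
                removed += 1
                continue
            dst.write(line)
    return removed
```

Note that this keeps both sides of the conflict, so the result may contain duplicate or outdated entries. It also would drop a genuine vocabulary token that happens to be exactly seven `<` or `>` characters, so the output is worth a quick review.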

Please also check whether the glove.txt file currently in the repo is the version you intended to commit.

yanshao9798 commented 6 years ago

Hi, you are right. Now it has been fixed. Thanks for pointing out the problem!