yuzhimanhua / Multi-BioNER

Cross-type Biomedical Named Entity Recognition with Deep Multi-task Learning (Bioinformatics'19)
https://arxiv.org/abs/1801.09851
Apache License 2.0

ValueError: could not convert string to float: '-0\x00\x00\ #11

Closed. YuliaInn closed this issue 3 years ago.

YuliaInn commented 4 years ago

I tried to reproduce the experiment from your paper, but I got this:

loading embedding
/home/iid49/Multi-BioNER/model/utils.py:779: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
  nn.init.uniform(input_embedding, -bias, bias)
Traceback (most recent call last):
  File "train_wc.py", line 169, in <module>
    f_map, embedding_tensor, in_doc_words = utils.load_embedding_wlm(args.emb_file, ' ', f_map, dt_f_set, args.caseless, args.unk, args.word_dim, shrink_to_corpus=args.shrink_embedding)
  File "/home/iid49/Multi-BioNER/model/utils.py", line 403, in load_embedding_wlm
    vector = list(map(lambda t: float(t), filter(lambda n: n and not n.isspace(), line[1:])))
  File "/home/iid49/Multi-BioNER/model/utils.py", line 403, in <lambda>
    vector = list(map(lambda t: float(t), filter(lambda n: n and not n.isspace(), line[1:])))
ValueError: could not convert string to float: '-0\x00....

How do I fix it?

Thank you.

yuzhimanhua commented 4 years ago

Hi, could you tell me which embedding file you are using?

YuliaInn commented 4 years ago

I am using the data from "Quick Start", which includes the file wikipedia-pubmed-and-pmc-w2v.txt. Is that the right one?

yuzhimanhua commented 4 years ago

Sorry for the late reply. Yes, you are using the right one. I am not sure why you got that execution error.

Could you please try the embedding files at

https://github.com/BaderLab/Transfer-Learning-BNER-Bioinformatics-2018

or

http://bio.nlplab.org/#word-vectors ?
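Before swapping embedding files, it may help to check whether the current file is itself damaged. The '\x00' characters in the error message are NUL bytes, which would be consistent with a truncated or corrupted download, although nothing in this thread confirms that. The following is only a minimal sketch and is not part of the Multi-BioNER code: the function name find_bad_embedding_lines and its parameters are hypothetical, and it assumes the embedding file is plain text with one "word v1 v2 ..." entry per line, separated by single spaces as in the load_embedding_wlm call above.

```python
def find_bad_embedding_lines(path, delimiter=" ", max_report=10):
    """Return (line_number, reason) pairs for lines that would fail float parsing.

    Hypothetical helper, not part of Multi-BioNER. It mirrors the filter used
    in the traceback (skip empty or whitespace-only fields) and reports at
    most `max_report` problem lines.
    """
    bad = []
    # Open in binary mode so NUL bytes are seen exactly as stored on disk.
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            if b"\x00" in raw:
                bad.append((lineno, "contains NUL bytes"))
            else:
                text = raw.decode("utf-8", errors="replace").rstrip("\n")
                fields = text.split(delimiter)
                try:
                    # fields[0] is the word; the rest should all be floats.
                    [float(t) for t in fields[1:] if t and not t.isspace()]
                except ValueError as exc:
                    bad.append((lineno, str(exc)))
            if len(bad) >= max_report:
                break
    return bad


# Example call (the path is whatever was passed as --emb_file):
# for lineno, reason in find_bad_embedding_lines("wikipedia-pubmed-and-pmc-w2v.txt"):
#     print(lineno, reason)
```

If this reports lines containing NUL bytes, re-downloading the file and comparing its size with the published one is probably the first thing to try. If it reports nothing, the file is likely fine and the problem lies elsewhere.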