facebookresearch / SentEval

A Python tool for evaluating the quality of sentence embeddings.

ImageCaptionRetrieval doesn't work with Infersent #57

Open xuwenshen opened 5 years ago

xuwenshen commented 5 years ago

Traceback (most recent call last):
  File "infersent.py", line 75, in <module>
    results = se.eval(transfer_tasks)
  File "../senteval/engine.py", line 59, in eval
    self.results = {x: self.eval(x) for x in name}
  File "../senteval/engine.py", line 59, in <dictcomp>
    self.results = {x: self.eval(x) for x in name}
  File "../senteval/engine.py", line 119, in eval
    self.evaluation.do_prepare(self.params, self.prepare)
  File "../senteval/rank.py", line 39, in do_prepare
    prepare(params, samples)
  File "infersent.py", line 38, in prepare
    params.infersent.build_vocab([' '.join(s) for s in samples], tokenize=False)
  File "infersent.py", line 38, in <listcomp>
    params.infersent.build_vocab([' '.join(s) for s in samples], tokenize=False)
TypeError: sequence item 0: expected str instance, bytes found

shibu38 commented 5 years ago

Hi @xuwenshen, I am facing the same problem. Did you find a solution? If so, could you please share it? Thanks in advance.

SCULX commented 8 months ago

I just changed ' ' to b" " in both the prepare and batcher functions, like this:

params.infersent.build_vocab([b" ".join(s) for s in samples], tokenize=False) 

and it works!
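
A possible alternative, in case it helps anyone: b" ".join(...) avoids the TypeError, but it produces bytes sentences rather than str. If the encoder expects str input (an assumption here, nothing in this thread confirms it), decoding the tokens before joining may be the safer route. A minimal sketch, where to_text is a hypothetical helper and the samples are made up to mimic the bytes tokens reported in the error:

```python
def to_text(tokens, encoding="utf-8"):
    """Join a tokenized sentence into one str, decoding any bytes tokens first."""
    return " ".join(
        t.decode(encoding) if isinstance(t, bytes) else t for t in tokens
    )


# Made-up samples: one sentence tokenized as bytes, one as str.
samples = [[b"a", b"dog", b"runs"], ["a", "cat", "sleeps"]]

# Reproduce the reported error: a str separator cannot join bytes tokens.
try:
    " ".join(samples[0])
    join_failed = False
except TypeError:
    join_failed = True

# Decode-then-join gives plain str sentences for both kinds of input.
sentences = [to_text(s) for s in samples]
```

In prepare and batcher, the list comprehension would then become [to_text(s) for s in samples], leaving the rest of the call unchanged. The utf-8 encoding is also an assumption about how the caption data is stored.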