facebookresearch / InferSent

InferSent sentence embeddings

KeyError: '\ufeffneutral' when using train_nli.py with own data #115

Closed JakobDerPharao closed 5 years ago

JakobDerPharao commented 5 years ago

Hello,

I wanted to train my own model using my NMT-translated MNLI data, the German fastText vectors, and the German XNLI data.

When executing "python train_nli.py --nlipath dataset/7/ --word_emb_path cc.de.300.vec --encoder_type InferSentV2 --gpu_id 0", I got the following error:

TRAIN DATA : Found 309354 pairs of train sentences.
DEV DATA : Found 17681 pairs of dev sentences.
Traceback (most recent call last):
  File "train_nli.py", line 78, in <module>
    train, valid, test = get_nli(params.nlipath)
  File "C:\Users\Jakob\Desktop\sepro\InferSent-master2\InferSent-master\data.py", line 78, in get_nli
    for line in open(target[data_type]['path'], 'r',encoding="utf8")])
  File "C:\Users\Jakob\Desktop\sepro\InferSent-master2\InferSent-master\data.py", line 78, in <listcomp>
    for line in open(target[data_type]['path'], 'r',encoding="utf8")])
KeyError: '\ufeffneutral'

I had previously added encoding="utf8" to the open() calls in data.py so that the German umlauts are read correctly. Thanks in advance for your help.
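
For reference, the '\ufeff' prefix in the failing key is the Unicode byte order mark (BOM), which some editors write at the start of a UTF-8 file. With encoding="utf8", Python keeps that character as part of the first line, so the first label is read as '\ufeffneutral' instead of 'neutral'. A minimal sketch of one possible workaround, assuming the label file was saved with a BOM: read it with the "utf-8-sig" codec, which decodes UTF-8 and drops a leading BOM if one is present. The names dico_label, read_labels, and demo below are illustrative and not taken from data.py.

```python
import os
import tempfile

# Hypothetical label mapping for the three NLI classes (illustrative only).
dico_label = {'entailment': 0, 'neutral': 1, 'contradiction': 2}


def read_labels(path):
    """Read one label per line; 'utf-8-sig' strips a leading BOM if present."""
    with open(path, 'r', encoding='utf-8-sig') as f:
        return [dico_label[line.rstrip('\n')] for line in f]


def demo():
    """Write a small label file that starts with a UTF-8 BOM, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'labels.train')
        with open(path, 'wb') as f:
            # b'\xef\xbb\xbf' is the UTF-8 encoding of the BOM (U+FEFF).
            f.write(b'\xef\xbb\xbfneutral\nentailment\n')
        return read_labels(path)


labels = demo()
```

Re-saving the data files as UTF-8 without a BOM should have the same effect and would leave the training code unchanged.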