spacemanidol / MSMARCO

Utilities, Baselines, Statistics and Descriptions Related to the MSMARCO DATASET
MIT License

keyerror #16

Closed zkt12 closed 5 years ago

zkt12 commented 5 years ago

Why is there a KeyError when I run predict on dev_v2.1.json? Here is the traceback:

    Augmenting with pre-trained embeddings...
    Augmenting with random char embeddings...
    Traceback (most recent call last):
      File "scripts/", line 189, in <module>
        main()
      File "scripts/", line 182, in main
        toks = ' '.join(id_to_token[tok] for tok in toks)
      File "scripts/", line 182, in <genexpr>
        toks = ' '.join(id_to_token[tok] for tok in toks)
    KeyError: tensor(507946)

spacemanidol commented 5 years ago


The and files are works in progress and do not work yet. To start your baseline, please refer to the file. While not fully productionalized, it is a fully functional baseline.

zkt12 commented 5 years ago

Sorry, I didn't make it clear in my last question. I ran predict for the Q+A task, not the ranking task. Why did the KeyError happen?

spacemanidol commented 5 years ago

Hey, I don't know what may be going on for you. I've been testing: I created a clean Ubuntu box, downloaded the data and repo, followed the instructions for training (trained for 1 epoch), and then ran prediction with no problems, both with no custom embeddings and using GloVe. I think this is likely related to your embeddings.

    erasmus@spacemanidol:~/MSMARCOV2/Q+A/Baseline$ python3 scripts/ exp/ ../data/dev_v2.1.json prediction.json --cuda=True --word_rep ../data/glove.840B.300d.txt
    /usr/local/lib/python3.5/dist-packages/h5py/ FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
      from ._conv import register_converters as _register_converters
    Loading Model...
    Augmenting with pre-trained embeddings...
    Augmenting with random char embeddings...
    erasmus@spacemanidol:~/MSMARCOV2/Q+A/Baseline$
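
For anyone hitting the same traceback: the failing line looks up `id_to_token[tok]`, and the reported key is `tensor(507946)`, not a plain integer. The thread does not establish the cause, so the following is only a sketch of a defensive lookup under two assumptions: that `id_to_token` is a dict keyed by plain ints, and that `toks` may hold either ints or 0-dim tensors. The helper `ids_to_text`, the `<unk>` placeholder, and the `FakeScalar` stand-in are hypothetical names for illustration and are not part of the repository.

```python
def ids_to_text(id_to_token, toks, unk="<unk>"):
    """Join token ids into a string, tolerating non-int keys and unknown ids."""
    words = []
    for tok in toks:
        # int() accepts plain ints and 0-dim tensors alike (both define __int__),
        # so the dict is always queried with a plain int key.
        key = int(tok)
        # .get avoids a KeyError when the id is missing from the vocabulary,
        # e.g. if the embeddings used at prediction time differ from training.
        words.append(id_to_token.get(key, unk))
    return " ".join(words)


class FakeScalar:
    """Hypothetical stand-in for a 0-dim tensor: hashes by identity, converts via int()."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value


# Toy vocabulary (made up for this example).
id_to_token = {0: "what", 1: "is", 2: "msmarco"}

print(ids_to_text(id_to_token, [0, 1, 2]))
print(ids_to_text(id_to_token, [FakeScalar(0), FakeScalar(1), 507946]))
```

Note that mapping missing ids to a placeholder hides a vocabulary mismatch instead of fixing it. If many tokens come out as `<unk>`, the embeddings file passed to predict probably does not match the one used for training, which would be consistent with the suggestion above.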