spacemanidol / MSMARCO

Utilities, Baselines, Statistics and Descriptions Related to the MSMARCO DATASET
MIT License

keyerror #16

Closed zkt12 closed 5 years ago

zkt12 commented 5 years ago

Why is there a KeyError when I run predict on dev_v2.1.json? Here is the traceback:

Augmenting with pre-trained embeddings...
Augmenting with random char embeddings...
Traceback (most recent call last):
  File "scripts/predict.py", line 189, in <module>
    main()
  File "scripts/predict.py", line 182, in main
    toks = ' '.join(id_to_token[tok] for tok in toks)
  File "scripts/predict.py", line 182, in <genexpr>
    toks = ' '.join(id_to_token[tok] for tok in toks)
KeyError: tensor(507946)

spacemanidol commented 5 years ago

Hey,

The predict.py and train.py files are works in progress and do not work yet. To start your baseline, please refer to the duet.py file. While not fully productionized, it is a fully functional baseline.

zkt12 commented 5 years ago

Sorry, I didn't make that clear in my last question. I ran predict for the Q+A task, not the ranking task. Why did the KeyError happen?

spacemanidol commented 5 years ago

Hey, I don't know what may be going on for you. To test, I created a clean Ubuntu box, downloaded the data and repo, followed the training instructions (trained for 1 epoch), and then ran prediction with no problem, both with no custom embeddings and with GloVe. I think this is likely related to your embeddings.

erasmus@spacemanidol:~/MSMARCOV2/Q+A/Baseline$ python3 scripts/predict.py exp/ ../data/dev_v2.1.json prediction.json --cuda=True --word_rep ../data/glove.840B.300d.txt
/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Loading Model...
Augmenting with pre-trained embeddings...
Augmenting with random char embeddings...
erasmus@spacemanidol:~/MSMARCOV2/Q+A/Baseline$
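
For anyone hitting the same error: the traceback reports the missing key as tensor(507946) rather than a plain integer, so two things may be worth checking. First, whether the lookup key is a tensor object instead of an int (a tensor generally does not hash like the integer it holds, so a dict keyed by ints will not find it). Second, whether ID 507946 is in the vocabulary at all, which could happen if the embeddings used at prediction time differ from those used at training time. The sketch below illustrates both checks. It assumes id_to_token is a dict keyed by Python ints; to_int_key and decode_tokens are hypothetical helpers for illustration, not functions from this repo.

```python
def to_int_key(tok):
    """Convert a token id to a plain int.

    Scalar tensors (e.g. from PyTorch) expose .item(); plain ints pass through.
    """
    if hasattr(tok, "item"):
        return int(tok.item())
    return int(tok)


def decode_tokens(toks, id_to_token, unk="<unk>"):
    """Join token ids into a string and collect ids missing from the vocabulary."""
    words, missing = [], []
    for tok in toks:
        key = to_int_key(tok)
        if key in id_to_token:
            words.append(id_to_token[key])
        else:
            # Record the id instead of raising KeyError, so the mismatch is visible.
            missing.append(key)
            words.append(unk)
    return " ".join(words), missing


# Toy vocabulary standing in for the real id_to_token mapping (an assumption).
id_to_token = {0: "what", 1: "is", 2: "msmarco"}
text, missing = decode_tokens([0, 1, 2, 507946], id_to_token)
print(text)
print(missing)
```

If the list of missing ids is non-empty after normalizing the keys, the vocabulary built at prediction time probably does not match the one the model was trained with, which would be consistent with the embeddings explanation above.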