I changed the code to run validation on a dev set instead of the test set. But when I then tried to evaluate the model on my test set, I got an error when mapping words to ids in words_to_ids (reader/base.py). This happens because the vocab.txt file was built only from the train and dev data.
Do I need to use all three of the train, dev and test sets to build this vocab.txt file?
Wouldn't this be a heavy constraint when using the model to predict on new data whose vocabulary we don't know in advance?
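To make the problem concrete, here is a minimal sketch of what I think is happening and the kind of fallback I would expect. All names here (`build_vocab`, `words_to_ids_strict`, `words_to_ids_with_unk`, the `<unk>` token) are hypothetical and only for illustration; this is not the actual code in reader/base.py, and I don't know whether the model already reserves an unknown-word id.

```python
# Hypothetical sketch: these names are illustrative, not the real reader/base.py code.
UNK = "<unk>"  # assumed placeholder token for out-of-vocabulary words


def build_vocab(sentences):
    """Map each distinct word to an integer id, reserving id 0 for UNK."""
    vocab = {UNK: 0}
    for sentence in sentences:
        for word in sentence.split():
            # Assign the next free id the first time a word is seen.
            vocab.setdefault(word, len(vocab))
    return vocab


def words_to_ids_strict(words, vocab):
    # Plain dictionary lookup: raises KeyError on any unseen word.
    return [vocab[w] for w in words]


def words_to_ids_with_unk(words, vocab):
    # Unseen words fall back to the UNK id instead of raising.
    return [vocab.get(w, vocab[UNK]) for w in words]


# Vocabulary built from "train + dev" only.
vocab = build_vocab(["the cat sat", "the dog ran"])

# "bird" appears only in the "test" data.
test_words = "the bird sat".split()
ids = words_to_ids_with_unk(test_words, vocab)
```

With a strict lookup the word "bird" fails, which looks like the error I am seeing; with the fallback it is mapped to the reserved unknown-word id, so the test set would not need to be part of vocab.txt.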
Thanks in advance for your answer.