Summary
Producing word-level predictions on test data currently requires creating artificial dummy labels to supply alongside the data during training. This is inconvenient because it prevents reusing a pretrained model to produce labels for unseen data.
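For illustration, the dummy-label workaround might look roughly like the sketch below. It is not taken from the project's code: the helper name `make_dummy_labels`, the placeholder tag `"OK"`, and whitespace tokenization are all assumptions made for this example.

```python
def make_dummy_labels(sentences, placeholder="OK"):
    """Build one line of placeholder tags per sentence.

    Hypothetical helper: emits one tag per whitespace-separated token,
    so the label file lines up with the unlabeled test data.
    """
    return [
        " ".join(placeholder for _ in sentence.split())
        for sentence in sentences
    ]


# Unlabeled test sentences (made-up examples).
test_sentences = ["das ist ein Test", "noch ein Satz"]
dummy = make_dummy_labels(test_sentences)
```

The generated tags carry no information. They exist only so the data loader finds a label for every token, which is the step this issue proposes to make unnecessary.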
Expected behavior
One should be able to reuse a pretrained word-level QE model to predict word-level labels for unseen data, using the 'sampling' mode.
Related issue(s): #3