The test set and training set are pre-tokenized, and no description of the tokenization process is provided.
Tokenization affects both the performance of off-the-shelf parsers and BLEU computation.
For rigorous research, it would be helpful to supply the tokenization script, a detokenization script, or untokenized versions of the training and test sets.
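To illustrate why the tokenization scheme matters for evaluation, here is a minimal sketch using a toy 2-gram BLEU (brevity penalty omitted; the example sentences and the two tokenizers are hypothetical, not taken from the paper under review). The same hypothesis/reference pair receives very different scores depending solely on how punctuation and contractions are tokenized:

```python
import math
import re
from collections import Counter

def ngram_precision(hyp, ref, n):
    # Clipped n-gram precision, as in BLEU.
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    total = sum(hyp_ngrams.values())
    return overlap / total if total else 0.0

def bleu2(hyp, ref):
    # Geometric mean of 1- and 2-gram precisions; brevity penalty omitted.
    p1 = ngram_precision(hyp, ref, 1)
    p2 = ngram_precision(hyp, ref, 2)
    return math.sqrt(p1 * p2) if p1 and p2 else 0.0

ref = 'He said, "it\'s fine."'
hyp = 'He said, "it is fine."'

# Scheme A: plain whitespace split (punctuation glued to words).
ref_a, hyp_a = ref.split(), hyp.split()

# Scheme B: punctuation and apostrophes split off as separate tokens.
tok = lambda s: re.findall(r"\w+|[^\w\s]", s)
ref_b, hyp_b = tok(ref), tok(hyp)

print(round(bleu2(hyp_a, ref_a), 3))  # 0.387
print(round(bleu2(hyp_b, ref_b), 3))  # 0.816
```

Since the score more than doubles here from tokenization alone, reported BLEU numbers are not comparable across papers unless the exact tokenization (or an untokenized test set plus a standard scorer) is released.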