Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Quote and single quote are not handled correctly in vocab file where words are not wrapped in quotes #1862
Open
hepaajan opened 4 years ago
In particular, the following branch strips the quote from a vocab entry that consists of a single quote character, leaving an empty string, because such an entry both starts and ends with a quote:
https://github.com/tensorflow/tensor2tensor/blob/5f9dd2db6d7797162e53adf152310ed13e9fc711/tensor2tensor/data_generators/text_encoder.py#L929
An easy fix is to also check that "len(s) > 1" in both conditions.
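To make the failure and the proposed guard concrete, here is a minimal sketch. It is not the code from `text_encoder.py`; the two function names are hypothetical, and the unguarded version is only my assumption of what the linked branch does, based on the description above (strip the first and last character when the entry starts and ends with the same quote character).

```python
def strip_quotes_unguarded(s):
    """Assumed current behaviour: strip wrapping quotes with no length check."""
    if ((s.startswith("'") and s.endswith("'")) or
            (s.startswith('"') and s.endswith('"'))):
        return s[1:-1]
    return s


def strip_quotes_guarded(s):
    """Proposed behaviour: require len(s) > 1 in both conditions."""
    if ((len(s) > 1 and s.startswith("'") and s.endswith("'")) or
            (len(s) > 1 and s.startswith('"') and s.endswith('"'))):
        return s[1:-1]
    return s


# Compare both versions on a lone quote, a lone double quote,
# a quote-wrapped entry, and an unwrapped entry.
for entry in ["'", '"', "'the_'", "the_"]:
    print(repr(entry),
          repr(strip_quotes_unguarded(entry)),
          repr(strip_quotes_guarded(entry)))
```

With the guard, a one-character entry can no longer match its own start and end, so a lone `'` or `"` is kept as a token, while entries that really are wrapped in quotes are still unwrapped.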