alanault closed this issue 6 years ago
I saw in issue 48 that the option to control the removal of punctuation was excluded from the `tokenize_ngrams` function.
Is there a programming rationale for this? It seems like punctuation can add a lot of value to text, especially in languages like Spanish.
e.g. "We're going to the R conference." vs. "We're going to the R conference?" vs. "We're going to the R conference???????"
Each has a very different meaning. In Spanish, punctuation can be crucial to telling whether a sentence is a statement or a question.