Open PasaOpasen opened 4 years ago
That is because the ngrams() function calls a Wordlist() object, which itself calls the words() function. In the source we can see there is a parameter at this level called 'include_punc', set to False by default. Maybe this parameter should be exposed at the ngrams() function level so that the symbols can be kept.
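
To illustrate the idea, here is a minimal standalone sketch of what forwarding the flag could look like. It is not the library's actual code: the `words()` and `ngrams()` functions below are hypothetical stand-ins, the regex tokenizer is an assumption made only to keep the example self-contained, and only the parameter name `include_punc` comes from the source.

```python
import re


def words(text, include_punc=False):
    """Hypothetical tokenizer: drops symbols unless include_punc is True."""
    # \w+ matches runs of word characters; [^\w\s] matches a single
    # punctuation or symbol character (only kept when include_punc=True).
    pattern = r"\w+|[^\w\s]" if include_punc else r"\w+"
    return re.findall(pattern, text)


def ngrams(text, n=3, include_punc=False):
    """Hypothetical ngrams() that forwards include_punc to words()."""
    if n <= 0:
        return []
    tokens = words(text, include_punc=include_punc)
    # Slide a window of size n over the token list.
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


# Default behaviour: the "+" symbol is dropped before n-grams are built.
print(ngrams("a + b", n=2))
# With the flag forwarded, the symbol survives as its own token.
print(ngrams("a + b", n=2, include_punc=True))
```

Keeping the default at False would preserve the current behaviour for existing callers, while anyone who needs the symbols could opt in explicitly.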