Victor0118 closed this 5 years ago
@daemon Could you take a look at this PR?
When we generate the raw text, the words are separated by " ". To count the number of words, we should split on the same separator. Otherwise the counts can differ, because .split() and .split(" ") do not always return the same number of items for the same string.
Some samples:
>>> "test a ".split()
['test', 'a']
>>> "test a ".split(" ")
['test', 'a', '']
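To make the discrepancy concrete, here is a minimal sketch of the two counting strategies. The helper names `count_words_whitespace` and `count_words_by_space` are hypothetical and are not taken from this PR; they only illustrate why the separator used for counting should probably match the one used when the text was generated.

```python
def count_words_whitespace(line: str) -> int:
    # str.split() with no argument splits on runs of whitespace
    # and drops empty strings at the edges.
    return len(line.split())


def count_words_by_space(line: str) -> int:
    # str.split(" ") splits on every single space, so trailing or
    # repeated spaces produce empty strings that are counted too.
    return len(line.split(" "))


# Hypothetical sample lines, in the spirit of the example above.
for sample in ["test a", "test a ", "test  a"]:
    print(repr(sample), count_words_whitespace(sample), count_words_by_space(sample))
```

The two counts agree only when the line has no leading, trailing, or repeated spaces, so whichever separator is chosen, using it for both generation and counting keeps the numbers consistent.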