shunjizhan opened this issue 5 years ago
When training, we used `sentence.split()` instead of NLTK's `word_tokenize()`, since the latter causes a small bug in training. In the future, training with proper tokenization might give better results.
(Actually, I think this doesn't matter.)
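For context, here is a minimal sketch of why the two tokenizers diverge (assuming NLTK is installed and the `punkt` models are downloaded): whitespace splitting keeps punctuation and contractions attached to words, while `word_tokenize()` breaks them into separate tokens, so the two produce different token sequences for the same sentence.

```python
# Minimal illustration of the two tokenizers (requires: nltk.download('punkt')).
from nltk.tokenize import word_tokenize

sentence = "I can't wait, John!"

# Whitespace splitting keeps punctuation and contractions glued to words.
print(sentence.split())
# -> ['I', "can't", 'wait,', 'John!']

# word_tokenize splits punctuation and contractions into separate tokens,
# so the token sequence no longer lines up one-to-one with a plain split.
print(word_tokenize(sentence))
# -> ['I', 'ca', "n't", 'wait', ',', 'John', '!']
```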
Hyperparameter optimization might also help. See, for example: https://sklearn-crfsuite.readthedocs.io/en/latest/tutorial.html#hyperparameter-optimization
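For reference, a hedged sketch of the randomized search described in that tutorial. `X_train` and `y_train` are assumed to come from the existing training pipeline, and `labels` below is a placeholder tag set, not the project's actual one:

```python
# Sketch of the hyperparameter search from the linked sklearn-crfsuite tutorial.
# Assumes X_train (list of per-sentence feature-dict sequences) and y_train
# (list of per-sentence tag sequences) already exist from the training pipeline.
import scipy.stats
import sklearn_crfsuite
from sklearn.metrics import make_scorer
from sklearn.model_selection import RandomizedSearchCV
from sklearn_crfsuite import metrics

labels = ['B-PER', 'I-PER', 'B-LOC', 'I-LOC']  # placeholder: your tags, minus 'O'

crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    max_iterations=100,
    all_possible_transitions=True,
)

# Sample the L1/L2 regularization strengths from exponential distributions.
params_space = {
    'c1': scipy.stats.expon(scale=0.5),
    'c2': scipy.stats.expon(scale=0.05),
}

# Score candidates by weighted F1 over the entity labels only.
f1_scorer = make_scorer(metrics.flat_f1_score, average='weighted', labels=labels)

rs = RandomizedSearchCV(crf, params_space, cv=3, n_iter=50, scoring=f1_scorer)
rs.fit(X_train, y_train)
print('best params:', rs.best_params_)
print('best CV score:', rs.best_score_)
```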