larsmans / seqlearn

Sequence learning toolkit for Python
http://larsmans.github.io/seqlearn/
MIT License

transition feature #36

Open kwkwvenusgod opened 7 years ago

kwkwvenusgod commented 7 years ago

I tried enabling the transition features in perceptron learning. After reading the source code, I found that the implementation of make_trans_matrix(y, n_classes, dtype=np.float64) in transmatrix.py is not consistent with its comments. As I understand it, relying only on the coefficients w and the label count matrix can easily lead to a label bias problem: in real cases that use BIO tagging, the 'O' label is heavily predominant in both the feature-space distribution and the label count matrix. The transition features assume that the feature distribution associated with a label's predecessor is consistent and follows a pattern, so they should resolve some of the label bias issues. I am not sure whether my interpretation of the transition features is correct. I have also modified the corresponding code. If you like, I will submit a merge request.
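To make the discussion concrete, here is a minimal sketch of what a per-position transition indicator and a label bigram count could look like. It is not the code from transmatrix.py and not the modified version mentioned above. The function names `transition_indicators` and `transition_counts`, the column layout `prev * n_classes + cur`, and the B=0, I=1, O=2 encoding are all assumptions made for illustration. Plain lists are used to keep it dependency-free.

```python
def transition_indicators(y, n_classes):
    """Hypothetical helper: one row per position in y.

    Row i has a single 1 in column y[i-1] * n_classes + y[i].
    Row 0 stays all zeros because it has no previous label.
    """
    rows = [[0] * (n_classes * n_classes) for _ in y]
    for i in range(1, len(y)):
        rows[i][y[i - 1] * n_classes + y[i]] = 1
    return rows


def transition_counts(y, n_classes):
    """Hypothetical helper: counts[j][k] = number of times label k follows label j."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for prev, cur in zip(y, y[1:]):
        counts[prev][cur] += 1
    return counts


# Toy BIO sequence, assumed encoding: B=0, I=1, O=2.
y = [2, 2, 0, 1, 2, 2, 0, 2]
indicators = transition_indicators(y, n_classes=3)
counts = transition_counts(y, n_classes=3)

for label, row in zip("BIO", counts):
    print(label, row)
```

Even in this tiny sequence the 'O' row and column carry most of the counts, which is the kind of imbalance described above.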