emadeldeen24 closed this issue 4 years ago
Hi, no worries.
I think imbalanced data also comes up in part-of-speech (POS) tagging, where most words are nouns. I'm not sure whether people handle this in POS tagging, because in reality nouns are indeed very common. What I would suggest is:
Firstly, thank you for sharing the code and making it easy to use. I'm using a CRF to classify EEG data, since the labels are sequential and depend on one another.
However, the labels are imbalanced, and the CRF seems to just predict the majority class. Oversampling is not appropriate in this case, so I wonder if you have a solution or suggestion for this issue.
Thanks.
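For reference, here is a minimal sketch of how the imbalance and the majority-class collapse described above could be checked. It is not part of the library's code; the helper names `label_distribution` and `per_class_recall` and the toy labels are made up for illustration, and it assumes labels and predictions are plain lists of per-step labels.

```python
from collections import Counter


def label_distribution(sequences):
    """Return the fraction of all time steps that carry each label."""
    counts = Counter(label for seq in sequences for label in seq)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def per_class_recall(true_seqs, pred_seqs):
    """Return, for each true label, the fraction of its steps predicted correctly."""
    hits, totals = Counter(), Counter()
    for true_seq, pred_seq in zip(true_seqs, pred_seqs):
        for t, p in zip(true_seq, pred_seq):
            totals[t] += 1
            if t == p:
                hits[t] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Toy data (hypothetical): label 0 is the majority class,
# and the model predicts 0 at every step.
true_seqs = [[0, 0, 0, 1], [0, 0, 1, 0]]
pred_seqs = [[0, 0, 0, 0], [0, 0, 0, 0]]

print(label_distribution(true_seqs))
print(per_class_recall(true_seqs, pred_seqs))
```

Recall of zero on the minority label, alongside a high overall accuracy, would confirm that the model is only emitting the majority class.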