jacksonllee / pycantonese

Cantonese Linguistics and NLP
https://pycantonese.org
MIT License

Copy-paste error in tagger implementation #34

Closed: ZhanruiLiang closed this issue 1 year ago

ZhanruiLiang commented 1 year ago

Describe the bug

Code location: https://github.com/jacksonllee/pycantonese/blob/main/src/pycantonese/pos_tagging/tagger.py#L262

From the context, i+2 should be used, but the code currently has i-2. I tried to fix it and regenerate the model pickle file, but that makes some tests fail, and I don't know how to proceed from there.
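For illustration only, here is a minimal sketch of the kind of copy-paste slip described above. It assumes a perceptron-style feature extractor that pads the sentence so that positions i-2 through i+2 are always valid. The function name `context_features`, the padding tokens, the feature names, and the `buggy` flag are all hypothetical and are not copied from tagger.py.

```python
# Hypothetical padding tokens; the real tagger may use different ones.
START = ["-START-", "-START2-"]
END = ["-END-", "-END2-"]


def context_features(words, i, buggy=False):
    """Return word-context features for position i (illustrative helper only)."""
    # Pad both ends so that i-2 .. i+2 never fall outside the list.
    context = START + list(words) + END
    i += len(START)
    features = {
        "i-2 word": context[i - 2],
        "i-1 word": context[i - 1],
        "i word": context[i],
        "i+1 word": context[i + 1],
    }
    # The slip: a feature labelled "i+2" that still reads from i-2,
    # as happens when the "i-2" line is copied and only the label is edited.
    features["i+2 word"] = context[i - 2] if buggy else context[i + 2]
    return features


words = ["我哋", "今日", "去", "食", "飯"]
fixed = context_features(words, 2)
broken = context_features(words, 2, buggy=True)

print(fixed["i+2 word"])   # word two positions to the right of 去
print(broken["i+2 word"])  # duplicates the "i-2 word" feature instead
```

In the buggy variant the tagger never sees the word two positions ahead, and the "i-2" information is counted twice under two labels. Correcting the index changes the feature values, so a model trained with the old features has to be retrained, which is consistent with the need to regenerate the pickle file.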

Expected behavior

All tests pass.

jacksonllee commented 1 year ago

I've resolved this issue on the main branch. Some of the tests got upset due to noise in the training data from HKCanCor -- that's been fixed as well.

You've been acknowledged in the readme. Thank you for reporting this issue!