WorksApplications / SudachiPy

Python version of Sudachi, a Japanese tokenizer.
Apache License 2.0

What's the tagset used by SudachiPy? #88

Closed by BLKSerene 5 years ago

BLKSerene commented 5 years ago

Hi, thanks for the great SudachiPy; I'm using it in my own project. Is there a reference for the tagset used by SudachiPy? I want to convert its POS tags to universal POS tags.

izziiyt commented 5 years ago

@BLKSerene Thank you for using SudachiPy and for the feedback!

@kazuma-t Is there a list of the POS tags used in Sudachi at a public URL?

fhudi commented 5 years ago

Hi @BLKSerene, Sudachi's tagset is based on UniDic.

You can use the following public URL as a reference: https://hayashibe.jp/tr/mecab/dictionary/unidic/pos
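For the conversion to universal POS tags, something along these lines might be a starting point. It is only a rough sketch: the `COARSE_TO_UPOS` table and the `to_upos` helper are hypothetical names, not part of SudachiPy, and the mapping itself is a simplified assumption that looks only at the first two levels of a UniDic-style POS tuple. It should be checked against the tag list above before being relied on.

```python
# Hypothetical, coarse mapping from top-level UniDic-style POS categories
# to Universal POS tags. This is an illustrative assumption, not an
# official conversion table.
COARSE_TO_UPOS = {
    "名詞": "NOUN",      # noun
    "代名詞": "PRON",    # pronoun
    "動詞": "VERB",      # verb
    "形容詞": "ADJ",     # i-adjective
    "形状詞": "ADJ",     # na-adjective stem
    "連体詞": "DET",     # adnominal (assumed DET here)
    "副詞": "ADV",       # adverb
    "接続詞": "CCONJ",   # conjunction
    "感動詞": "INTJ",    # interjection
    "助詞": "ADP",       # particle (sub-types are ignored in this sketch)
    "助動詞": "AUX",     # auxiliary verb
    "補助記号": "PUNCT",  # supplementary symbol (punctuation)
    "記号": "SYM",       # symbol
}


def to_upos(pos):
    """Map a POS tuple such as ("名詞", "固有名詞", ...) to a UPOS tag.

    Categories not covered by the table fall back to "X".
    """
    major = pos[0]
    minor = pos[1] if len(pos) > 1 else "*"
    # Refine nouns using the second level of the tuple.
    if major == "名詞" and minor == "固有名詞":
        return "PROPN"
    if major == "名詞" and minor == "数詞":
        return "NUM"
    return COARSE_TO_UPOS.get(major, "X")


print(to_upos(("名詞", "固有名詞", "地名", "一般", "*", "*")))
print(to_upos(("動詞", "一般", "*", "*", "五段-カ行", "終止形-一般")))
```

In SudachiPy, each morpheme's `part_of_speech()` returns a tuple of this shape, so `to_upos(m.part_of_speech())` would be the natural call site. Particles and suffixes in particular need finer handling than this table provides.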

BLKSerene commented 5 years ago

@lintaoren Thanks for the information!