taishi-i / nagisa

A Japanese tokenizer based on recurrent neural networks
https://huggingface.co/spaces/taishi-i/nagisa-demo
MIT License
379 stars 22 forks

Is it possible to apply pos-tagging to a list of words, which was already tokenized? #8

Closed ichiroex closed 5 years ago

taishi-i commented 5 years ago

Hi @ichiroex

Yes, it is possible to apply POS tagging to a list of already tokenized words. However, I need to rewrite some of the existing code first. I will let you know once that is done. Please wait a few days.

ichiroex commented 5 years ago

Thank you for your quick reply. I'm looking forward to the function 👍

taishi-i commented 5 years ago

Hi @ichiroex

I released nagisa 0.1.2, which provides a POS-tagging method. If you want to apply POS tagging to a list of tokenized words, please refer to the following code. Don't forget to update to the latest version of nagisa first: sudo pip install -U nagisa

import nagisa  # version 0.1.2

# Words that have already been tokenized
tokenized_words = [" (人•ᴗ•♡)", "こんばんは", "♪"]
postags = nagisa.postagging(tokenized_words)
print(postags)  #=> ['補助記号', '感動詞', '補助記号']
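
If it helps, here is a minimal sketch of how the returned tags could be paired back with the input words. The helper name pair_words_with_tags is hypothetical (it is not part of nagisa), and it assumes the tagger returns exactly one tag per input word, as the output above suggests. The tagger is passed in as an argument, so nagisa.postagging can be used directly.

```python
from typing import Callable, List, Sequence, Tuple


def pair_words_with_tags(
    words: Sequence[str],
    tagger: Callable[[List[str]], List[str]],
) -> List[Tuple[str, str]]:
    """Tag pre-tokenized words and return (word, tag) pairs.

    `tagger` is any callable that maps a list of words to a list of tags,
    for example nagisa.postagging. This helper is hypothetical and only
    illustrates how to keep words and tags aligned.
    """
    tags = tagger(list(words))
    # Assumption: one tag per word. Fail loudly if that does not hold.
    if len(tags) != len(words):
        raise ValueError(
            "tagger returned %d tags for %d words" % (len(tags), len(words))
        )
    return list(zip(words, tags))


if __name__ == "__main__":
    try:
        import nagisa  # requires nagisa 0.1.2 or later
    except ImportError:
        nagisa = None  # nagisa is not installed; skip the demo

    if nagisa is not None:
        words = ["こんばんは", "♪"]
        for word, tag in pair_words_with_tags(words, nagisa.postagging):
            print(word, tag)
```

Keeping the tagger as a parameter is only a convenience for testing; calling nagisa.postagging and zip directly works just as well.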

Thanks

ichiroex commented 5 years ago

Thank you for your prompt attention to this matter. It's very nice! I'll use this wonderful function 👯