utterances-bot opened this issue 3 years ago (Open)
Thanks for this article. I just tested the code and it does accurately tokenize a string of Japanese into separate words. But I am trying to do something more advanced, where I tag each word as a noun, verb, etc.
I am just using the basic code example here: https://developer.apple.com/documentation/naturallanguage/identifying_parts_of_speech
For some reason it works with English, but when I input Japanese it just tags every word as OtherWord. Have you tried using the tagger, and have you had much luck with it? Thanks.
@alamodey I actually have an article on NLTagger here.
Japanese is a highly contextual language, so my immediate guess is that you are handing it single words. I don't remember much about this API, but I think you can hand it "bigger components", such as a whole sentence, to get more accurate results.
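To make the "whole sentence" idea concrete, here is a rough sketch of handing NLTagger a full sentence instead of isolated words. The helper name `lexicalClasses(in:language:)` and the sample sentence are only illustrations, not anything from the article or Apple's sample, and I can't promise this changes the result for Japanese.

```swift
import NaturalLanguage

// Hypothetical helper (not from the article): tags every word in `text`
// with its lexical class and returns (word, tag) pairs.
func lexicalClasses(in text: String, language: NLLanguage? = nil) -> [(word: String, tag: String)] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text

    let fullRange = text.startIndex..<text.endIndex
    // Optionally pin the language instead of relying on automatic detection.
    if let language = language {
        tagger.setLanguage(language, range: fullRange)
    }

    var results: [(word: String, tag: String)] = []
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    tagger.enumerateTags(in: fullRange, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
        // `tag` is nil when the tagger has nothing to say about this token.
        results.append((word: String(text[tokenRange]), tag: tag?.rawValue ?? "none"))
        return true // keep enumerating
    }
    return results
}

// Usage: pass the whole sentence, not one word at a time.
for (word, tag) in lexicalClasses(in: "私は東京に住んでいます。", language: .japanese) {
    print("\(word): \(tag)")
}
```

If every token still comes back as OtherWord with a full sentence, the input size is probably not the cause.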
I think it's as you demonstrated in the article: lexical class tagging is supported for English, but not for Japanese.
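One way to check this rather than guess is to ask the framework which tag schemes it offers for a given language. A small sketch, where `supportsLexicalClass(_:)` is just a name I made up; the answer may differ between OS versions, so I am not listing an expected output.

```swift
import NaturalLanguage

// Hypothetical helper: reports whether word-level lexical class tagging
// is listed as available for `language` on the current OS.
func supportsLexicalClass(_ language: NLLanguage) -> Bool {
    NLTagger.availableTagSchemes(for: .word, language: language)
        .contains(.lexicalClass)
}

for language in [NLLanguage.english, .japanese] {
    print(language.rawValue, supportsLexicalClass(language))
}
```

If Japanese prints `false`, that would explain why every word is tagged OtherWord no matter how much text is passed in.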
Tokenizing Natural Language into Semantic Units in iOS • Andy Ibanez
https://www.andyibanez.com/posts/tokenizing-nltokenizer/