twitter / twitter-korean-text

Korean tokenizer
Apache License 2.0
857 stars 172 forks

Add Java version of KoreanPos and KoreanToken for better compatibility #60

Closed hohyon-ryu closed 9 years ago

modamoda commented 9 years ago

I hope this code snippet might help you. :smile: (I also welcome any advice on the code snippet.) C# syntax is very similar to Java's.

public class KoreanToken
{
    public string Text { get; set; }
    public KoreanPos Pos { get; set; }
    public bool Unknown { get; set; }

    public KoreanToken(com.twitter.penguin.korean.tokenizer.KoreanTokenizer.KoreanToken scalaToken)
    {
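        this.Text = scalaToken.text().toString();
        // The cast relies on this enum's declaration order matching the ids of the Scala KoreanPos enumeration.
        this.Pos = (KoreanPos)scalaToken.pos().id();
        this.Unknown = scalaToken.unknown();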
    }
}

public enum KoreanPos
{
    // Word level POS
    Noun, Verb, Adjective,
    Adverb, Determiner, Exclamation,
    Josa, Eomi, PreEomi, Conjunction,
    NounPrefix, VerbPrefix, Suffix, Unknown,

    // Chunk level POS
    Korean, Foreign, Number, KoreanParticle, Alpha,
    Punctuation, Hashtag, ScreenName,
    Email, URL, CashTag,

    // Functional POS
    Space, Others
}
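
Since the issue asks for Java versions of these types, the following is a rough Java sketch of the same two types. It is not the code that was added to the library. The names `KoreanTokenSketch`, `fromId`, and `of` are hypothetical, and the sketch assumes, as the C# cast above does, that the declaration order matches the ids of the Scala `KoreanPos` enumeration. To stay self-contained, it takes already extracted values rather than the Scala token object.

```java
// Illustrative sketch only; the wrapper class and helper names are hypothetical.
final class KoreanTokenSketch {

    /** Java mirror of the C# enum; order is assumed to match the Scala enumeration ids. */
    public enum KoreanPos {
        // Word level POS
        Noun, Verb, Adjective,
        Adverb, Determiner, Exclamation,
        Josa, Eomi, PreEomi, Conjunction,
        NounPrefix, VerbPrefix, Suffix, Unknown,

        // Chunk level POS
        Korean, Foreign, Number, KoreanParticle, Alpha,
        Punctuation, Hashtag, ScreenName,
        Email, URL, CashTag,

        // Functional POS
        Space, Others;

        private static final KoreanPos[] VALUES = values();

        /** Hypothetical helper: returns the constant whose ordinal equals the given id. */
        public static KoreanPos fromId(int id) {
            if (id < 0 || id >= VALUES.length) {
                throw new IllegalArgumentException("Unknown KoreanPos id: " + id);
            }
            return VALUES[id];
        }
    }

    /** Immutable token holder that exposes only plain Java types. */
    public static final class KoreanToken {
        private final String text;
        private final KoreanPos pos;
        private final boolean unknown;

        public KoreanToken(String text, KoreanPos pos, boolean unknown) {
            this.text = text;
            this.pos = pos;
            this.unknown = unknown;
        }

        /** Hypothetical factory taking the values read from a Scala token (text, pos id, unknown flag). */
        public static KoreanToken of(CharSequence text, int posId, boolean unknown) {
            return new KoreanToken(text.toString(), KoreanPos.fromId(posId), unknown);
        }

        public String getText() { return text; }
        public KoreanPos getPos() { return pos; }
        public boolean isUnknown() { return unknown; }

        @Override
        public String toString() {
            return text + "(" + pos + ")";
        }
    }

    private KoreanTokenSketch() {}
}
```

Unlike the unchecked C# cast, `fromId` rejects ids outside the known range, so a mismatch between the two enumerations is reported as an exception instead of producing an undefined enum value.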

hohyon-ryu commented 9 years ago

Oh, thanks!!!

That was very fast! :)

On Thu, Apr 16, 2015 at 6:37 PM Joon Hong notifications@github.com wrote:

I hope this code snippet https://github.com/modamoda/TwitterKoreanProcessorCS/blob/master/TwitterKoreanProcessorCS.cs#L150-L179 might help you. :smile: (I also welcome any advice on the code snippet.) C# syntax is very similar to Java's.