ahmetaa / zemberek-nlp

NLP tools for Turkish.

Add missing token type #199

Closed by ozturkberkay 5 years ago

ozturkberkay commented 5 years ago

The token type MetaTag (12) was missing from README.md.

From the original Java code in zemberek.tokenization.antlr.TurkishLexer:

  public static final int Abbreviation = 1;
  public static final int SpaceTab = 2;
  public static final int NewLine = 3;
  public static final int Time = 4;
  public static final int Date = 5;
  public static final int PercentNumeral = 6;
  public static final int Number = 7;
  public static final int URL = 8;
  public static final int Email = 9;
  public static final int HashTag = 10;
  public static final int Mention = 11;
  public static final int MetaTag = 12;
  public static final int Emoticon = 13;
  public static final int RomanNumeral = 14;
  public static final int AbbreviationWithDots = 15;
  public static final int Word = 16;
  public static final int WordAlphanumerical = 17;
  public static final int WordWithSymbol = 18;
  public static final int Punctuation = 19;
  public static final int UnknownWord = 20;
  public static final int Unknown = 21;
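
To show how these numeric ids line up with their names, here is a minimal sketch of a lookup helper. The class `TokenTypeNames` and the method `nameOf` are hypothetical names made up for this illustration and are not part of zemberek; only the ids and names are taken from the constants listed above.

```java
// Hypothetical helper, not part of zemberek: maps a token type id to its name.
class TokenTypeNames {

    // Names in id order, copied from the constants above (id 1 is at index 0).
    private static final String[] NAMES = {
        "Abbreviation", "SpaceTab", "NewLine", "Time", "Date",
        "PercentNumeral", "Number", "URL", "Email", "HashTag",
        "Mention", "MetaTag", "Emoticon", "RomanNumeral",
        "AbbreviationWithDots", "Word", "WordAlphanumerical",
        "WordWithSymbol", "Punctuation", "UnknownWord", "Unknown"
    };

    // Returns the name for an id in the range 1..21, or throws for anything else.
    static String nameOf(int type) {
        if (type < 1 || type > NAMES.length) {
            throw new IllegalArgumentException("Unknown token type id: " + type);
        }
        return NAMES[type - 1];
    }

    public static void main(String[] args) {
        // Print the full id-to-name table, one entry per line.
        for (int id = 1; id <= NAMES.length; id++) {
            System.out.println(id + " = " + nameOf(id));
        }
    }
}
```

A table like this is one way to check a README listing against the lexer constants: any id that has no entry, as MetaTag (12) did, stands out immediately.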

ahmetaa commented 5 years ago

Thanks.