I just ran the POS tagger using the code below. Unfortunately, the first token in the output appears to be garbage. Can I always assume that this will be the case?
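// Load the decoder configuration from configUri
val config = DecodeConfig(IOUtils.createFileInputStream(configUri))
val decoder = NLPDecoder(config)
// Decode one sentence and print every returned token
val tokens = decoder.decode("For god so loved.")
for (p in tokens) println(p)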
Output:
0 @#r$% @#r$% @#r$% _ _ _ _ @#r$%
1 For for IN _ _ _ _ @#r$%
2 god god NN pos2=UH _ _ _ @#r$%
3 so so RB _ _ _ _ @#r$%
4 loved love VBD pos2=VBN _ _ _ @#r$%
5 . . . _ _ _ _ @#r$%
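
One possible workaround, assuming the entry at index 0 is a placeholder rather than a real token (the output above suggests this but does not confirm it), is to check the form column of the first entry and drop it only when it matches. This is just a sketch: the helper name `dropLeadingPlaceholder` is hypothetical, and it works on the printed lines instead of the decoder's own node objects so that it does not rely on any library call.

```kotlin
// Placeholder string seen in the first output line; "\$" keeps the dollar sign literal.
const val PLACEHOLDER = "@#r\$%"

// Hypothetical helper: drops the first line only if its form column
// (the second whitespace-separated field) equals the placeholder.
fun dropLeadingPlaceholder(lines: List<String>): List<String> {
    val first = lines.firstOrNull() ?: return lines
    val fields = first.trim().split(Regex("\\s+"))
    return if (fields.getOrNull(1) == PLACEHOLDER) lines.drop(1) else lines
}

fun main() {
    // Lines copied from the output above.
    val output = listOf(
        "0 @#r\$% @#r\$% @#r\$% _ _ _ _ @#r\$%",
        "1 For for IN _ _ _ _ @#r\$%",
        "2 god god NN pos2=UH _ _ _ @#r\$%"
    )
    dropLeadingPlaceholder(output).forEach(::println)
}
```

Checking the form column instead of dropping index 0 unconditionally means a real first token is kept if the assumption turns out to be wrong for some input. Note that every line ends with the same string in its last column, so a plain substring check would not distinguish the first entry.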