A small function to show how to use the stanford-pos-tagger in MATLAB.
It requires the following files:
english-left3words-distsim.tagger
in the current path while running it. It can be found in $STANFORD_POS_TAGGER_PATH/models/
stanford-postagger.jar
should be added to the classpath.
Matlab command to do it: javaaddpath('$STANFORD_POS_TAGGER_PATH/stanford-postagger.jar')
To run it simply drop it in the current working directory and run:
PosTaggerM(sample_sentence)
Sample input:
This is a very small sample sentence for test purpose - Chomsky.
Sample output:
[This/DT, is/VBZ, a/DT, very/RB, small/JJ, sample/NN, sentence/NN, for/IN, test/NN, purpose/NN, -/:, Chomsky/NNP, ./.]
The result is an ArrayList
of TaggedWords
.
Note on performance:: See discussion on this issue.
File path for english-left3words-distsim.tagger
in Windows:: See discussion on and resolution of this issue.
Verified to work on:
3.3.1
and 3.4.1
of the tagger2010a
, 8.3.0.532 (R2014a)
, R2016a
and R2017a
.JRE version: 1.7
(JRE 7) and 1.8
(JRE 8).
Also, see this issue for more details.
This was initially hosted on my homepage. Douglas found the code and improved it to work with the latest version of the tagger.
@johnnykast helped debug some compatibility issues.
@Sardar-Usama did a detailed analysis of compatibility.