What steps will reproduce the problem?
Have two spaces or more between words in input
example: echo "a  b" | java -jar berkeleyParser.jar -gr eng_sm5.gr
java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.String.charAt(String.java:687)
at edu.berkeley.nlp.PCFGLA.SophisticatedLexicon.getSignature(Unknown Source)
at edu.berkeley.nlp.PCFGLA.SophisticatedLexicon.getCachedSignature(Unknown Source)
at edu.berkeley.nlp.PCFGLA.SophisticatedLexicon.score(Unknown Source)
at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.initializeChart(Unknown Source)
at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.doPreParses(Unknown Source)
at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.getBestConstrainedParse(Unknown Source)
at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.getBestConstrainedParse(Unknown Source)
at edu.berkeley.nlp.PCFGLA.BerkeleyParser.main(BerkeleyParser.java:190)
If there is only one space, one obtains a parse tree.
echo "a b" | java -jar berkeleyParser2.jar -gr eng_sm5.gr
( (NP (DT a) (X (SYM b))) )
If you run the parser with tokenization (-tokenize), it works fine.
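A plausible explanation, not confirmed against the parser source, is that without -tokenize the input line is split on single spaces, so two consecutive spaces produce an empty word, and charAt(0) on that empty word throws the exception above. The sketch below only illustrates that mechanism and one possible guard; the class and method names (WhitespaceSplitSketch, naiveSplit, safeSplit) are hypothetical and are not part of the Berkeley parser.

```java
import java.util.ArrayList;
import java.util.List;

public class WhitespaceSplitSketch {
    // Hypothetical: splitting on a single space keeps an empty token
    // between two consecutive spaces.
    static String[] naiveSplit(String line) {
        return line.split(" ");
    }

    // Hypothetical guard: split on runs of whitespace and drop empty tokens.
    static List<String> safeSplit(String line) {
        List<String> words = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String line = "a  b"; // two spaces between the words
        String[] naive = naiveSplit(line);
        System.out.println(naive.length);    // prints 3: "a", "", "b"
        System.out.println(safeSplit(line)); // prints [a, b]
        try {
            naive[1].charAt(0); // empty token, same exception type as in the trace
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("empty token: " + e.getMessage());
        }
    }
}
```

If this assumption holds, collapsing runs of spaces in the input before piping it to the parser should also avoid the crash.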
Suggestion: track the line number in the input and show it when printing
the trace. Makes debugging easier.
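One way the line-number suggestion could look, as a rough sketch using the standard java.io.LineNumberReader. The class LineNumberSketch, the method processAll and the parseLine callback are hypothetical stand-ins, not the parser's actual input loop.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class LineNumberSketch {
    // Feeds each input line to parseLine (a stand-in for the real parse call)
    // and records the 1-based input line number of every line that fails.
    static List<String> processAll(Reader in, Consumer<String> parseLine) {
        List<String> errors = new ArrayList<>();
        LineNumberReader reader = new LineNumberReader(in);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                try {
                    parseLine.accept(line);
                } catch (RuntimeException e) {
                    // getLineNumber() is the number of the line just read
                    errors.add("input line " + reader.getLineNumber() + ": " + e);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Stand-in parser: fails on any line that contains two consecutive spaces.
        Consumer<String> fakeParse = line -> {
            if (line.contains("  ")) {
                throw new IllegalArgumentException("empty word in input");
            }
        };
        List<String> errors = processAll(new StringReader("a b\na  b\nc d\n"), fakeParse);
        for (String msg : errors) {
            System.err.println(msg);
        }
    }
}
```

With the sample input above, only the second line is reported, so the offending sentence can be found without bisecting the input file.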
Original issue reported on code.google.com by benoit.f...@gmail.com on 11 Feb 2009 at 9:15