Open GoogleCodeExporter opened 9 years ago
I think this is rather an issue with the BerkeleyParser implementation, not
with DKPro Core.
MaltParser chokes because it finds no pos-tags. I'd recommend using a separate
pos tagger and setting PARAM_WRITE_POS on BerkeleyParser to "false". Some
people observed that running a pos tagger separately can actually yield better
results anyway. Mind, though, that unlike the StanfordParser component, the
BerkeleyParser component currently does not support using pre-existing pos tags
produced by a separate tagger (I do not know if the Berkeley parser upstream
code supports this).
Regarding the underlying problem with the BerkeleyParser, I'd suggest reporting
this as an upstream issue. Maybe we are using the API incorrectly. If we knew
what exactly the issue was, maybe we could implement a workaround in DKPro
Core, but I'd actually prefer updating to a newer upstream version. I do
believe, though, that upstream is not really actively maintained... still worth
a try.
Original comment by richard.eckart
on 25 Jun 2014 at 12:45
The DKPro Core BerkeleyParser component internally uses the
CoarseToFineMaxRuleParser from the Berkeley package. The package appears to
include other parsers as well:
CoarseToFineMaxRuleDerivationParser
CoarseToFineMaxRuleProductParser
CoarseToFineNBestParser
CoarseToFineTwoChartsParser
ConstrainedTwoChartsParser
ConstrainedHierarchicalTwoChartParser
I don't know if the models are compatible with all of them or only with a
specific parser. Might be worth investigating. Maybe another parser does not
have the problem of returning no result on certain sentences.
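To make the fallback idea concrete, here is a minimal sketch in plain Java. It does not use the Berkeley API: each "parser" is a hypothetical stand-in modelled as a function from a token list to a bracketed tree string, and the class name ParserFallback is made up for illustration. Whether the real parser classes can share one model like this is exactly the open question above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class ParserFallback {

    /**
     * Tries each parser in order and returns the first non-empty result.
     * A parser signals "no parse found" by returning null or an empty string.
     * The Function type is a hypothetical stand-in, not the Berkeley parser API.
     */
    public static Optional<String> parseWithFallback(
            List<String> tokens,
            List<Function<List<String>, String>> parsers) {
        for (Function<List<String>, String> parser : parsers) {
            String tree = parser.apply(tokens);
            if (tree != null && !tree.isEmpty()) {
                return Optional.of(tree);
            }
        }
        // None of the parsers produced a tree for this sentence.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for a parser that gives up on this sentence.
        Function<List<String>, String> failing = tokens -> null;
        // Stand-in for a parser that always produces a flat tree.
        Function<List<String>, String> flat =
                tokens -> "(X " + String.join(" ", tokens) + ")";

        List<String> sentence = Arrays.asList("A", "short", "sentence");
        Optional<String> tree =
                parseWithFallback(sentence, Arrays.asList(failing, flat));
        System.out.println(tree.orElse("<no parse>"));
    }
}
```

If no parser in the list yields a tree, the caller gets an empty Optional and can decide whether to skip the sentence or fail the pipeline.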
Original comment by richard.eckart
on 25 Jun 2014 at 12:50
Hi Richard,
thanks for your reply. So as a quick workaround, enhancing the pipeline to
...
createEngineDescription(BerkeleyParser.class,
        BerkeleyParser.PARAM_WRITE_POS, false),
createEngineDescription(StanfordPosTagger.class),
...
did the trick.
Looking at the BerkeleyParser googlecode project, the last commit is from 2012,
so it seems to have been dead for a while...
Best,
Ivan
Original comment by ivan.hab...@gmail.com
on 25 Jun 2014 at 1:43
Original comment by richard.eckart
on 28 Jul 2014 at 9:46
Original issue reported on code.google.com by
ivan.hab...@gmail.com
on 25 Jun 2014 at 12:35