Closed natewatson999 closed 6 years ago
Hi, thank you for identifying a flaw in the algorithm. Unfortunately, all of these systems are statistical parsers, so even state-of-the-art parsers will make errors because of limited training data and the limits of the algorithm. It is not immediately clear to me how to translate specific error cases into parser improvements, other than by adding training data that addresses the specific issue.
At this time we are not actively developing the constituency parser, though we may release a polished version of our latest dependency parser in Python at some point this year. There has been some active work on neural constituency parsing as well by other groups, but to the best of my knowledge no one over here is really working on constituency parsing at this time.
The constituency parser is not parsing "What is being very fat called?" in a logical way.
The tree yielded is (What (is (being (very fat called)))), when it should be (What (is ((being (very fat)) called))). "Called" is not part of the phrase "being very fat". It should be an SBAR of the root verb.
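To make the difference between the two bracketings concrete, here is a minimal sketch in plain Python that lists which word groups each tree treats as a constituent. It does not call the parser itself; `parse_brackets`, `leaves`, and `constituents` are hypothetical helper names written only for this illustration, and the trees are the unlabeled bracketings quoted above.

```python
def parse_brackets(text):
    """Parse an unlabeled bracketed string such as "(a (b c))" into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])          # open a new constituent
        elif tok == ")":
            node = stack.pop()        # close it and attach it to its parent
            stack[-1].append(node)
        else:
            stack[-1].append(tok.lower())
    return stack[0][0]


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node for word in leaves(child)]


def constituents(node):
    """Return the set of word sequences (as tuples) covered by each bracket."""
    if isinstance(node, str):
        return set()
    spans = {tuple(leaves(node))}
    for child in node:
        spans |= constituents(child)
    return spans


actual = constituents(parse_brackets("(What (is (being (very fat called))))"))
expected = constituents(parse_brackets("(What (is ((being (very fat)) called)))"))

# Brackets present in only one of the two trees.
print("only in actual:  ", sorted(actual - expected))
print("only in expected:", sorted(expected - actual))
```

The only bracket unique to the yielded tree is "very fat called", while the expected tree instead contains "very fat" and "being very fat" as constituents, which is the disagreement described above.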
This is what it is as of the latest version:
This is what it should be: