Closed syeedibnfaiz closed 12 years ago
I don't think that you're doing anything wrong. The parsers have different models, each with their own quirks, since they're trained on different data. I'd recommend a postprocessing step to remove the unnecessary NPs (if you haven't done this already).
Hope this helps, David
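A minimal sketch of such a postprocessing step, assuming the parse is available as a Penn-style bracketed string and that NLTK is on hand. The function name `flatten_nested_nps` and the splice-everything policy are my own choices for illustration, not part of the parser's API; a real cleanup pass might want to be more selective (e.g. keep nested NPs that carry appositives).

```python
from nltk.tree import Tree

def flatten_nested_nps(tree):
    """Splice out NP (or ADJP) children that sit directly under a parent
    with the same label, lifting their children up one level.

    Note: this is deliberately aggressive and will also flatten nested
    NPs that are linguistically motivated; adjust the condition if that
    matters for your data.
    """
    if not isinstance(tree, Tree):
        return tree  # leaf (a word): nothing to do
    children = []
    for child in tree:
        child = flatten_nested_nps(child)
        if (isinstance(child, Tree)
                and child.label() == tree.label()
                and tree.label() in ("NP", "ADJP")):
            # Same-label child under NP/ADJP: lift its children up.
            children.extend(child)
        else:
            children.append(child)
    return Tree(tree.label(), children)

# The problematic biomedical-model parse from the report below:
parse = Tree.fromstring(
    "(S (NP (NP (NN Xa)) (CC and) (NP (NN Yb) (NNS proteins))) "
    "(VP (VBD were) (VP (VBN found))) (. .))")
flat = flatten_nested_nps(parse)
print(flat)
```

After flattening, the subject becomes the single-level `(NP (NN Xa) (CC and) (NN Yb) (NNS proteins))`, which the dependency converter handles as expected.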
I have found an issue with your biomedical model: it has a tendency to unnecessarily deepen NP's and ADJP's. Consider the following sentence:
Xa and Yb proteins were found .
Using the biomedical model, I get a syntax tree with an extra layer of NP's in the subject subtree: (NP (NP (NN Xa)) (CC and) (NP (NN Yb) (NNS proteins)))
As a result, when I feed this parse to the Stanford dependency converter, I do not get the correct dependency graph (the 'nn' relation between Xa and proteins is missing). If I remove the extra layer of NP's, the dependency graph comes out correct.
The WSJ model does not have this issue; it produces the correct parse tree.
Please let me know if I am mistaken or am doing something wrong that is causing the problem.