recap-utr / qualiAssistant

Identify Qualia structures in texts
GNU General Public License v3.0

What's the difference between the NN, NNS, NP "query" in the English and German CSV files? #4

Open UWong-cmyk opened 8 months ago

UWong-cmyk commented 8 months ago

Hello! I've read your paper on the German and English syntactic patterns for finding related qualia structures, and I've found the corresponding CSV files of syntactic patterns. However, I can't work out the difference between the NPs in the English qualia patterns and those in the German qualia patterns. In the English CSV files, the NP in [NN, NNS],NP appears to be required.

image

In the German one, by contrast, the NP in the similar pattern seems optional.

image

image

Why is the NP in English necessary? Thank you for your attention.

lorikdumani commented 8 months ago

Hi, well spotted. NPs in the patterns are indeed necessary for English, but much less so for German. To see why, take a look at the constituency trees. On closer inspection, the POS tagger for German only rarely yields matches containing NPs, while the POS tagger for English almost exclusively produces matches featuring NPs. Our patterns therefore reflect what we observed about how the two POS taggers behave.

However, the pattern [NP,NOUN] does not mean that NP is optional; it means that one of the two can be chosen. The match in this example must start with either NP or NOUN. I suppose what you mean is that NP could have been left out of the German patterns, since the difference in the results would probably have been negligible. Optional POS tags are enclosed in round brackets (such as (DET) or (ADJ) in the same pattern).
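The notation described above (square brackets for a choice, round brackets for an optional tag) can be sketched as a small matcher. This is only an illustration of that reading, not the qualiAssistant implementation: the function names `parse_pattern` and `matches` and the example pattern string `(DET),(ADJ),[NP,NOUN]` are assumptions made up for this sketch, and the real CSV entries may be formatted differently.

```python
import re


def parse_pattern(pattern):
    """Split a pattern string into (kind, tags) elements (hypothetical format)."""
    elements = []
    # A token is a [...] group, a (...) group, or a bare tag.
    for token in re.findall(r"\[[^\]]*\]|\([^)]*\)|[^,\s]+", pattern):
        if token.startswith("["):
            # Square brackets: exactly one of the listed tags must occur.
            tags = [t.strip() for t in token[1:-1].split(",")]
            elements.append(("choice", tags))
        elif token.startswith("("):
            # Round brackets: the tag may occur or be skipped.
            elements.append(("optional", [token[1:-1].strip()]))
        else:
            # Bare tag: must occur.
            elements.append(("required", [token]))
    return elements


def matches(pattern, tags):
    """Return True if the whole tag sequence fits the pattern."""
    def step(elems, rest):
        if not elems:
            return not rest
        (kind, allowed), tail = elems[0], elems[1:]
        # Try to consume the next tag with the current element.
        if rest and rest[0] in allowed and step(tail, rest[1:]):
            return True
        # Only optional elements may be skipped.
        return kind == "optional" and step(tail, rest)
    return step(parse_pattern(pattern), list(tags))


example = "(DET),(ADJ),[NP,NOUN]"  # assumed pattern, for illustration only
print(matches(example, ["NOUN"]))              # True
print(matches(example, ["DET", "ADJ", "NP"]))  # True
print(matches(example, ["DET", "ADJ"]))        # False
```

Under this reading, a sequence that ends without NP or NOUN is rejected, because the square-bracket group always consumes exactly one tag, while (DET) and (ADJ) can be dropped freely.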

UWong-cmyk commented 8 months ago

Thank you for your clear explanation! So in the German patterns, either NP or NOUN can be selected.

However, what still puzzles me is the English CSV file: does the pattern mean that the NP must follow [NN, NNS], rather than being one of the alternatives as in [NN, NNS, NP]? It seems that an NN or NNS can sometimes also be represented as an NP one level higher in the constituency tree. If NN or NNS is followed by an NP, is that a grammatical error?

Sorry to disturb you again.

lorikdumani commented 8 months ago

Don't worry, of course you are not disturbing anyone.

After reviewing the code, we found a bug that is responsible for this phenomenon. You are absolutely right that NN or NNS followed by an NP actually leads to a grammatical error. However, the bug is tricky because it also makes the NP optional (as you noted), i.e. as if it were inside the square brackets. Nevertheless, the NP is actually meant to be there. Long story short: we will fix the bugs as soon as possible. Many thanks for your great help in finding them!
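One way to pin down the intended reading is a small check that translates a pattern into a regular expression over the tag sequence. The sketch below assumes that `[NN, NNS],NP` is meant to require the trailing NP. It does not reproduce the bug itself, which is not described in this thread, and the helper names `pattern_to_regex` and `fits` are hypothetical rather than taken from the repository.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern string into a regex over space-terminated tags."""
    parts = []
    for token in re.findall(r"\[[^\]]*\]|\([^)]*\)|[^,\s]+", pattern):
        if token.startswith("["):
            # Choice: exactly one of the alternatives.
            alts = "|".join(re.escape(t.strip()) for t in token[1:-1].split(","))
            parts.append(f"(?:(?:{alts}) )")
        elif token.startswith("("):
            # Optional tag.
            parts.append(f"(?:{re.escape(token[1:-1].strip())} )?")
        else:
            # Required tag.
            parts.append(f"(?:{re.escape(token)} )")
    return re.compile("".join(parts))


def fits(pattern, tags):
    """Return True if the full tag sequence matches the pattern."""
    # Each tag is followed by a space so that NN cannot match a prefix of NNS.
    text = "".join(tag + " " for tag in tags)
    return pattern_to_regex(pattern).fullmatch(text) is not None


english = "[NN, NNS],NP"
print(fits(english, ["NN", "NP"]))   # True
print(fits(english, ["NNS", "NP"]))  # True
print(fits(english, ["NN"]))         # False: NP is required, not optional
print(fits(english, ["NP"]))         # False: NP is not an alternative to NN/NNS
```

A check like the last two lines would fail if the NP were treated as optional or as one of the bracketed alternatives, which is the behaviour attributed to the bug above.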