GrammaticalFramework / gf-wordnet

A WordNet in GF
https://cloud.grammaticalframework.org/wordnet/

Plurale tantum nouns #25

Open inariksit opened 4 years ago

inariksit commented 4 years ago

Do you have a policy on how to handle plurale tantum nouns? For example:

WordNetSpa.gf:lin day_off_CN = UseN (mkN "vacaciones") ; --guessed
WordNetSpa.gf:lin holiday_1_N = mkN "vacaciones" ;
WordNetSpa.gf:lin leave_1_N = mkN "vacaciones" ; --guessed
WordNetSpa.gf:lin vacation_1a_N = mkN "vacaciones" ;
WordNetSpa.gf:lin vacation_1b_N = mkN "vacaciones" ;

Currently, this produces "un vacaciones", "los vacacioneses", which is incorrect. I would like to correct all these entries like this:

oper vacación_N : N = mkN "vacación" "vacaciones" Fem ;
lin day_off_CN : CN = UseN vacación_N ;
…
lin vacation_1b_N = vacación_N ;

I perceive GF-wordnet as a low-level lexical resource, and that it's the responsibility of an application grammarian to make sure to use holiday_1_N in a plural NP. Is this the way you have thought of GF-wordnet as well, or do you have other visions? I would be okay with oper vacación_N : N = mkN "vacaciones" "vacaciones" Fem as well, with the inconvenience that we'd get "una vacaciones" which doesn't seem correct. But it's still better than "los vacacioneses".
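To make the division of labour concrete, here is a minimal sketch of what "use holiday_1_N in a plural NP" could look like on the application side. The grammar name `Travel`, the category `Phrase` and the function `TheHoliday` are hypothetical and exist only for illustration; `mkNP` and `thePl_Det` come from the RGL Syntax API. The sketch assumes that the RGL and gf-wordnet are on the GF path and that WordNetSpa can be opened like any other lexicon module. Each module goes in its own file.

```gf
-- File Travel.gf (hypothetical application grammar, not part of gf-wordnet)
abstract Travel = {
  flags startcat = Phrase ;
  cat Phrase ;
  fun TheHoliday : Phrase ;
}

-- File TravelSpa.gf
-- Assumption: WordNetSpa exports holiday_1_N as an N, as in the entries above.
concrete TravelSpa of Travel = open SyntaxSpa, WordNetSpa in {
  lincat Phrase = NP ;
  lin
    -- The application grammar always picks a plural determiner here,
    -- so the singular form of the noun is never requested.
    TheHoliday = mkNP thePl_Det holiday_1_N ;
}
```

With the corrected entry (`mkN "vacación" "vacaciones" Fem`), the intent is that this yields "las vacaciones". Nothing in the lexicon itself prevents another grammar from building a singular NP with the same noun.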

krangelov commented 4 years ago

Hi,

I have also encountered this problem but I haven't fixed it. This happens, for instance, with "money", which in English is always singular, while in Swedish and Bulgarian it is always a plural noun.

I think the right solution is to add a new category for nouns with a fixed number where the number should be specified in the concrete syntax. This is just one of the many grammar extensions which are still pending.
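A toy illustration of the idea, not the actual gf-wordnet or RGL design: the category `NFix`, the function `DefFix` and all other names below are made up for this sketch, and the grammar is self-contained and does not use the RGL. The point is only that a fixed-number noun stores its number as an inherent feature in the concrete syntax instead of inflecting for it. Each module goes in its own file.

```gf
-- File FixedNum.gf (hypothetical toy grammar)
abstract FixedNum = {
  flags startcat = NP ;
  cat
    NP ;     -- noun phrase
    N ;      -- ordinary noun, inflects for number
    NFix ;   -- noun whose number is fixed in the lexicon
  fun
    DefSg  : N -> NP ;      -- definite singular NP
    DefPl  : N -> NP ;      -- definite plural NP
    DefFix : NFix -> NP ;   -- definite NP, number taken from the noun
    house_N : N ;
    holiday_NFix : NFix ;
}

-- File FixedNumSpa.gf
concrete FixedNumSpa of FixedNum = {
  param
    Number = Sg | Pl ;
    Gender = Masc | Fem ;
  oper
    -- Spanish definite article, chosen by gender and number
    defArt : Gender -> Number -> Str = \g,n ->
      case <g,n> of {
        <Masc,Sg> => "el" ;
        <Masc,Pl> => "los" ;
        <Fem,Sg>  => "la" ;
        <Fem,Pl>  => "las"
      } ;
  lincat
    NP   = {s : Str} ;
    N    = {s : Number => Str ; g : Gender} ;     -- number is a parameter
    NFix = {s : Str ; g : Gender ; n : Number} ;  -- number is inherent
  lin
    DefSg  x = {s = defArt x.g Sg ++ x.s ! Sg} ;
    DefPl  x = {s = defArt x.g Pl ++ x.s ! Pl} ;
    DefFix x = {s = defArt x.g x.n ++ x.s} ;
    house_N = {s = table {Sg => "casa" ; Pl => "casas"} ; g = Fem} ;
    holiday_NFix = {s = "vacaciones" ; g = Fem ; n = Pl} ;
}
```

In this sketch `DefSg` and `DefPl` do not type-check on `holiday_NFix`, so a singular "vacaciones" phrase cannot be built at all. A language where the same concept is always singular would simply set `n = Sg` in its own concrete syntax.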

Otherwise, I agree with you that the WordNet should have the same status as the RGL: it should provide as good translations as possible, but we know that it will never be perfect. Therefore the user should be able to override it in the application grammar. Still, this issue with nouns like "money" is not hard to fix, and fixing it will improve the accuracy when WordNet is used for general translation.

inariksit commented 4 years ago

Alright, if you are planning grammar extensions that handle plurale tantums, then I'll wait for them! :) I'm not in any hurry; I was just playing around one day with Spanish and noticed these things. I think I will be using the WordNet lexicon a lot in the future, but so far only for English.