languagetool-org / portuguese-pos-dict

Portuguese POS tagger
GNU Lesser General Public License v2.1

some gender issues #6

Closed jaumeortola closed 7 months ago

jaumeortola commented 2 years ago

bisavô (m) bisavó (f). But what are the proper plurals? Online dictionaries are a bit confusing.
cookies (anglicism) can be both m and f?
fan (anglicism) both m and f?
gameta (BR m), gâmeta (m) is a scientific term; gameta (f) is a regionalism we can ignore?

@marcoagpinto @ricardojosehlima

marcoagpinto commented 2 years ago

@jaumeortola https://dicionario.priberam.org/bisav%C3%B3s https://dicionario.priberam.org/bisav%C3%B4s

https://dicionario.priberam.org/cookie says it is female, but all my life I have used the word as male.

fan is both m and f.

marcoagpinto commented 2 years ago

yes, right.

cookie.

Priberam says it is female but Infopedia says it is male: https://www.infopedia.pt/dicionarios/lingua-portuguesa/cookie

Infopedia is right in this case.

jaumeortola commented 2 years ago

So:

bisavô m sg
bisavó f sg
bisavôs m pl
bisavós m pl AND f pl

Is that correct?
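To make the proposal concrete, here is a minimal sketch of the four forms as lookup records. The `(form, gender, number)` layout and the `readings` helper are assumptions made up for illustration; they are not the tag format actually used in this dictionary or in added.txt.

```python
# Hypothetical record layout: (surface form, gender, number).
# This only illustrates the proposal above; it is NOT the real tag format.
ENTRIES = [
    ("bisavô", "m", "sg"),
    ("bisavó", "f", "sg"),
    ("bisavôs", "m", "pl"),
    ("bisavós", "m", "pl"),  # masculine (or mixed) plural reading
    ("bisavós", "f", "pl"),  # feminine plural reading
]


def readings(form):
    """Return every (gender, number) reading recorded for a surface form."""
    return sorted({(g, n) for f, g, n in ENTRIES if f == form})


print(readings("bisavós"))  # the ambiguous form gets two readings
print(readings("bisavôs"))  # the unambiguous masculine plural gets one
```

The point of the sketch is only that bisavós needs two entries while the other three forms need one each.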

marcoagpinto commented 2 years ago

right.

Is that the new way of tagging words in added.txt?

I believe avós is both male and female, at least spoken in Portugal.

marcoagpinto commented 2 years ago

all seems right to me.

ricardojosehlima commented 2 years ago

> bisavô (m) bisavó (f). But what are the proper plurals? Online dictionaries are a bit confusing.
> cookies (anglicism) can be both m and f?
> fan (anglicism) both m and f?
> gameta (BR m), gâmeta (m) is a scientific term; gameta (f) is a regionalism we can ignore?
>
> @marcoagpinto @ricardojosehlima

In Brazil, we use "avôs" and "bisavôs" for the male plural.
cookies is always m.
fan is not an anglicism in pt-br; we use fã, which is f and m.
gameta, AFAIK, is only a scientific term, m.

jaumeortola commented 2 years ago

Thank you. For fan/fans, there is a rule that suggests fã/fãs. We can keep or remove fan/fans from the dictionary (but if we keep them, with the mf gender). For cookie, m seems the proper gender.

> I believe avós is both male and female, at least spoken in Portugal.

I don't have a solution for this. For now, I am adding the irregular form avós (m pl).

jaumeortola commented 2 years ago

"diabetes, diabete" is feminine in PT, but f and m in BR (Michaelis)? newsletter: f PT, m BR?

marcoagpinto commented 2 years ago

@jaumeortola

https://dicionario.priberam.org/diabete

Priberam says it is male and female.

marcoagpinto commented 2 years ago

But, Infopédia says it is female: https://www.infopedia.pt/dicionarios/lingua-portuguesa/diabete

marcoagpinto commented 2 years ago

newsletter is female in Portuguese

jaumeortola commented 2 years ago

> Priberam says it is male and female.

I usually check the main dictionaries (Priberam, Infopédia, Michaelis...), so I already have this information. What I need is your opinion to decide what to do (allow both genders, create separate rules for PT/BR, and so on). For example, your opinion on "cookie" made me choose only masculine, although one dictionary said feminine.

marcoagpinto commented 2 years ago

@jaumeortola

Unfortunately, it appears that today I won't have much time to work on LanguageTool.

I had to go to my job to get some documents for my disability retirement, I have to scan them and write an e-mail to the doctor, and I need to make the Friday backups of files.

If I wake up early tomorrow, I will do some things before work, but most likely I will only get the chance on Monday, during my 5 days off.

Sorry.

ricardojosehlima commented 2 years ago

> "diabetes, diabete" is feminine in PT, but f and m in BR (Michaelis)? newsletter: f PT, m BR?

Hi @jaumeortola. In BR, diabete, whether singular or plural, can be both m and f. As for newsletter, it is definitely f!
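The options raised earlier in the thread (allow both genders, or keep separate PT/BR rules) could be sketched as a per-variant lookup. Everything here is hypothetical: the `GENDER_BY_VARIANT` table, the variant codes and both helper names are invented for illustration, and the values only mirror opinions voiced in this thread, not a final decision for the dictionary.

```python
# Hypothetical per-variant gender table. Values mirror opinions voiced in
# this thread (dictionaries disagree on some of them); not a final decision.
GENDER_BY_VARIANT = {
    "cookie":     {"pt-PT": {"m"}, "pt-BR": {"m"}},
    "newsletter": {"pt-PT": {"f"}, "pt-BR": {"f"}},
    "diabete":    {"pt-PT": {"f"}, "pt-BR": {"m", "f"}},
}


def allowed_genders(word, variant):
    """Genders accepted for `word` in `variant`; empty set if unknown."""
    return GENDER_BY_VARIANT.get(word, {}).get(variant, set())


def needs_variant_rule(word):
    """True when the variants disagree, so one shared entry is not enough."""
    per_variant = GENDER_BY_VARIANT.get(word, {})
    return len({frozenset(g) for g in per_variant.values()}) > 1


for word in GENDER_BY_VARIANT:
    print(word, needs_variant_rule(word))
```

With a table like this, only words where the variants disagree (here, diabete) would need separate PT/BR handling; the rest could share a single entry.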

jaumeortola commented 2 years ago

There is some vacillation in the gender of 'flarte' (m/f?). Is it an error? The Brazilian variant (flerte) seems to be always m.

ricardojosehlima commented 2 years ago

For Brazilian Portuguese, correct: only flerte, always m.