Closed marcoagpinto closed 10 months ago
DAN[1]: crime[crime/AQ0CS0,crime/NCMS000] -> crime[crime/AQ0CS0]
DET-NOUN_PRON-VERB[3]: crime[crime/AQ0CS0] -> crime[crime/AQ0CS0]
DAN[1]: toma[toma/NCCS000,tomar/VMIP3S0,tomar/VMM02S0] -> toma[toma/NCCS000]
We're aware that some disambiguator rules can be a little aggressive, but a lot of patterns might depend on them now, so modifying them might be risky. Is there a specific rule that this disambiguator issue is breaking?
@p-goulart
Yes, I was improving an academic rule, and the disambiguation breaks it on the new examples:
<rule id='TOMAR_ASSUMIR' name="[Universitário][Científico] V. Tomar → V. Assumir" tone_tags="academic" is_goal_specific="true">
<pattern>
<token postag='SENT_START|AQ.+|NC.+|NP.+|CS|CC' postag_regexp='yes'/>
<marker>
<token inflected='yes' regexp='yes'>tomar
<exception scope='previous' postag_regexp='yes' postag='V.+|PP.+'/>
<exception scope='previous' regexp='yes'>decis(ão|ões)</exception>
</token>
</marker>
<token min='0' max='2' postag='SPS00|(SPS00:)?[DP][ADIPRT].+|RG' postag_regexp='yes'/>
<token regexp='yes'>cert[ao]s?|determinad[ao]s?|diferentes?|divers[ao]s?|enormes?|formas?|imens[ao]s?|inúmer[ao]s|múltipl[ao]s|vári[ao]s|variad[ao]s</token>
<token postag='AQ.+|NC.+|PI.+' postag_regexp='yes'>
<exception regexp='yes' inflected='yes'>bebida|café|caneca|cerveja|chá|colher|copo|drink|frasco|garfo|garrafa|garrafão|xícara|shot|su[mc]o|vinho|gelado|sorvete|blíster|caixa|comprimido|contracetivo|embalagem|medicação|medicamento|pílula|remédio|autocarro|automóvel|avião|carrinha|carro|jato|ônibus|táxi|veículo|comboio|trem|voo|barc[ao]|bote|canoa|ferry|banho|duche</exception> <!-- Add more words as they are found -->
</token>
</pattern>
<message>Num contexto formal/científico, é preferível escrever "assumir".</message>
<suggestion><match no='2' postag='V.+' postag_regexp='yes'>assumir</match></suggestion>
<example correction="assume">O crime <marker>toma</marker> formas impensáveis.</example>
<example correction="assume">O crime <marker>toma</marker> formas diversas.</example>
<example correction="assume">O crime é perigoso e o seu financiamento <marker>toma</marker> diversas formas.</example>
<example correction="assume">O crime é perigoso e o seu financiamento <marker>toma</marker> as mais diversas formas.</example>
</rule>
Pedro,
Maybe you could improve the disambiguator only for this verb, “tomar”?
That way it would be less risky on a global scale.
Sorry, this isn't a priority right now. I can take care of this issue at some point later, but if you're keen on seeing this working, why don't you dive into the disambiguator? It's pretty much like editing rules.
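For illustration, here is a minimal, untested sketch of what such a disambiguator rule could look like, assuming the usual `disambiguation.xml` rule syntax. The rule id, name, and context (a noun, then “toma”, then “formas”) are hypothetical and not taken from the existing Portuguese file. Since the log shows DAN already dropping the verb reading of “toma”, a filter like this would presumably need to run before DAN (rule order matters in the disambiguator), or DAN itself would need an exception.

```xml
<!-- Hypothetical rule: id, name and context are illustrative only -->
<rule id="TOMA_VERB_BEFORE_FORMAS" name="noun + toma + formas: keep verb reading of 'toma'">
  <pattern>
    <!-- Preceding token must still have a noun reading, e.g. "crime" -->
    <token postag="NC.+" postag_regexp="yes"/>
    <marker>
      <token>toma</token>
    </marker>
    <token regexp="yes">formas?</token>
  </pattern>
  <!-- Keep only the verbal readings of the marked token -->
  <disambig action="filter" postag="V.+"/>
</rule>
```

This only covers cases where “formas” directly follows the verb, as in “O crime toma formas impensáveis.” Sentences such as “toma diversas formas” would need a wider context.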
Pedro, the last time I touched the disambiguator, several years ago, I “screwed” it up.
I would rather not risk it.
I will ask for the help of Jaume: @jaumeortola
Heya, Jaume, can you help?
Thanks!
The fact this didn’t work when you tried several years ago doesn’t mean it won’t work when you try a second time! Have a little faith in yourself ;)
I suggest you give it another go, add us as reviewers, and we’ll have a look later. Whatever broke last time won't break this time, because we'll be here to prevent anything catastrophic from happening.
Sure, I will try it tonight.
You are right, “my lack of faith is disturbing” (Star Wars).
Heya, @p-goulart and @susanaboatto
“toma” is tagged as a noun instead of a verb in:
O crime toma formas impensáveis.
Thanks!