acoli-repo / acoli-dicts

3000+ machine-readable open source dictionaries distributed by the Applied Computational Linguistics lab at the University of Augsburg, Germany, and by the research group Linked Open Dictionaries (LiODi, funded 2015-2020 by BMBF at Goethe University Frankfurt, Germany). All data provided in OntoLex-Lemon and TIAD-TSV.
Apache License 2.0

Apertium RDF - different PoS in translations #9

Open jogracia opened 4 years ago

jogracia commented 4 years ago

A substantial number of translations link words with different parts of speech. There are 105k+ unique such entries (counting bidirectional edges only once). However, most of these (104k+) are matches between Noun and ProperNoun. Moreover, from a quick scroll it seems that these noun <-> properNoun matches arise because some proper nouns are misclassified as nouns.

In general, one can assume (in Apertium) the same PoS on both sides of a translation (and if there were exceptions, they should be marginal), so this is definitely an anomaly.

I am attaching a text file with the detected cases. [Credit: The issue was initially reported and the file created by Shashwat Goel in the context of a Google Summer of Code project] DiffPOS.txt
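
For reference, the counting described above (entries with different PoS, each bidirectional edge counted once) could be sketched roughly as follows. This is not the script that produced DiffPOS.txt; the tuple layout `(word, lang, pos)`, the function name, the reversed "share"/"compartir" pair and the "casa"/"house" pair are assumptions made only for illustration.

```python
from collections import Counter

def count_pos_mismatches(translations):
    """Count translation edges whose two sides carry different PoS.

    `translations` is an iterable of (source, target) pairs, where each side
    is a (word, lang, pos) tuple. An edge and its reverse count only once.
    """
    seen_edges = set()
    counts = Counter()
    for source, target in translations:
        if source[2] == target[2]:
            continue  # same PoS on both sides: not a mismatch
        edge = frozenset((source, target))  # direction-independent key
        if edge in seen_edges:
            continue  # reverse direction of an edge already counted
        seen_edges.add(edge)
        counts[tuple(sorted((source[2], target[2])))] += 1
    return counts

translations = [
    (("compartir", "es", "verb"), ("share", "en", "noun")),
    (("share", "en", "noun"), ("compartir", "es", "verb")),  # reverse edge, illustrative
    (("sleeve", "en", "noun"), ("manga", "es", "adjective")),
    (("casa", "es", "noun"), ("house", "en", "noun")),  # same PoS, illustrative
]
mismatches = count_pos_mismatches(translations)
print(sum(mismatches.values()), dict(mismatches))
```

Keying the counter on the sorted PoS pair means noun <-> properNoun and properNoun <-> noun end up in the same bucket.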

chiarcos commented 4 years ago

Actually, the Apertium parts of speech are a complicated matter, because
(if I understand correctly how Apertium works) they aren't actually meant
to be parts of speech, but triggers for bilingual morphological
transformation rules. They can happen to represent parts of speech, but
they can also be somewhat idiosyncratic. And even if they appear to be
parts of speech, it is possible that the rules they represent apply to a
broader (or smaller) group of words than what normally would be considered
to fall under a particular part of speech.

So, proper noun - noun "mismatches" may originate from language pairs
where the treatment of proper nouns is morphologically identical to that
of nouns.

I strongly advise against changing anything in the Apertium data here
because we cannot control the effects that such a change has if this data
is then used again in Apertium.

So, we have the following options:
(a) "fix" the data here but make sure that it won't flow back into Apertium,
(b) fix the data in Apertium (by issues or pull requests against their GitHub repos), or
(c) live with the current imperfections.

The problem primarily arises because we want to (explicitly or implicitly)
merge single-language dictionaries of multiple bidictionaries, but these
bidictionaries differ in their language-specific definitions in Apertium
for different language pairs. As you're working on the integration of the
TIAD technology and this data in Apertium, (a) is not an option. Option
(b) means a lot of work on the Apertium side, and (c) means that the
implicit merging we currently apply is not possible. I put Francis Tyers
in CC to ask for the Apertium perspective on that issue.

As a possible solution, we can refrain from implicitly merging monolingual
dictionaries and assert identity (owl:sameAs) or near-identity
(skos:broader, etc.) between different lexemes in the OntoLex data.
Without touching the source data, this seems to be the only viable option.
The current practice (implicit merging) induces a certain level of noise
as you correctly pointed out. (I can actually live with that, too, but it
can affect downstream applications, e.g. because of unexpected
duplicates.)
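
A minimal sketch of that linking idea, assuming each bidictionary keeps its own lexical entry per lemma: entries with the same written form and language get an explicit owl:sameAs or skos:broader link instead of being merged. The example.org URIs, the language pairs, the `IS_A` table and the rule for choosing between the two properties are hypothetical and only illustrate the principle.

```python
from itertools import combinations

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"

# Assumed subsumption between PoS labels (specific -> general).
IS_A = {"properNoun": "noun"}

def link_entries(entries):
    """Link entries sharing (written form, language) without merging them.

    `entries` is a list of (uri, form, lang, pos) tuples, one per bidictionary.
    Returns (subject, predicate, object) triples.
    """
    by_lemma = {}
    for uri, form, lang, pos in entries:
        by_lemma.setdefault((form, lang), []).append((uri, pos))
    triples = []
    for group in by_lemma.values():
        for (uri_a, pos_a), (uri_b, pos_b) in combinations(group, 2):
            if pos_a == pos_b:
                triples.append((uri_a, OWL_SAME_AS, uri_b))
            elif IS_A.get(pos_a) == pos_b:
                triples.append((uri_a, SKOS_BROADER, uri_b))  # b is the broader entry
            elif IS_A.get(pos_b) == pos_a:
                triples.append((uri_b, SKOS_BROADER, uri_a))  # a is the broader entry
            # otherwise: leave the entries unlinked
    return triples

entries = [
    ("http://example.org/en-es/London-en", "London", "en", "properNoun"),
    ("http://example.org/en-ca/London-en", "London", "en", "noun"),
    ("http://example.org/en-gl/London-en", "London", "en", "properNoun"),
]
links = link_entries(entries)
for s, p, o in links:
    print(f"<{s}> <{p}> <{o}> .")  # N-Triples style output
```

Entries with unrelated PoS stay unlinked here, so downstream applications can decide for themselves whether to treat them as duplicates.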

BTW: This may be one use case that calls for multiple parts of speech per
lexical entry in OntoLex.

jogracia commented 4 years ago

Hi Christian,

I think that implicit merging at the lexicon level has many more advantages than disadvantages, so I'd opt for option "c". But I am not sure that the problem lies in the way Apertium dictionaries are built rather than in the RDF conversion. We have to find out where the issue really is (maybe at both ends!). We need a closer inspection of some cases, starting from the source data, to confirm or rule out any issue with the initial RDF conversion scripts, the later mapping into lexinfo, or both.

chiarcos commented 4 years ago

Actually, much of this seems to be granularity issues rather than actual errors, cf.

| count | PoS 1 | PoS 2 | error or mismatch? | source of error | comment |
|---|---|---|---|---|---|
| 103936 | properNoun | noun | mismatch | granularity | properNoun isA noun |
| 85 | reflexivePersonalPronoun | pronoun | mismatch | granularity | reflexivePersonalPronoun isA pronoun |
| 75 | relativePronoun | pronoun | mismatch | granularity | relativePronoun isA pronoun |
| 68 | verb | presentParticipleAdjective | mismatch | granularity | presentParticipleAdjective isA verb |
| 142 | verb | adjective | mismatch | overlap | presentParticipleAdjective isA (verb and adjective) |
| 48 | relativePronoun | noun | error? | | typically, pronouns are nominal and can be treated as such |
| 12 | pronoun | noun | error? | | typically, pronouns are nominal and can be treated as such |
| 33 | relativePronoun | adverb | error? | | adverbs and relativePronouns can both serve to mark subordination |
| 8 | pronoun | adverb | error? | | pronouns and adverbs may overlap, e.g., in pronominal adverbs (German "deswegen", "daher") |
| 3 | personalPronoun | determiner | error? | | demonstratives and pronouns may have the same form, e.g., German "die" (definite article and demonstrative pronoun) |
| 1 | verb | noun | error? | | nominalization? |
| 3 | noun | adverb | error? | | adverbs and nouns may be formally identical, e.g., German "Ernst" |
| 1 | noun | adjective | error? | | adverbs and nouns may be formally identical, e.g., German "ernst" |
| 248 | verb | pronoun | error? | | verbs can carry pronominal clitics |
| 4 | pronoun | presentParticipleAdjective | error? | | verbs can carry pronominal clitics |
| 4 | presentParticipleAdjective | modal | error? | | verbs can carry pronominal clitics |
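
The distinction between granularity, overlap and possible errors could be automated roughly like this. It is a sketch that encodes only the isA statements from the table; the `PARENTS` dictionary, the function names and the "match" label are assumptions, and a real implementation would more likely read the hierarchy from lexinfo.

```python
# isA relations as stated in the table (specific tag -> more general tags).
PARENTS = {
    "properNoun": {"noun"},
    "reflexivePersonalPronoun": {"pronoun"},
    "relativePronoun": {"pronoun"},
    "presentParticipleAdjective": {"verb", "adjective"},
}

def ancestors(tag):
    """Return every tag that `tag` is (transitively) a kind of."""
    found = set()
    stack = list(PARENTS.get(tag, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS.get(parent, ()))
    return found

def classify(pos_a, pos_b):
    """Label a PoS pair as match, granularity, overlap or error?."""
    if pos_a == pos_b:
        return "match"
    if pos_a in ancestors(pos_b) or pos_b in ancestors(pos_a):
        return "granularity"  # one tag is a refinement of the other
    if any({pos_a, pos_b} <= ancestors(tag) for tag in PARENTS):
        return "overlap"  # both tags share a more specific subtype
    return "error?"  # needs manual inspection

for pair in [("properNoun", "noun"), ("verb", "adjective"), ("verb", "pronoun")]:
    print(pair, classify(*pair))
```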

chiarcos commented 4 years ago

prepping error analysis:

lang1 lang2 word1 word2
relativePronoun-noun
ca oc "el qual"-relativePronoun-ca "lo que"-noun-oc
ca oc "el que"-relativePronoun-ca "lo que"-noun-oc
ca oc "el qui"-relativePronoun-ca "lo que"-noun-oc
ca oc "el qui"-relativePronoun-ca "lo qui"-noun-oc
en es "the one which"-relativePronoun-en "el que"-noun-es
en es "the one who"-relativePronoun-en "el que"-noun-es
en es "the one who"-relativePronoun-en "quien"-noun-es
en es "the ones that"-relativePronoun-en "el que"-noun-es
en es "the ones who"-relativePronoun-en "el que"-noun-es
en es "those that"-relativePronoun-en "cuanto"-noun-es
en es "those that"-relativePronoun-en "el que"-noun-es
en es "those who"-relativePronoun-en "quien"-noun-es
en es "what"-relativePronoun-en "cuanto"-noun-es
en es "what"-relativePronoun-en "lo que"-noun-es
en es "which"-relativePronoun-en "el cual"-noun-es
en es "who"-relativePronoun-en "quien"-noun-es
en es "whom"-relativePronoun-en "quien"-noun-es
en es "whose"-relativePronoun-en "cuyo"-noun-es
eo en "tiuj kiuj"-relativePronoun-eo "the ones that"-noun-en
eo en "tiuj kiuj"-relativePronoun-eo "the ones who"-noun-en
es en "cuanto"-relativePronoun-es "those that"-noun-en
es en "cuanto"-relativePronoun-es "what"-noun-en
es en "el cual"-relativePronoun-es "which"-noun-en
es en "el que"-relativePronoun-es "the one which"-noun-en
es en "el que"-relativePronoun-es "the one who"-noun-en
es en "el que"-relativePronoun-es "the ones that"-noun-en
es en "el que"-relativePronoun-es "the ones who"-noun-en
es en "el que"-relativePronoun-es "those that"-noun-en
es en "lo cual"-relativePronoun-es "which"-noun-en
es en "lo que"-relativePronoun-es "what"-noun-en
es en "que"-relativePronoun-es "which"-noun-en
es en "quien"-relativePronoun-es "the one who"-noun-en
es en "quien"-relativePronoun-es "those who"-noun-en
es oc "quien"-relativePronoun-es "qui"-noun-oc
fr ca "combien"-relativePronoun-fr "quant"-noun-ca
fr ca "qui"-relativePronoun-fr "qui"-noun-ca
fr es "celui dont"-relativePronoun-fr "aquel del que"-noun-es
fr es "celui que"-relativePronoun-fr "el que"-noun-es
fr es "celui qui"-relativePronoun-fr "el que"-noun-es
fr es "qui"-relativePronoun-fr "quien"-noun-es
fr oc "celui que"-relativePronoun-fr "aquel qui"-noun-oc
fr oc "celui que"-relativePronoun-fr "lo que"-noun-oc
fr oc "celui que"-relativePronoun-fr "lo qui"-noun-oc
fr oc "celui qui"-relativePronoun-fr "aquel qui"-noun-oc
fr oc "celui qui"-relativePronoun-fr "lo que"-noun-oc
fr oc "celui qui"-relativePronoun-fr "lo qui"-noun-oc
fr oc "qui"-relativePronoun-fr "qui"-noun-oc
oc es "qui"-relativePronoun-oc "quien"-noun-es
pronoun-noun
ca en "el qual"-pronoun-ca "which"-noun-en
ca en "el que"-pronoun-ca "the one which"-noun-en
ca en "el que"-pronoun-ca "the ones that"-noun-en
ca en "el que"-pronoun-ca "those that"-noun-en
ca en "el que"-pronoun-ca "what"-noun-en
ca en "què"-pronoun-ca "what"-noun-en
ca en "que"-pronoun-ca "which"-noun-en
ca en "què"-pronoun-ca "which"-noun-en
ca en "quin"-pronoun-ca "what"-noun-en
ca en "quin"-pronoun-ca "which"-noun-en
es en "cuál"-pronoun-es "which"-noun-en
es en "qué"-pronoun-es "what"-noun-en
relativePronoun-adverb
ca fr "quan"-relativePronoun-ca "quand"-adverb-fr
ca fr "quant"-relativePronoun-ca "combien"-adverb-fr
ca oc "on"-relativePronoun-ca "ont"-adverb-oc
en eo "where"-relativePronoun-en "kie"-adverb-eo
en es "those that"-relativePronoun-en "cuanto"-adverb-es
en es "what"-relativePronoun-en "cuanto"-adverb-es
en es "where"-relativePronoun-en "adonde"-adverb-es
en es "where"-relativePronoun-en "donde"-adverb-es
en es "wherein"-relativePronoun-en "donde"-adverb-es
eo en "kie"-relativePronoun-eo "where"-adverb-en
eo es "kiam"-relativePronoun-eo "cuando"-adverb-es
eo es "kie"-relativePronoun-eo "donde"-adverb-es
eo es "kie"-relativePronoun-eo "en donde"-adverb-es
eo es "kiel"-relativePronoun-eo "como"-adverb-es
eo es "kiom"-relativePronoun-eo "cuanto"-adverb-es
es en "adonde"-relativePronoun-es "where"-adverb-en
es en "donde"-relativePronoun-es "where"-adverb-en
es eo "como"-relativePronoun-es "kiel"-adverb-eo
es eo "cuando"-relativePronoun-es "kiam"-adverb-eo
es eo "cuanto"-relativePronoun-es "kiom"-adverb-eo
es eo "donde"-relativePronoun-es "kie"-adverb-eo
es eo "en donde"-relativePronoun-es "kie"-adverb-eo
es fr "como"-relativePronoun-es "comme"-adverb-fr
es fr "cual"-relativePronoun-es "comme"-adverb-fr
es fr "cuanto"-relativePronoun-es "combien"-adverb-fr
fr ca "quand"-relativePronoun-fr "quan"-adverb-ca
fr es "combien"-relativePronoun-fr "cuanto"-adverb-es
fr es "comme"-relativePronoun-fr "cual"-adverb-es
fr oc "combien"-relativePronoun-fr "quant"-adverb-oc
fr oc "où"-relativePronoun-fr "ont"-adverb-oc
oc es "coma"-relativePronoun-oc "cual"-adverb-es
oc fr "ont"-relativePronoun-oc "où"-adverb-fr
oc fr "quant"-relativePronoun-oc "combien"-adverb-fr
pronoun-adverb
ca oc "hi"-pronoun-ca "i"-adverb-oc
eo fr "pri tio"-pronoun-eo "en"-adverb-fr
oc ca "çai"-pronoun-oc "hi"-adverb-ca
oc ca "i"-pronoun-oc "hi"-adverb-ca
oc ca "lai"-pronoun-oc "hi"-adverb-ca
oc fr "en"-pronoun-oc "en"-adverb-fr
oc fr "i"-pronoun-oc "y"-adverb-fr
oc fr "ne"-pronoun-oc "en"-adverb-fr
personalPronoun-determiner
ca oc "en"-personalPronoun-ca "en"-determiner-oc
ca oc "en"-personalPronoun-ca "eth"-determiner-oc
ca oc "en"-personalPronoun-ca "lo"-determiner-oc
verb-noun
es en "compartir"-verb-es "share"-noun-en
noun-adverb
ca fr "quant"-noun-ca "combien"-adverb-fr
en es "those that"-noun-en "cuanto"-adverb-es
en es "what"-noun-en "cuanto"-adverb-es
noun-adjective
en es "sleeve"-noun-en "manga"-adjective-es
verb-pronoun
en es "acclimatise"-verb-en "aclimatarse"-pronoun-es
en es "adapt"-verb-en "amoldarse"-pronoun-es
en es "agglomerate"-verb-en "aglomerarse"-pronoun-es
en es "alight"-verb-en "apearse"-pronoun-es
en es "alight"-verb-en "posarse"-pronoun-es
en es "amalgamate"-verb-en "amalgamarse"-pronoun-es
en es "apologise"-verb-en "disculparse"-pronoun-es
en es "approach"-verb-en "acercarse"-pronoun-es
en es "ask after"-verb-en "interesarse"-pronoun-es
en es "be called"-verb-en "apellidarse"-pronoun-es
en es "be damaged by hail"-verb-en "apedrearse"-pronoun-es
en es "be due to"-verb-en "deberse a"-pronoun-es
en es "be in perfect condition"-verb-en "mantenerse en perfectas condiciones"-pronoun-es
en es "be intimidated"-verb-en "acobardarse"-pronoun-es
en es "be offended"-verb-en "agraviarse"-pronoun-es
en es "be overwhelmed"-verb-en "anonadarse"-pronoun-es
en es "be ruined"-verb-en "arruinarse"-pronoun-es
en es "be scared"-verb-en "amedrentarse"-pronoun-es
en es "be skilful"-verb-en "amañarse"-pronoun-es
en es "be supported by"-verb-en "mantenerse con"-pronoun-es
en es "be supported"-verb-en "mantenerse"-pronoun-es
en es "be too clever by half"-verb-en "pasarse de listo"-pronoun-es
en es "become affected"-verb-en "amanerarse"-pronoun-es
en es "become Americanised"-verb-en "americanizarse"-pronoun-es
en es "become angry"-verb-en "enfadarse"-pronoun-es
en es "become attached to"-verb-en "apegarse a"-pronoun-es
en es "become brutalised"-verb-en "animalizarse"-pronoun-es
en es "become distressed"-verb-en "acongojarse"-pronoun-es
en es "become friends with"-verb-en "amistarse con"-pronoun-es
en es "become friends"-verb-en "amigarse"-pronoun-es
en es "become friends"-verb-en "amistarse"-pronoun-es
en es "become interested"-verb-en "interesarse"-pronoun-es
en es "become sleepy"-verb-en "adormecerse"-pronoun-es
en es "bend down"-verb-en "agacharse"-pronoun-es
en es "bustle about"-verb-en "ajetrearse"-pronoun-es
en es "calm down"-verb-en "amansarse"-pronoun-es
en es "care"-verb-en "preocuparse"-pronoun-es
en es "cling"-verb-en "aferrarse"-pronoun-es
en es "collapse"-verb-en "desplomarse"-pronoun-es
en es "come away"-verb-en "apartarse"-pronoun-es
en es "come off"-verb-en "desprenderse"-pronoun-es
en es "complain"-verb-en "quejarse"-pronoun-es
en es "concern"-verb-en "preocuparse"-pronoun-es
en es "congregate"-verb-en "congregarse"-pronoun-es
en es "creep"-verb-en "arrastrarse"-pronoun-es
en es "dare"-verb-en "atreverse"-pronoun-es
en es "deteriorate"-verb-en "deteriorarse"-pronoun-es
en es "dissolve into tears"-verb-en "anegarse en llanto"-pronoun-es
en es "dive"-verb-en "zambullirse"-pronoun-es
en es "dress up"-verb-en "acicalarse"-pronoun-es
en es "dress up"-verb-en "disfrazarse"-pronoun-es
en es "drown"-verb-en "ahogarse"-pronoun-es
en es "dwell on"-verb-en "obsesionarse con"-pronoun-es
en es "dwell upon"-verb-en "obsesionarse con"-pronoun-es
en es "edge away from"-verb-en "alejarse poco a poco de"-pronoun-es
en es "elaborate on"-verb-en "extenderse sobre"-pronoun-es
en es "escape"-verb-en "evadirse"-pronoun-es
en es "excuse"-verb-en "disculparse"-pronoun-es
en es "extend"-verb-en "extenderse"-pronoun-es
en es "face up to"-verb-en "apechugarse con"-pronoun-es
en es "fade away"-verb-en "desvanecerse"-pronoun-es
en es "fade"-verb-en "apagarse"-pronoun-es
en es "feel dizzy"-verb-en "marearse"-pronoun-es
en es "flood"-verb-en "anegarse"-pronoun-es
en es "fork"-verb-en "bifurcarse"-pronoun-es
en es "get a complex"-verb-en "acomplejarse"-pronoun-es
en es "get along with"-verb-en "llevarse bien con"-pronoun-es
en es "get along"-verb-en "llevarse bien"-pronoun-es
en es "get angry"-verb-en "cabrearse"-pronoun-es
en es "get angry"-verb-en "enfadarse"-pronoun-es
en es "get anxious about"-verb-en "angustiarse por"-pronoun-es
en es "get anxious"-verb-en "angustiarse"-pronoun-es
en es "get away"-verb-en "escaparse"-pronoun-es
en es "get bored"-verb-en "amuermarse"-pronoun-es
en es "get dirty"-verb-en "ensuciarse"-pronoun-es
en es "get dressed"-verb-en "vestirse"-pronoun-es
en es "get excited"-verb-en "apasionarse"-pronoun-es
en es "get flu"-verb-en "agriparse"-pronoun-es
en es "get in the quarter finals"-verb-en "meterse en cuartos de final"-pronoun-es
en es "get in the quarter finals"-verb-en "meterse en los cuartos de final"-pronoun-es
en es "get in trouble"-verb-en "meterse en problemas"-pronoun-es
en es "get into knots"-verb-en "anudarse"-pronoun-es
en es "get married"-verb-en "casarse"-pronoun-es
en es "get much bigger"-verb-en "agigantarse"-pronoun-es
en es "get obsessed with"-verb-en "obsesionarse con"-pronoun-es
en es "get obsessed"-verb-en "obsesionarse"-pronoun-es
en es "get off"-verb-en "apearse"-pronoun-es
en es "get ready"-verb-en "aparejarse"-pronoun-es
en es "get ready"-verb-en "prepararse"-pronoun-es
en es "get smaller"-verb-en "achicarse"-pronoun-es
en es "get tight"-verb-en "amarrarse"-pronoun-es
en es "get tired"-verb-en "cansarse"-pronoun-es
en es "get up"-verb-en "levantarse"-pronoun-es
en es "get wrinkled"-verb-en "ajarse"-pronoun-es
en es "give off"-verb-en "desprenderse"-pronoun-es
en es "go African"-verb-en "africanizarse"-pronoun-es
en es "go crazy"-verb-en "alocarse"-pronoun-es
en es "go French"-verb-en "afrancesarse"-pronoun-es
en es "go into action"-verb-en "ponerse en acción"-pronoun-es
en es "go over to the enemy"-verb-en "pasarse al enemigo"-pronoun-es
en es "go to bed"-verb-en "acostarse"-pronoun-es
en es "go to rack and ruin"-verb-en "echarse a perder"-pronoun-es
en es "go to rack and ruin"-verb-en "venirse abajo"-pronoun-es
en es "go too far"-verb-en "pasarse de la raya"-pronoun-es
en es "grieve about"-verb-en "apenarse de"-pronoun-es
en es "grieve about"-verb-en "apenarse por"-pronoun-es
en es "grieve"-verb-en "apenarse"-pronoun-es
en es "grow drowsy"-verb-en "aletargarse"-pronoun-es
en es "have a lump in the throat"-verb-en "anudarse la voz"-pronoun-es
en es "have a roll in the hay"-verb-en "darse un revolcón"-pronoun-es
en es "have a roll in the hay"-verb-en "pegarse un revolcón"-pronoun-es
en es "have an accident"-verb-en "accidentarse"-pronoun-es
en es "head to"-verb-en "dirigirse a"-pronoun-es
en es "heal"-verb-en "curarse"-pronoun-es
en es "hide"-verb-en "agazaparse"-pronoun-es
en es "hold on to"-verb-en "agarrarse a"-pronoun-es
en es "hold on to"-verb-en "agarrarse de"-pronoun-es
en es "hold on"-verb-en "agarrarse"-pronoun-es
en es "hurry"-verb-en "darse prisa"-pronoun-es
en es "increase"-verb-en "acrecentarse"-pronoun-es
en es "interest"-verb-en "interesarse"-pronoun-es
en es "keep fit"-verb-en "mantenerse en forma"-pronoun-es
en es "keep in touch"-verb-en "mantenerse en contacto"-pronoun-es
en es "keep out of the reach of children"-verb-en "mantenerse fuera del alcance de los niños"-pronoun-es
en es "keep to the left"-verb-en "mantenerse a la izquierda"-pronoun-es
en es "keep to the right"-verb-en "mantenerse a la derecha"-pronoun-es
en es "keep up to date"-verb-en "mantenerse al día"-pronoun-es
en es "lean out"-verb-en "asomarse"-pronoun-es
en es "live on"-verb-en "mantenerse a base de"-pronoun-es
en es "make a mistake"-verb-en "equivocarse"-pronoun-es
en es "make angry"-verb-en "enfadarse"-pronoun-es
en es "manage"-verb-en "agenciarse"-pronoun-es
en es "manage"-verb-en "apañarse"-pronoun-es
en es "marvel"-verb-en "maravillarse"-pronoun-es
en es "mate"-verb-en "aparearse"-pronoun-es
en es "meddle"-verb-en "inmiscuirse"-pronoun-es
en es "mingle"-verb-en "mezclarse"-pronoun-es
en es "mix"-verb-en "mezclarse"-pronoun-es
en es "move aside"-verb-en "apartarse"-pronoun-es
en es "move away"-verb-en "alejarse"-pronoun-es
en es "move over"-verb-en "apartarse"-pronoun-es
en es "move over"-verb-en "moverse"-pronoun-es
en es "move"-verb-en "moverse"-pronoun-es
en es "narrow"-verb-en "angostarse"-pronoun-es
en es "near"-verb-en "acercarse"-pronoun-es
en es "overwhelm"-verb-en "agobiarse"-pronoun-es
en es "pass off as"-verb-en "hacerse pasar por"-pronoun-es
en es "play it cool"-verb-en "alivianarse"-pronoun-es
en es "plummet"-verb-en "desplomarse"-pronoun-es
en es "prepare"-verb-en "prepararse"-pronoun-es
en es "rack brain"-verb-en "estrujarse el meollo"-pronoun-es
en es "rack brain"-verb-en "estrujarse la mollera"-pronoun-es
en es "realise"-verb-en "darse cuenta"-pronoun-es
en es "rebel"-verb-en "rebelarse"-pronoun-es
en es "remain awake"-verb-en "mantenerse despierto"-pronoun-es
en es "remain in existence"-verb-en "mantenerse en vigor"-pronoun-es
en es "remain in force"-verb-en "mantenerse en vigor"-pronoun-es
en es "remain stable"-verb-en "mantenerse estable"-pronoun-es
en es "resemble"-verb-en "parecerse a"-pronoun-es
en es "resent"-verb-en "molestarse"-pronoun-es
en es "retire"-verb-en "retirarse"-pronoun-es
en es "riot"-verb-en "amotinarse"-pronoun-es
en es "rise up"-verb-en "sublevarse"-pronoun-es
en es "run away"-verb-en "ahuyentarse"-pronoun-es
en es "run for"-verb-en "mantenerse en cartelera"-pronoun-es
en es "rush forward"-verb-en "abalanzarse"-pronoun-es
en es "rush"-verb-en "apresurarse"-pronoun-es
en es "seek protection"-verb-en "ampararse"-pronoun-es
en es "settle"-verb-en "asentarse"-pronoun-es
en es "settle"-verb-en "posarse"-pronoun-es
en es "shudder"-verb-en "estremecerse"-pronoun-es
en es "sink to the bottom"-verb-en "irse al fondo"-pronoun-es
en es "sink"-verb-en "hundirse"-pronoun-es
en es "slump down"-verb-en "hundirse"-pronoun-es
en es "slump"-verb-en "desplomarse"-pronoun-es
en es "speed up"-verb-en "agilizarse"-pronoun-es
en es "spend the whole day"-verb-en "pasarse todo el día"-pronoun-es
en es "stand up"-verb-en "mantenerse en pie"-pronoun-es
en es "stay"-verb-en "quedarse"-pronoun-es
en es "steady"-verb-en "estabilizarse"-pronoun-es
en es "steel"-verb-en "acorazarse"-pronoun-es
en es "stick to"-verb-en "aferrarse a"-pronoun-es
en es "stiffen"-verb-en "agarrotarse"-pronoun-es
en es "stop feeling dizzy"-verb-en "pasarse el mareo"-pronoun-es
en es "store in a cool dry place"-verb-en "mantenerse en un lugar fresco y seco"-pronoun-es
en es "support"-verb-en "solidarizarse con"-pronoun-es
en es "support"-verb-en "solidarizarse"-pronoun-es
en es "surrender"-verb-en "rendirse"-pronoun-es
en es "take refuge"-verb-en "refugiarse"-pronoun-es
en es "take responsibility"-verb-en "responsabilizarse"-pronoun-es
en es "take the air"-verb-en "airearse"-pronoun-es
en es "tidy up"-verb-en "adecentarse"-pronoun-es
en es "tire"-verb-en "cansarse"-pronoun-es
en es "train"-verb-en "adiestrarse"-pronoun-es
en es "train"-verb-en "capacitarse"-pronoun-es
en es "turn sour"-verb-en "agriarse"-pronoun-es
en es "use all its skill"-verb-en "emplearse a fondo"-pronoun-es
en es "wobble"-verb-en "tambalearse"-pronoun-es
en es "wonder"-verb-en "preguntarse"-pronoun-es
en es "worry"-verb-en "preocuparse"-pronoun-es
fr ca "amaigrir"-verb-fr "aprimar"-pronoun-ca
fr ca "aventurer"-verb-fr "arriscar"-pronoun-ca
fr ca "avoir honte"-verb-fr "avergonyir"-pronoun-ca
fr ca "bénéficier"-verb-fr "beneficiar"-pronoun-ca
fr ca "bouger"-verb-fr "moure"-pronoun-ca
fr ca "bourgeonner"-verb-fr "florir"-pronoun-ca
fr ca "cibler"-verb-fr "orientar a"-pronoun-ca
fr ca "débarrasser"-verb-fr "desfer"-pronoun-ca
fr ca "décéder"-verb-fr "morir"-pronoun-ca
fr ca "défaire"-verb-fr "desfer"-pronoun-ca
fr ca "enamourer"-verb-fr "enamorar"-pronoun-ca
fr ca "enfiler"-verb-fr "enfilar"-pronoun-ca
fr ca "entretenir"-verb-fr "mantenir"-pronoun-ca
fr ca "épouser"-verb-fr "casar amb"-pronoun-ca
fr ca "être entretenu"-verb-fr "mantenir"-pronoun-ca
fr ca "être entretenue"-verb-fr "mantenir"-pronoun-ca
fr ca "être entretenues"-verb-fr "mantenir"-pronoun-ca
fr ca "être entretenus"-verb-fr "mantenir"-pronoun-ca
fr ca "faire honte"-verb-fr "avergonyir"-pronoun-ca
fr ca "fleurir"-verb-fr "florir"-pronoun-ca
fr ca "fondre"-verb-fr "desfer"-pronoun-ca
fr ca "frémir"-verb-fr "estremir"-pronoun-ca
fr ca "grimper"-verb-fr "enfilar"-pronoun-ca
fr ca "grumeler"-verb-fr "agrumollar"-pronoun-ca
fr ca "hasarder"-verb-fr "arriscar"-pronoun-ca
fr ca "infiltrer"-verb-fr "infiltrar"-pronoun-ca
fr ca "maigrir"-verb-fr "aprimar"-pronoun-ca
fr ca "maintenir"-verb-fr "mantenir"-pronoun-ca
fr ca "mettre à la retraite"-verb-fr "jubilar"-pronoun-ca
fr ca "mettre en exergue"-verb-fr "posar com a explicació"-pronoun-ca
fr ca "moisir"-verb-fr "florir"-pronoun-ca
fr ca "mourir"-verb-fr "morir"-pronoun-ca
fr ca "mouvoir"-verb-fr "moure"-pronoun-ca
fr ca "nouyauter"-verb-fr "infiltrar"-pronoun-ca
fr ca "noyauter"-verb-fr "infiltrar"-pronoun-ca
fr ca "partir en retraite"-verb-fr "jubilar"-pronoun-ca
fr ca "pourrir"-verb-fr "podrir"-pronoun-ca
fr ca "prendre sa retraite"-verb-fr "jubilar"-pronoun-ca
fr ca "rejoindre"-verb-fr "unir a"-pronoun-ca
fr ca "remuer"-verb-fr "moure"-pronoun-ca
fr ca "rendre amoureuse"-verb-fr "enamorar"-pronoun-ca
fr ca "rendre amoureuses"-verb-fr "enamorar"-pronoun-ca
fr ca "rendre amoureux"-verb-fr "enamorar"-pronoun-ca
fr ca "risquer"-verb-fr "arriscar"-pronoun-ca
fr ca "tomber amoureuse"-verb-fr "enamorar"-pronoun-ca
fr ca "tomber amoureuses"-verb-fr "enamorar"-pronoun-ca
fr ca "tomber amoureux"-verb-fr "enamorar"-pronoun-ca
fr ca "tressaillir"-verb-fr "estremir"-pronoun-ca
pronoun-presentParticipleAdjective
es en "cansarse"-pronoun-es "tire"-presentParticipleAdjective-en
es en "hundirse"-pronoun-es "sink"-presentParticipleAdjective-en
es en "solidarizarse con"-pronoun-es "support"-presentParticipleAdjective-en
es en "solidarizarse"-pronoun-es "support"-presentParticipleAdjective-en
presentParticipleAdjective-modal
en ca "have to"-presentParticipleAdjective-en "haver de"-modal-ca
en es "have to"-presentParticipleAdjective-en "deber"-modal-es
en es "have to"-presentParticipleAdjective-en "haber de"-modal-es
en es "have to"-presentParticipleAdjective-en "tener que"-modal-es

jubosgil commented 4 years ago

Hi. Thank you for this analysis. Building on Christian's table above, I have taken these pairs and compared them across source data > intermediate RDF > Apertium list of tags + mapping table > SPARQL endpoint to locate the source of the difference.

Most differences are due to:
a) a wrong mapping of the Apertium tags apertium:nn and apertium:pron. My mapping for subclasses of proper nouns was too broad as well. The mapping table has now been corrected using Shashwat Goel's hints and the Apertium documentation.
b) entries that show more than one PoS in the source data, which is reflected in the RDF.
c) a combination of a) and b).

I have uploaded a detailed report with my analysis of these different pairs (noun vs. properNoun, pronoun vs. noun, etc.) with examples from the source data.
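
As a quick check for cause b), entries that carry more than one PoS can be listed directly from the pair lines in the error analysis above. This is only a sketch: the regular expression assumes the `"word"-pos-lang` notation used in that listing, the function name is made up, and the source dictionaries themselves would need their own parser.

```python
import re
from collections import defaultdict

# Matches one side of a pair line, e.g. "cuanto"-noun-es
ENTRY = re.compile(r'"([^"]+)"-(\w+)-(\w+)')

def multi_pos_entries(lines):
    """Return {(form, lang): set of PoS} for entries seen with several PoS."""
    pos_by_entry = defaultdict(set)
    for line in lines:
        for form, pos, lang in ENTRY.findall(line):
            pos_by_entry[(form, lang)].add(pos)
    return {key: tags for key, tags in pos_by_entry.items() if len(tags) > 1}

# Four lines copied from the error analysis listing.
sample = [
    'en es "what"-relativePronoun-en "cuanto"-noun-es',
    'en es "what"-relativePronoun-en "cuanto"-adverb-es',
    'es en "cuanto"-relativePronoun-es "what"-noun-en',
    'es en "compartir"-verb-es "share"-noun-en',
]
result = multi_pos_entries(sample)
for (form, lang), tags in sorted(result.items()):
    print(form, lang, sorted(tags))
```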