Many similarity metrics use a lexical semantic resource for computing similarities, e.g. WordNet or Wiktionary. These resources are loaded using DKPro LSR (https://github.com/dkpro/lsr).
We can replace DKPro LSR with UBY to be able to use more resources and potentially combine information from different resources. This requires something like a "semantic" interoperability layer.
The following text is collected from a discussion between Torsten and Judith:
By "semantic" interoperability layer, I meant that LSR was mainly designed
for use in semantic relatedness computation, which is probably a different
reading of "semantics" than what you had in mind.
LSR makes quite strong assumptions about what counts as an entity, a relation,
etc., i.e. it sometimes redefines the semantics of, e.g., what a synonymy
relation is.
This is mainly done for Wikipedia, though, as the other resources are more
alike.
In Wikipedia, for example, we define article redirects to be synonyms.
My proposal is to replace (for all resources where this makes sense) the
current wrapper that relies on the native API with one that uses the Uby
API.
> In Wikipedia, e.g. we define article redirects to be synonyms
The converter for Uby-Wikipedia sets the redirects to RELATED:
> My proposal is to replace (for all resources where this makes sense) the
> current wrapper that relies on the native API with one that uses the Uby
> API.
I looked into WordNet, GermaNet, and Wiktionary, and this all looks feasible.
Actually, this is an interesting exercise that might improve the UBY API.
In most cases (not Wikipedia, GermaNet), LSR could then also use Uby
databases packaged as Maven artifacts.
Even OpenThesaurus might be wrappable in the near future, as Christian M.
recently completed a Uby converter for it.
However, replacing the wrappers with the UBY API will take some time - also
depending on who will perform the changes.
I see at least 3 tasks:
- possibly some questions in order to get the LSR - UBY mapping right
- adapting the wrappers
- testing the changes
Altogether, I estimate 2 days for an experienced Uby developer, which is a lot.
Alternatively, how about changing the wrappers one after the other, resource by
resource?
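
The redirect example from the discussion (synonyms in LSR, RELATED in the Uby-Wikipedia converter) suggests that the interoperability layer needs a per-resource translation of relation labels. The following is only a minimal sketch of that idea, not the LSR or Uby API: `LsrRelation`, `LabelledEdge` and `RelationMapping` are hypothetical names, the edges are plain in-memory objects rather than results of a Uby query, and reading every Wikipedia RELATED edge as a synonym is an assumption that would have to be checked as one of the mapping questions listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Relation types as the relatedness code sees them (hypothetical, not the LSR enum). */
enum LsrRelation { SYNONYM, HYPERNYM, HYPONYM, RELATED }

/** One labelled edge as a Uby-like backend might return it (hypothetical shape). */
final class LabelledEdge {
    final String label;   // backend label, e.g. "RELATED"
    final String target;  // target entity, e.g. an article title
    LabelledEdge(String label, String target) {
        this.label = label;
        this.target = target;
    }
}

/** Per-resource translation from backend labels to LsrRelation. */
final class RelationMapping {
    private final Map<String, LsrRelation> defaults = new HashMap<>();
    private final Map<String, Map<String, LsrRelation>> overrides = new HashMap<>();

    RelationMapping() {
        // Default reading: a label means the same thing on both sides.
        defaults.put("SYNONYM", LsrRelation.SYNONYM);
        defaults.put("HYPERNYM", LsrRelation.HYPERNYM);
        defaults.put("HYPONYM", LsrRelation.HYPONYM);
        defaults.put("RELATED", LsrRelation.RELATED);
    }

    /** Registers a resource-specific reading of a backend label. */
    RelationMapping override(String resource, String label, LsrRelation relation) {
        overrides.computeIfAbsent(resource, r -> new HashMap<>()).put(label, relation);
        return this;
    }

    /** Resolves a label: a resource override wins, then the default; unknown labels fail loudly. */
    LsrRelation resolve(String resource, String label) {
        Map<String, LsrRelation> perResource = overrides.get(resource);
        if (perResource != null && perResource.containsKey(label)) {
            return perResource.get(label);
        }
        LsrRelation relation = defaults.get(label);
        if (relation == null) {
            throw new IllegalArgumentException(
                    "No mapping for label " + label + " in resource " + resource);
        }
        return relation;
    }

    /** Returns the targets of all edges that resolve to the wanted relation. */
    List<String> targets(String resource, List<LabelledEdge> edges, LsrRelation wanted) {
        List<String> result = new ArrayList<>();
        for (LabelledEdge edge : edges) {
            if (resolve(resource, edge.label) == wanted) {
                result.add(edge.target);
            }
        }
        return result;
    }
}

final class MappingDemo {
    public static void main(String[] args) {
        // Assumption for illustration: Wikipedia RELATED edges are read as synonyms.
        RelationMapping mapping = new RelationMapping()
                .override("Wikipedia", "RELATED", LsrRelation.SYNONYM);
        List<LabelledEdge> edges = Arrays.asList(
                new LabelledEdge("RELATED", "Automobile"),
                new LabelledEdge("HYPERNYM", "Vehicle"));
        System.out.println(mapping.targets("Wikipedia", edges, LsrRelation.SYNONYM)); // [Automobile]
        System.out.println(mapping.targets("WordNet", edges, LsrRelation.SYNONYM));   // []
    }
}
```

Keeping the overrides keyed by resource fits the resource-by-resource migration suggested above: a wrapper can be switched to the Uby backend once its own overrides are agreed on, while the other resources keep the default reading.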