danbri opened this issue 9 years ago
Notes from IRC,
Here is how mapping can be done on the Wikidata side for example: https://www.wikidata.org/wiki/Property:P31
The JSON dumps are the best dumps.
+1
Happy to help here a little! I had a chance to meet a few people from the Wikidata crew during 31C3 and remember that serving Turtle also needs some fixing... but it already uses schema.org quite a lot!
$ curl http://www.wikidata.org/entity/Q80 -iL -H "Accept: text/turtle"
I went looking for the code that generates this. For those without turtle, an excerpt from running
curl http://www.wikidata.org/entity/Q42 -iL -H "Accept: text/turtle"
(full response is at https://gist.github.com/danbri/66616096d42e595376f6 )
[update] Hmm, actually you can get it all in the browser without using content negotiation, just via suffixes:
( edit! I have moved a big chunk of text to https://gist.github.com/danbri/181ff7763f479c397e10 - apologies to those who got accidental notifications due to the '@' symbol.)
This is great but also unfortunately "the easy part" in that these are fixed built-in properties that each Wikidata entry will always carry.
Looking around for relevant source code,
It would be interesting to see how addEntityMetaData might be amended to exploit equivalentProperty information in Wikidata, as @lydiapintscher mentioned re https://www.wikidata.org/wiki/Property:P31
I agree, "Schema.org should have mappings to Wikidata terms where possible". How do we vote? Or how can we collaborate and/or check work in progress? Is there a link about this work in this issue?
@danbri please remember to fence code snippets with three backticks, which can also include a clue for syntax highlighting:
```ttl
# code goes here; @bg @dr @mr
@prefix data: <http://www.wikidata.org/wiki/Special:EntityData/> .
@prefix schema: <http://schema.org/> .
# no mentions using @foo
```
also see code tab in Examples of github markdown https://guides.github.com/features/mastering-markdown/#examples
@ppKrauss I think people would appreciate more machine readable mappings using owl:equivalentProperty etc. e.g. https://github.com/schemaorg/schemaorg/blob/d370e33a97654746e696973c7966b84b501a59dc/data/schema.rdfa#L5706
IMO we could consider everything from subset of OWL used by RDFa Vocabulary Entailment http://www.w3.org/TR/rdfa-syntax/#s_vocab_expansion
@elf-pavlik thanks (!), so the issue now is only to add something like
<link property="owl:equivalentProperty" href="http://WikiDataURL"/>
into each rdf:Property and each rdfs:Class ... is that it?
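To make this concrete, a hypothetical entry in schema.rdfa with the extra link might look like the sketch below. The markup pattern follows the existing external-equivalence links in schema.rdfa; the comment text and the target Wikidata URL are illustrative, not a settled mapping:

```html
<div typeof="rdf:Property" resource="http://schema.org/author">
  <span class="h" property="rdfs:label">author</span>
  <span property="rdfs:comment">The author of this content.</span>
  <!-- proposed addition: machine-readable mapping (target URL illustrative) -->
  <link property="owl:equivalentProperty" href="http://www.wikidata.org/entity/Q482980" />
</div>
```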
New suggestion: we could collaborate via an online interface or (initially) via a spreadsheet (e.g. Excel) at GitHub, with the columns wikidataID and Property, or wikidataID and Class.
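A minimal sketch of that spreadsheet layout (pure illustration; the single pair shown, Organization and Q43229, is one discussed later in this thread, and the file layout is a guess at what the columns would contain):

```python
import csv
import io

# Proposed two-column layout: wikidataID plus the schema.org Class (or Property).
rows = [
    {"wikidataID": "Q43229", "Class": "http://schema.org/Organization"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["wikidataID", "Class"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

The same rows could of course live in a Google Doc or Excel sheet; CSV is just the simplest interchange form for checking the mapping into version control.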
Why not add it directly in Wikidata?
@lydiapintscher, perhaps I am not understanding your point, sorry... The objective of this issue is to map Schema.org's definitions onto Wikidata.org's concept definitions, not the inverse.
Both should happen, no? ;-)
@lydiapintscher, I think it is a matter of scope. You can imagine Wikidata as an (external and closed) dictionary, like Webster, not an open project like Wikipedia.
Wikidata is just as open as Wikipedia.
Peter, 22/02/2015 18:39:
wikipedia.org concept definitions
Does such a thing exist?
@lydiapintscher once schema.org URIs have mappings to Wikidata URIs added, do you see a way to add them to Wikidata in a programmable way? IMO it doesn't make sense to do it manually via the web UI... maybe the Wikidata team could just import them from schema.rdfa?
BTW I'll stay around Berlin for most of March and could meet IRL with you and anyone else from Wikidata interested in this issue... Whenever in Berlin I go to #OKLab / CodeForBerlin every Monday evening at Wikimedia HQ anyway :smile: (we can discuss details over PM - just see my GH profile)
I am trying (with bad English) to consolidate this issue into a draft proposal; can you help?
A next step will be to create a Readme.md for everybody to edit this text, perhaps with the #352 mechanism, and (phase 1) implement some examples "by hand" in schema.rdfa.
Foundations collected from comments posted in this discussion:

- "addEntityMetaData might be amended to exploit equivalentProperty information in Wikidata" (@danbri).
- Add `<link property="owl:equivalentProperty" href="http://WikiDataURL"/>` into each rdfs:Class and each rdf:Property resource definition (the equivalentProperty target is the same as shown in the Property:P31 example of @lydiapintscher).

Proposal: enhance the schema.rdfa definition descriptors (rdfs:comment) and semantics, mapping each vocabulary item to a Wikidata item.
A sibling project at Wikidata will be the Wikidata.org-to-Schema.org mapping.
PART 1 - Schema.org mapping to Wikidata

Actions: add `<link property="{$OWL}" href="{$WikiDataURL}"/>` with the correct $WikiDataURL:

- for each rdfs:Class, add the `<link>` tag with $OWL="owl:equivalentClass" or, when not possible, use $OWL="rdfs:subClassOf";
- for each rdf:Property, add the `<link>` tag with $OWL="owl:equivalentProperty" or, when not possible, use $OWL="rdfs:subPropertyOf".

Actions in the testing phase: do some with no automation. Example: start with the classes Person and Organization, and their properties.
Examples:

- owl:equivalentClass to Q43229
- owl:equivalentClass to Q515
- owl:equivalentClass to Q215627
- owl:equivalentClass to Q319608

PART 2 - Wikidata mapping to Schema.org

... under construction... see similar mappings at schema.rdfs.org/mappings.html... Wikidata also has a lot of initiatives mapping Wikidata to external vocabularies (e.g. there is a mapping from Wikidata to the BNCF Thesaurus)...
@lydiapintscher, sorry again... I had not seen that there is also a proposal for a "sibling project at Wikidata" (!)... Can you please check whether my "draft of this proposal" text is now on the rails? I am trying to "translate" and consolidate all comments into one document... so that we all start with the same scope, objective, etc.
@danbri, @elf-pavlik, and others, I do not understand whether there is a "formal procedure for creating proposals" here...
Can you please check whether my "draft of this proposal" text is now on the rails? I need your help to "translate" and consolidate it.
About automation, I still do not understand well what you want to automate. My opinion: I think we can start with non-automated procedures, which will be useful to check the automated ones that may be introduced later... or to check the "size" of the non-automated task (~1000 items!). I think that a reliable mapping needs human control.
@ppKrauss thanks for trying to summarize this thread into a proposal!
http://schema.org/Organization is owl:equivalentProperty to Q43229
please don't confuse owl:equivalentClass with owl:equivalentProperty
if you look at schema.rdfa we accordingly need:
- typeof="rdfs:Class" needs owl:equivalentClass or rdfs:subClassOf
- typeof="rdf:Property" needs owl:equivalentProperty or rdfs:subPropertyOf
for the automation, once we map one way schema.org -> wikidata (however we manage to do it) then we can automate importing most of that mapping into wikidata so no one needs to click and copy&paste...
Last but not least, schema.org only started using GitHub recently and also seems to be going through various other processes; I would encourage you to stay patient and give people time to reply :smile:
Thanks all. Indeed I'm on a trip and can't currently give this the attention it deserves, but I would try to nudge the focus towards actual mappings and away from the specific implementation details at schema.org. We will be making some changes in the site tooling to support mechanisms for extension that may be relevant here.
How about we just jump into the details and start a spreadsheet with a table of schema.org types and properties? Eg on google docs...?
@elf-pavlik thanks (!), I edited with your correction (and am now copying it also to my issue280 "ahead of work" :-)
@danbri OK, I sent it to this googleDoc and updated my #352 with the tool that generates the spreadsheet.
@elf-pavlik and @danbri, no urgency (!). As a novice here, I am experimenting with/testing the collaboration possibilities, and studying schema.org as a project... Now that I have a better "schema.org big picture", I see good work (!) by the moderators and a vibrant community. My only help/clue about "better GitHub use" is at #352, and it is perhaps still a little messy.
Returning to the spreadsheet: there are ~1500 items (!)... A good starting point is the classes Person and Organization; the "vCard semantic" is the most used on the Web,
http://webdatacommons.org/structureddata/index.html#toc2
so I am starting to work with them (Person and Organization)... Is that OK, a good starting point?
Thanks. Yes, starting with the most general / common types makes sense.
Where I got stuck: I could not figure out a good programmatic way to access Wikidata's schema information in all its richness.
Maybe there is a way to take the JSON dumps, load them into some fast-access NoSQL-ish database, so that things can be searched/matched/retrieved easily?
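Before reaching for a database, the JSON dumps can simply be streamed: the dump is one entity per line inside a top-level JSON array. The sketch below (an illustration, not tied to any store) parses one such line and pulls out the pieces most useful for mapping work; the field names (labels, claims, mainsnak, datavalue) follow the published Wikibase JSON format:

```python
import json

def parse_dump_line(line: str):
    """Parse one line of the Wikidata JSON dump (one entity per line,
    wrapped in a top-level array, with trailing commas)."""
    line = line.strip().rstrip(",")
    if line in ("[", "]", ""):
        return None  # array delimiters, not entities
    return json.loads(line)

def summarize(entity: dict) -> dict:
    """Extract id, English label, and the ids of 'instance of' (P31) values."""
    label = entity.get("labels", {}).get("en", {}).get("value")
    p31 = [
        claim["mainsnak"]["datavalue"]["value"]["id"]
        for claim in entity.get("claims", {}).get("P31", [])
        if claim["mainsnak"].get("snaktype") == "value"
    ]
    return {"id": entity["id"], "label": label, "instance_of": p31}

# Tiny inline sample standing in for a real dump line:
sample = ('{"id": "Q42", "type": "item", '
          '"labels": {"en": {"language": "en", "value": "Douglas Adams"}}, '
          '"claims": {"P31": [{"mainsnak": {"snaktype": "value", '
          '"datavalue": {"type": "wikibase-entityid", '
          '"value": {"entity-type": "item", "id": "Q5"}}}}]}},')

print(summarize(parse_dump_line(sample)))
```

Iterating this over the (decompressed) dump file and writing the summaries into any key-value or document store would give the fast lookup by Q-id that the mapping work needs.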
nearby: https://gist.github.com/chrpr/23926c4650ce4363c51b dumps DBpedia's vocab (not Wikidata, but worth a look for comparison)
Wikidata provides RDF dumps here: http://tools.wmflabs.org/wikidata-exports/rdf/exports/20150126/
It is easy to get the classes from the wikidata-taxonomy dump, but it needs to be joined with the wikidata-terms dump to get the labels. For properties you can use the wikidata-properties dump.
If you want something more fine-grained you can try the Wikidata Toolkit: https://github.com/Wikidata/Wikidata-Toolkit
Or create a DBpedia extractor, we have experimental support for wikidata in this branch: https://github.com/alismayilov/extraction-framework/tree/wikidataAllCommits
RDF dumps can be directly loaded into a SPARQL endpoint, or easily manipulated in CLI/code and loaded into any store.
OK, phase 1 completed! In this phase we could only use "by hand" procedures... My basic work was on
schema.rdfa.htm
... for more details (while the corresponding fork is pending) see here.
I finished my first test with a report/edit/rewrite "by hand" process... And some new (minor) problems were evidenced, a kind of normalization demand: "`<link>` vs `<span><a>`" also seems to be a normalization problem. My suggestion is to show all the links transparently to the crowd, so format links with the `span` template. About item 2, countings: `<span>Subclass of: ...</span>` vs `<link>`.

Question (perhaps for @elf-pavlik, but no urgency!): can I adopt the `span` templates instead of the simple `link` tag, and convert all the residual `<link .../>` also to `span`?
Starting phase 2: let's discuss and check the automation possibilities!
(Meanwhile, anybody can increase the volume of Wikidata links at the GoogleDoc spreadsheet of phase 1.)
The first step here is to discuss reality, which is summarized by the "schema_org_rdfa profile" (see #361).
gozer release profile (countings):

- `link` tag (nLinks): 105
- `link` countings:
- `span` countings:
- `a` countings:

AUTOMATION OPPORTUNITIES:
1. Propagating as a semantic subset: this is valid for specific items, e.g. to say "addressLocality is a semantic subset of PostalAddress", where we can propagate the WikidataID (e.g. as rdfs:subPropertyOf); but not for broader items such as Thing. There are 663 (!), so we can expect some automation here... The first step is to indicate (we can add a column to the spreadsheet) which items are "broader" (and so cannot be used as semantic super-classes for a WikidataID).
1.1. Inheriting semantics: every Property inherits the semantics of its parent Class, so it is also a kind of "semantic subset" (and again we need to exclude the "broader" cases)... Are there other indirect situations in the graph? We must mark all elected cases for exclusion (later) from the spreadsheet.
2. `link`s relating semantic definitions in external vocabularies: see nLinks and countings with 'owl:equivalentClass', 'rdfs:subPropertyOf' and 'owl:equivalentProperty'. Perhaps 'dc:source', but it adds only 24 more.

The mapping from Schema.org types to Wikidata conceptual items seems very interesting. How is it going? I can see there has been no comment for a while. If applicable, I would like to join the effort of mapping between these two. :)
ps. I found it hard to get the meaning of Wikidata class concepts (schema-level, not instance-level) as they use Qxxxxxx (non-intuitive) terms for their conceptual items. Is there any tip for figuring out what the Qxxxxxxes mean in usual words?
Hello @boanuge, welcome to this initiative! It is not abandoned... Would you like to collaborate here, with the GoogleDoc spreadsheet of phase 1?
Perhaps we (you and I, at this moment) need to show "more and better results" to restart this proposal... So you can also help here with an extension of the spreadsheet... Then, later, when we have a "critical mass" of results, we will return here.
About your PS: no, Qxxxxxx is a Wikidata project decision; opaque identifiers have some advantages.
For a human the label and description should define the meaning in words so far as it needs to be disambiguated from other concepts. Just look the Qxxxxxx up on wikidata.org or use its API to get the label and description in your favorite language.
@ppKrauss Thank you very much. I will see what I can do. :) @JanZerebecki Thank you for the comment. I had hoped there was a nice one-page view of each Qxxxxxx term with label and description, such as schema.rdfa, instead of looking them up one by one (there are too many Qxxxxxx to go through :). Any comments about how Wikidata generates their items are appreciated.
There are too many items (item = Qxxxxxx) to list them all (currently more than 13 million). These are edited manually and in automated ways, see https://www.wikidata.org/wiki/Wikidata:Introduction for more information. Note that there are also properties. There is a list of all properties: https://www.wikidata.org/wiki/Wikidata:List_of_properties/all .
Example: https://www.wikidata.org/wiki/Q25169#P50 tells us: "The Hitchhiker's Guide to the Galaxy" (item Q25169) its author (property P50) is Douglas Adams (item Q42). https://www.wikidata.org/wiki/Property:P50#P1629 tells us that the property author (P50) is for the subject (P1629) item author (Q482980).
Maybe it is more useful to map to Wikidata properties instead of Wikidata items. https://schema.org/author would map to https://www.wikidata.org/wiki/Property:P50 .
Note that people already use Wikidata.org itself to do this mapping, like is done on https://www.wikidata.org/wiki/Property:P18#P1628 which says the Wikidata property image (P18) is equivalent to http://schema.org/image . These could be exported and added to schema.org which would ensure that the mapping is actually symmetric.
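That export step could be a single query against the Wikidata SPARQL endpoint. An untested sketch, selecting every Wikidata property that declares a schema.org equivalent via P1628:

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# All Wikidata properties declaring equivalence (P1628) to a schema.org term
SELECT ?wdProperty ?schemaTerm WHERE {
  ?wdProperty wdt:P1628 ?schemaTerm .
  FILTER(STRSTARTS(STR(?schemaTerm), "http://schema.org/"))
}
```

The result set could then be turned into the corresponding `owl:equivalentProperty` links in schema.rdfa, keeping the two sides of the mapping symmetric.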
Yes the idea is purely to map the descriptive vocabulary (hundreds or low thousands of mainly types/properties), not millions of items.
@danbri then update the issue title... instead of "wikidata terms", "wikidata properties"
@JanZerebecki, as stated by Dan, "the idea is purely to map the descriptive vocabulary"; it is a Schema.org-to-Wikidata map, and Schema.org has at most ~1500 items, see countings above...
The main objective is to complement the poor/imprecise descriptions (rdfs:comment) of Schema.org.
(also @thadguidry) About properties like P50: in my opinion, they are like "internal database descriptors" of Wikidata, while the Qxxxxxx are the entries for Wikipedia concepts. So the "author" concept is not P50, it is Q482980... The Qxxxxxx concepts are more stable and complete.
PS: the properties can generate cyclic references for Schema.org.
https://en.wikipedia.org/wiki/Ontology_alignment
So there are entity (class, property) resolutions (and disambiguation trees (w/ information gain))?
Mapping to wiki is a plus -Among others. The health extension is now experimenting mapping concepts to defined concepts in healthcare standards and terminologies like SNOMED CT but also to RxNorm, LOINC, and ICD.
Quick update to make sure everyone is aware that Wikidata has a SPARQL endpoint now; linked from https://www.wikidata.org/wiki/Wikidata:Data_access#SPARQL_endpoints
Rather related: http://addshore.com/2015/12/wikidata-references-from-microdata/ from @addshore
I've been looking into what Wikidata could look like as an external schema.org extension. Perhaps something like this (don't worry about the big header; eventually it would be hidden behind a simple URL). It would be good if the corresponding triples were as close as possible to those in the Wikidata SPARQL endpoint.
<script type="application/ld+json">
{
"@context": {
"@vocab": "http://schema.org/",
"wd_lnbIdentifier": {"@id": "https://www.wikidata.org/entity/P1368" },
"wd_countryOfCitizenship": {"@id": "https://www.wikidata.org/entity/P27" , "@type": "@id"},
"wd_religion": {"@id": "https://www.wikidata.org/entity/P140", "@type": "@id"},
"wd_nativeLanguage": {"@id": "https://www.wikidata.org/entity/P103", "@type": "@id"}
},
"@type": "Person",
"@id": "https://www.wikidata.org/entity/Q42",
"name": "Douglas Adams",
"wd_lnbIdentifier": "000057405",
"wd_countryOfCitizenship":
{
"@type": "Country",
"@id": "https://www.wikidata.org/entity/Q145",
"name": "United Kingdom"
},
"wd_religion": {
"@id": "https://www.wikidata.org/entity/Q7066",
"name": "atheism"
},
"wd_nativeLanguage": {
"@type": "Language",
"@id": "https://www.wikidata.org/entity/Q7979",
"name": "British English"
}
}
</script>
@vrandezo and I have been exploring this some more.
For now, just a SPARQL query to try at query.wikidata.org
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?property ?ptype ?label ?extsuper ?extsub ?extequiv
WHERE {?property a wikibase:Property; rdfs:label ?label; wikibase:propertyType ?ptype .
OPTIONAL { ?property wdt:P2235 ?extsuper . }
OPTIONAL { ?property wdt:P2236 ?extsub . }
OPTIONAL { ?property wdt:P1628 ?extequiv . }
FILTER( REGEX(STR(?extequiv), "schema.org") ||
REGEX(STR(?extsub), "schema.org") ||
REGEX(STR(?extsuper), "schema.org") )
FILTER(LANG(?label) = "en")}
... this shows that Wikidata itself can be used as a registry of mappings to/from schema.org terms :)
The approach of using Wikidata to hold these mappings looks worth exploring further.
Here is another one (thanks to Denny):
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# which properties are most commonly found on things that are
# 'instance of' (P31) the 'Cat' type (Q146)?
SELECT ?prop (count(?prop) as ?count) WHERE {
?i wdt:P31 wd:Q146 .
?i ?prop ?val .
FILTER(STRSTARTS(STR(?prop), "http://www.wikidata.org/prop/direct/"))
} group by ?prop order by desc(?count)
# TODO:
# - figure out how to get the rdfs:label of these
# - figure out how to handle v common types like human (Q5), can we sample e.g. 1000 items only?
... any help with the last parts gratefully received :)
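One possible answer to both TODOs, as an untested sketch: the query service's label service (SERVICE wikibase:label) can supply English labels, and wikibase:directClaim maps each wdt: predicate back to its property entity so that it can be labelled; a LIMIT in a subquery caps very common types like human (Q5) at a sample:

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?p ?pLabel (COUNT(*) AS ?count) WHERE {
  # sample at most 1000 instances of 'human' (Q5)
  { SELECT ?i WHERE { ?i wdt:P31 wd:Q5 . } LIMIT 1000 }
  ?i ?claim ?val .
  # map the wdt: predicate back to its property entity, so it gets a label
  ?p wikibase:directClaim ?claim .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?p ?pLabel
ORDER BY DESC(?count)
```

Swapping wd:Q5 for wd:Q146 should reproduce the cat query above, with labels.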
Here is the complementary query, which finds most common properties whose value is something 'instance of' 'Cat'.
The query is written more compactly here, and has the same issues/problems as noted above:
SELECT ?prop (count(?prop) as ?count) WHERE {
# some thing with some property that is some item, where that ...
?x ?prop ?i .
# item instanceOf Cat.
?i <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q146> .
FILTER(STRSTARTS(STR(?prop), "http://www.wikidata.org/prop/direct/"))
} group by ?prop order by desc(?count)
This corresponds loosely to the notion of properties whose http://schema.org/rangeIncludes is the type "Cat":
To compare, here are top results for the earlier query, i.e. properties whose domainIncludes the type "Cat". In other words, properties commonly found on items that are cats. Here is the earlier query in more compact form:
SELECT ?prop (count(?prop) as ?count) WHERE {
?i ?prop ?x . # some item has some property whose value x is an item, where that ...
# item instanceOf Cat.
?i <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q146> .
FILTER(STRSTARTS(STR(?prop), "http://www.wikidata.org/prop/direct/"))
} group by ?prop order by desc(?count)
Currently this gives 45 results, the most common properties (from 68 cats in wikidata) being:
Both Wikidata and schema.org vocabularies have a relatively loose, flexible and evolving association between types and properties; Wikidata even more so. While schema.org lists a current set of incoming and outgoing properties on each type, often adjusting these over time, Wikidata does not formally do this at all. There are currently some non-machine-readable notes on the relevant talk pages but nothing exposed via RDF/SPARQL. Consequently we need to mine this information from actual descriptions (such as the 68 cat descriptions in Wikidata) to get a sense of the emergent structure. This process also gives a feel for the "long tail" of property definitions that exists in Wikidata and which we can now re-use within schema.org descriptions across the Web.
We can use this to explore the data. For example, we see that one of the most common ways in which Wikidata references the Cat type is via property P161, 'cast member'. Who are these famous acting cats?
SELECT * WHERE {
?x <http://www.wikidata.org/prop/direct/P161> ?i ;
# ?x with a 'cast member' that is some thing ?i...
<http://www.w3.org/2000/01/rdf-schema#label> ?label .
# where ?i is 'instance of' the 'Cat' type:
?i <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q146>;
<http://www.w3.org/2000/01/rdf-schema#label> ?catname .
FILTER(LANG(?label) = "en")
FILTER(LANG(?catname) = "en")
}
From this we learn, amongst other things, of a famous cat actor, Orangey (http://www.wikidata.org/entity/Q677525) that starred in several works including versions of Breakfast at Tiffany's, The Diary of Anne Frank, Village of the Giants. The creature has an IMDB page, if you are curious: http://www.imdb.com/name/nm1248838/ . If you scan that page for [embedded schema.org](https://developers.google.com/structured-data/testing-tool/?url=http://www.imdb.com/name/nm1248838/) you can find out more about Orangey expressed as schema.org, including an image, a jobTitle of "Actor", and a description ("Orangey the Cat is the only feline double-winner of the Patsy Award, the animal kingdom's equivalent of the Oscar. "...).
For completeness, let's look at outgoing properties of Cat too. Let's see well known cat ownership relationships. Try this in http://query.wikidata.org:
SELECT * WHERE {
# ?c 'owned by' ?o, where ?c is a Cat:
?c <http://www.wikidata.org/prop/direct/P127> ?o .
?c <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q146> .
?c <http://www.w3.org/2000/01/rdf-schema#label> ?catname .
?o <http://www.w3.org/2000/01/rdf-schema#label> ?ownername.
FILTER(LANG(?ownername) = "en")
FILTER(LANG(?catname) = "en")
}
... you'll find Socks and Bill Clinton; India, owned by George W. and Laura Bush; Humphrey owned by the Cabinet Office etc.
Having got this far, there are a few things yet to investigate:
@danbri, interesting direction to go:
As you suggest, vocabulary range information can also be used to create a CSVW datatype definitions for mapping CSV tables to JSON or RDF with appropriate datatype fidelity.
Note that what's most useful for both CSVW and the JSON-LD context is the property ranges, but inferring a WikiData RDFS definition from SPARQL queries seems pretty useful.
Thanks @danbri and @gkellogg, good 2016 restart of work and results!
I imagine that you are looking for a SPARQL algorithm that can automatically recognize, for each Schema.org item, the equivalent Wikidata item... Well, we can start with a sample of consensual item pairs, to check and/or discuss the behaviour of the proposed algorithms. Examples:
Are these correspondences (1-4) consensual? Is each one really a semantic equivalence relationship? How will the algorithm obtain these pairs?
Perhaps we can adapt this basic map algorithm as a first tool for query.wikidata.
PS: Wikidata "could look like an external schema.org extension" as shown here and here, but we could also conclude that Wikidata is a good replacement for Schema.org :-) I will stay using Schema.
Peter, 24/01/2016 15:42:
Are these correspondences (1-4) consensual?
To find out, add a https://www.wikidata.org/wiki/Property:P1709 statement on the Wikidata property/entity and see what happens!
Hi @nemobis, thanks, can you express your idea in a query at Wikidata? A query that demonstrates to us that an external concept (e.g. Schema.org's author) is equivalent to a Wikidata item (Q482980 in this example)?... The problem is how to define each external concept in a generic SPARQL query.
The equivalence operator has two meanings here in this issue (#280): owl:equivalentClass or rdfs:subClassOf for classes, and owl:equivalentProperty or rdfs:subPropertyOf for properties; and, on the Wikidata side, extequiv or extsuper or extsub (in the queries above), plus your suggestion of using P1709.

About my use of the term "consensual": it is not about the equivalence operator, it is about our "human understanding" and a community agreement (consensus) on the understanding of each one (me, you and any others discussing here). Do you agree with my concept matching, at the 1-4 listed pairs?
PS: the more Wikidata items there are in the handmade sample set, the greater the difficulty of checking or reaching consensus... So we need to start with good consensus before closing the sample set. A homologated sample set is fundamental for testing and discussing any kind of algorithm here.
Oops, it is important to distinguish some types of algorithms (approaches)... Queries such as
FILTER( REGEX(STR(?extequiv), "schema.org") || REGEX(STR(?extsub), "schema.org") || REGEX(STR(?extsuper), "schema.org") )
make use of Wikidata's existing associations. We can check and/or audit previous human work registering pairs... As @danbri commented, "Wikidata itself can be used as a registry of mappings to/from schema.org terms".
From Lydia Pintscher in https://twitter.com/nightrose/status/558549091844886528
Update 2016-01-26 - since the original post there have been some improvements at both Wikidata and Schema.org:
Nearby