dbpedia / extraction-framework

The software used to extract structured data from Wikipedia

dbpedia should use true wikidata URLs #293

Closed VladimirAlexiev closed 9 years ago

VladimirAlexiev commented 9 years ago

DBpedia has sameAs links to Wikidata. However, the Wikidata URLs are mangled.

E.g. http://dbpedia.org/page/Fredericksburg_Hotspur has an owl:sameAs link to http://wikidata.dbpedia.org/resource/Q5499200 (which does not resolve).

The correct URL is http://www.wikidata.org/entity/Q5499200.
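To make the expected mapping concrete, here is a minimal Python sketch that rewrites a mangled URI into the Wikidata entity form quoted above. It is only an illustration, not the extraction framework's code: the helper name `to_wikidata_entity_uri` is hypothetical, and it assumes that property IDs (P...) should be mapped the same way as item IDs (Q...).

```python
import re

# Hypothetical helper, not part of the DBpedia extraction framework.
# Matches the mangled pattern reported in this issue and captures the ID.
_MANGLED = re.compile(r"^http://wikidata\.dbpedia\.org/resource/([PQ][1-9]\d*)$")


def to_wikidata_entity_uri(uri: str) -> str:
    """Rewrite a mangled wikidata.dbpedia.org URI to the Wikidata entity URI.

    URIs that do not match the mangled pattern are returned unchanged.
    Assumption: properties (P...) are rewritten the same way as items (Q...).
    """
    match = _MANGLED.match(uri)
    if match is None:
        return uri
    return "http://www.wikidata.org/entity/" + match.group(1)


print(to_wikidata_entity_uri("http://wikidata.dbpedia.org/resource/Q5499200"))
```

Anything that is not a mangled Wikidata URI passes through untouched, so the function could be applied to every owl:sameAs object without filtering first.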

VladimirAlexiev commented 9 years ago

There are some Wikidata statements in DBpedia, e.g. see a describe. But they should be against the true Wikidata URL.

redaktor commented 9 years ago

Yep, just reported this via mail. Continuing here... Below you can find a .json with the external dependencies in the DBpedia Ontology. The null value means that the URI could not be resolved by a machine.

As @VladimirAlexiev said, http://wikidata.dbpedia.org/resource/{ID} is 404 / unknown host. But it should become the machine-readable representation http://dbpedia.org/ontology/Wikidata:{ID}

And I think http://xmlns.com/foaf/0.1/{ID} should become http://xmlns.com/foaf/spec/index.rdf#{ID}

And I wonder if http://purl.org/ontology/bibo/{ID} shouldn't be http://purl.org/ontology/bibo#{ID} But that is unclear to me - I'm coming from a .js/JSON world. A JSON reference pointer would mean {baseURI}#{ID} ... In RDF can this be {baseURI}/{ID} as well ?

The .json with the external dependencies in the DBpedia Ontology :

{   
    "http://wikidata.dbpedia.org/resource/P102": null,
    "http://wikidata.dbpedia.org/resource/P106": null,
    "http://wikidata.dbpedia.org/resource/P108": null,
    "http://wikidata.dbpedia.org/resource/P109": null,
    "http://wikidata.dbpedia.org/resource/P1215": null,
    "http://wikidata.dbpedia.org/resource/P131": null,
    "http://wikidata.dbpedia.org/resource/P140": null,
    "http://wikidata.dbpedia.org/resource/P149": null,
    "http://wikidata.dbpedia.org/resource/P155": null,
    "http://wikidata.dbpedia.org/resource/P156": null,
    "http://wikidata.dbpedia.org/resource/P157": null,
    "http://wikidata.dbpedia.org/resource/P161": null,
    "http://wikidata.dbpedia.org/resource/P166": null,
    "http://wikidata.dbpedia.org/resource/P17": null,
    "http://wikidata.dbpedia.org/resource/P172": null,
    "http://wikidata.dbpedia.org/resource/P175": null,
    "http://wikidata.dbpedia.org/resource/P19": null,
    "http://wikidata.dbpedia.org/resource/P20": null,
    "http://wikidata.dbpedia.org/resource/P21": null,
    "http://wikidata.dbpedia.org/resource/P227": null,
    "http://wikidata.dbpedia.org/resource/P229": null,
    "http://wikidata.dbpedia.org/resource/P230": null,
    "http://wikidata.dbpedia.org/resource/P238": null,
    "http://wikidata.dbpedia.org/resource/P239": null,
    "http://wikidata.dbpedia.org/resource/P263": null,
    "http://wikidata.dbpedia.org/resource/P264": null,
    "http://wikidata.dbpedia.org/resource/P27": null,
    "http://wikidata.dbpedia.org/resource/P31": null,
    "http://wikidata.dbpedia.org/resource/P345": null,
    "http://wikidata.dbpedia.org/resource/P364": null,
    "http://wikidata.dbpedia.org/resource/P41": null,
    "http://wikidata.dbpedia.org/resource/P473": null,
    "http://wikidata.dbpedia.org/resource/P50": null,
    "http://wikidata.dbpedia.org/resource/P509": null,
    "http://wikidata.dbpedia.org/resource/P513": null,
    "http://wikidata.dbpedia.org/resource/P54": null,
    "http://wikidata.dbpedia.org/resource/P569": null,
    "http://wikidata.dbpedia.org/resource/P57": null,
    "http://wikidata.dbpedia.org/resource/P570": null,
    "http://wikidata.dbpedia.org/resource/P575": null,
    "http://wikidata.dbpedia.org/resource/P61": null,
    "http://wikidata.dbpedia.org/resource/P638": null,
    "http://wikidata.dbpedia.org/resource/P69": null,
    "http://wikidata.dbpedia.org/resource/P70": null,
    "http://wikidata.dbpedia.org/resource/P71": null,
    "http://wikidata.dbpedia.org/resource/P74": null,
    "http://wikidata.dbpedia.org/resource/P75": null,
    "http://wikidata.dbpedia.org/resource/P77": null,
    "http://wikidata.dbpedia.org/resource/P84": null,
    "http://wikidata.dbpedia.org/resource/P85": null,
    "http://wikidata.dbpedia.org/resource/P91": null,
    "http://wikidata.dbpedia.org/resource/P94": null,
    "http://wikidata.dbpedia.org/resource/Q215627": null,
    "http://wikidata.dbpedia.org/resource/Q482994": null,
    "http://wikidata.dbpedia.org/resource/Q5": null,
    "http://wikidata.dbpedia.org/resource/Q532": null,

    "http://xmlns.com/foaf/0.1/Document": null,
    "http://xmlns.com/foaf/0.1/Image": null,
    "http://xmlns.com/foaf/0.1/Person": null,

    "http://purl.org/ontology/bibo/Article": null,
    "http://purl.org/ontology/bibo/Book": null,
    "http://purl.org/ontology/bibo/Note": null, 

    "http://purl.org/NET/cidoc-crm/core": ["E4_Period"],
    "http://schema.org": [
        "CollegeOrUniversity",
        "Festival",
        "Park",
        "TVEpisode",
        "MusicGroup",
        "Product",
        "Event",
        "Organization",
        "Country",
        "Airport",
        "Hospital",
        "RadioStation",
        "SeaBodyOfWater",
        "School",
        "Comment",
        "GovernmentOrganization",
        "StadiumOrArena",
        "Museum",
        "Movie",
        "SkiResort",
        "Canal",
        "Hotel",
        "Library",
        "MusicRecording",
        "MusicAlbum",
        "Sculpture",
        "CreativeWork",
        "Book",
        "Language",
        "Restaurant",
        "Person",
        "City",
        "ShoppingCenter",
        "Painting",
        "BodyOfWater",
        "LandmarksOrHistoricalBuildings",
        "EducationalOrganization",
        "SportsEvent",
        "SportsTeam",
        "Mountain",
        "WebPage",
        "Continent",
        "RiverBodyOfWater",
        "Place",
        "BankOrCreditUnion",
        "TelevisionStation",
        "LakeBodyOfWater",
        "AdministrativeArea",
        "illustrator",
        "relatedTo",
        "author",
        "bookFormat",
        "founders",
        "genre",
        "actors",
        "maps",
        "musicBy",
        "spouse",
        "awards",
        "children",
        "inLanguage",
        "branchOf",
        "byArtist",
        "publisher",
        "director",
        "nationality",
        "image",
        "containedIn",
        "producer",
        "duration",
        "endDate",
        "deathDate",
        "numberOfPages",
        "birthDate",
        "episodeNumber",
        "isbn",
        "numberOfEpisodes",
        "startDate"
    ],

    "http://www.ontologydesignpatterns.org/ont/d0.owl": ["Location", "CognitiveEntity", "Activity"],
    "http://www.ontologydesignpatterns.org/ont/dul/DUL.owl": [
        "DesignedArtifact",
        "BiologicalObject",
        "InformationEntity",
        "Description",
        "FunctionalSubstance",
        "Organism",
        "Concept",
        "NaturalPerson",
        "Situation",
        "Event",
        "SocialPerson",
        "PhysicalBody",
        "Configuration",
        "InformationObject",
        "Role",
        "Collective",
        "TimeInterval",
        "Quality",
        "UnitOfMeasure",
        "PlanExecution",
        "Agent",
        "Collection",
        "Entity",
        "ChemicalObject",
        "SpaceRegion",
        "hasPart",
        "hasParticipant",
        "coparticipatesWith",
        "hasLocation",
        "isLocationOf",
        "sameSettingAs",
        "specializes",
        "isParticipantIn",
        "isSettingFor",
        "isPartOf",
        "isClassifiedBy",
        "overlaps",
        "isMemberOf",
        "hasCommonBoundary",
        "hasQuality",
        "isDescribedBy",
        "unifies",
        "conceptualizes",
        "isExpressedBy",
        "nearTo",
        "follows",
        "hasMember",
        "hasSetting",
        "precedes",
        "hasRegion",
        "hasComponent",
        "hasRole",
        "hasConstituent",
        "isSpecializedBy",
        "associatedWith",
        "isAbout",
        "isRoleOf",
        "concretelyExpresses"
    ],
    "http://www.w3.org/2004/02/skos/core": ["OrderedCollection"]
}
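
For anyone who wants to work with a listing like this, the following Python sketch groups the null (unresolved) entries by host. It is only an illustration: the embedded JSON is a four-line excerpt of the listing above, and the function name `unresolved_by_host` is hypothetical.

```python
import json
from collections import defaultdict
from urllib.parse import urlsplit

# Small excerpt of the listing above, for illustration only.
excerpt = """
{
    "http://wikidata.dbpedia.org/resource/P102": null,
    "http://xmlns.com/foaf/0.1/Person": null,
    "http://purl.org/ontology/bibo/Book": null,
    "http://www.w3.org/2004/02/skos/core": ["OrderedCollection"]
}
"""


def unresolved_by_host(dependencies: dict) -> dict:
    """Group keys whose value is None (reported as unresolved) by host name."""
    groups = defaultdict(list)
    for uri, terms in dependencies.items():
        if terms is None:
            groups[urlsplit(uri).netloc].append(uri)
    return dict(groups)


unresolved = unresolved_by_host(json.loads(excerpt))
for host, uris in sorted(unresolved.items()):
    print(host, len(uris))
```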

jcsahnwaldt commented 9 years ago

@redaktor , the FOAF and Bibo URIs are correct. RDF is not JSON. Let's focus on the DBpedia Wikidata URIs.

Maybe some of the current DBpedia developers can explain the rationale of the http://wikidata.dbpedia.org/resource/{ID} URIs? Or if this has been discussed before, point to the discussion.

I don't have time to explain the details, but the pattern http://dbpedia.org/ontology/Wikidata:{ID} wouldn't be a good choice.

redaktor commented 9 years ago

@jcsahnwaldt What I wanted to say is that for example http://xmlns.com/foaf/0.1/Person basically does not exist. For me it redirects to the foaf specification.
While http://xmlns.com/foaf/spec/index.rdf#Person is the correct URI pointing to a machine readable resource.

mgns commented 9 years ago

http://xmlns.com/foaf/0.1/ is the official (commonly used and preferred) namespace of the FOAF vocabulary (see also http://lov.okfn.org/dataset/lov/vocabs/foaf), while http://xmlns.com/foaf/spec/index.rdf is the document describing this vocabulary. So these namespaces are perfectly fine.
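
On the earlier question of whether RDF allows {baseURI}/{ID} as well as {baseURI}#{ID}: both styles are in use, and the namespace simply ends in "/" or "#". A minimal Python sketch of the split, assuming a hypothetical helper `split_uri` (RDF libraries ship their own, more careful versions):

```python
from typing import Tuple


def split_uri(uri: str) -> Tuple[str, str]:
    """Split a URI into (namespace, local name).

    Hash namespaces are split at '#', slash namespaces at the last '/'.
    This is a simplification for illustration, not a full RDF implementation.
    """
    if "#" in uri:
        namespace, _, local = uri.partition("#")
        return namespace + "#", local
    namespace, _, local = uri.rpartition("/")
    return namespace + "/", local


# Slash namespace (FOAF) and hash namespace (SKOS).
print(split_uri("http://xmlns.com/foaf/0.1/Person"))
print(split_uri("http://www.w3.org/2004/02/skos/core#OrderedCollection"))
```

In both cases the term URI is just the namespace followed by the local name; whether the URI also dereferences to a machine-readable document is a separate matter.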

jimkont commented 9 years ago

Most language editions of the 2014 release exist only as dumps, and some of the owl:sameAs links to http://{lang}.dbpedia.org/resource/{title} do not yet dereference.

For the Wikidata-related resources there is no dump yet, but there will be one in the next release. There was some offline discussion on this before and during the last DBpedia meeting in Dublin. I agree with @jcsahnwaldt that the http://dbpedia.org/ontology/Wikidata:{ID} pattern is wrong.

VladimirAlexiev commented 9 years ago

@alismayilov, @jimkont :

Minting URLs in the dbpedia namespace based on wikidata id's doesn't sound right to me...

VladimirAlexiev commented 9 years ago

A bit related to #334

jimkont commented 9 years ago

fixed ontology wikidata URIs in https://github.com/dbpedia/extraction-framework/commit/398bd582b3d2233531c8f3cb94270208392b0910