nichtich / wikidata-taxonomy

command-line tool to extract taxonomies from Wikidata
https://www.npmjs.org/package/wikidata-taxonomy
MIT License

Sparql request failed. #15

Closed. KnowledgeGarden closed this issue 7 years ago.

KnowledgeGarden commented 7 years ago

I get only the message "Sparql request failed." after running this command line:

 node wdtaxonomy.js Q35120 --format json

When I ran it with --sparql, I got this query:

SELECT ?item ?itemLabel ?broader ?parents ?instances ?sites
WHERE {
    {
        SELECT ?item (count(distinct ?parent) as ?parents) {
            ?item wdt:P279* wd:Q35120
            OPTIONAL { ?item wdt:P279 ?parent }
        } GROUP BY ?item
    }
    {
        SELECT ?item (count(distinct ?element) as ?instances) {
            ?item wdt:P279* wd:Q35120
            OPTIONAL { ?element wdt:P31 ?item }
        } GROUP BY ?item
    }
    {
        SELECT ?item (count(distinct ?site) as ?sites) {
            ?item wdt:P279* wd:Q35120
            OPTIONAL { ?site schema:about ?item }
        } GROUP BY ?item
    }
    OPTIONAL { ?item wdt:P279 ?broader }
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en" .
    }

and then it stopped. I am not sure what I am doing wrong. The chosen root is "entity", which appears to be at least one important root (perhaps the root?) at Wikidata.

nichtich commented 7 years ago

The taxonomy starting from Q35120 is too large, so the SPARQL request will fail with a timeout. Try option -c to only get the first level:

$ wdtaxonomy.js -c Q35120
entity (Q35120) •37 ×4
├──space (Q107) •84 ×10
├──energy (Q11379) •152 ×4
├──matter (Q35758) •112 ×1
├──sponsor (Q152478) •38 ×1
├──being (Q203872) •52
├──Dasein (Q404130) •20
├──object (Q488383) •38 ×2
├──problem (Q621184) •62 ×5
├──individual (Q795052) •53
├──subject (Q830077) •29
├──disposition (Q1149305) •11
├──resource (Q1554231) •27 ×2
├──cause (Q2574811) •2 ×2
├──information source (Q3523102) •10
├──sensation set (Q3955369) •15 ×1
├──cellular component (Q5058355) •1 ×1
├──class (Q5127848) •2 ×1
├──part (Q15989253) •1 ×107 ↑
│  ╘══cellular component (Q5058355) •1 ×1 …
├──perception set (Q16354711)
├──phenomenon (Q16722960) •1 ×1
├──object (Q17553950) •4 ×4
├──group (Q18844919) •1 ×81
├──information request (Q23312670)
├──isolation (Q23498130)
├──spacetime volume (Q23956024) ×1
├──individual (Q23958946)
├──assumed entity (Q24199478) ×1
├──agent (Q24229398) ×1
├──Norse entity (Q24439885)
└──temporal entity (Q26907166)
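
For comparison with the `wdt:P279*` query shown above, here is a minimal sketch of a query that asks only for direct subclasses, which is why a first-level request stays small. The function name `directSubclassQuery` is made up for illustration, and this is not necessarily the query that wdtaxonomy generates for `-c`.

```javascript
// Sketch: build a SPARQL query for the direct subclasses of one item.
// Assumption: directSubclassQuery is a hypothetical helper, not part of
// wdtaxonomy. It uses wdt:P279 (one step) instead of wdt:P279* (any depth).
function directSubclassQuery (rootId, language = 'en') {
  if (!/^Q[1-9][0-9]*$/.test(rootId)) {
    throw new Error(`not a Wikidata item id: ${rootId}`)
  }
  if (!/^[a-z][a-z-]*$/.test(language)) {
    throw new Error(`unexpected language code: ${language}`)
  }
  return [
    'SELECT ?item ?itemLabel WHERE {',
    `    ?item wdt:P279 wd:${rootId} .`,
    '    SERVICE wikibase:label {',
    `        bd:serviceParam wikibase:language "${language}" .`,
    '    }',
    '}'
  ].join('\n')
}

// Print the query for the root used in this issue.
console.log(directSubclassQuery('Q35120'))
```

Because the pattern has no `*`, the endpoint only has to follow one `P279` step from the root instead of walking the whole subclass hierarchy.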

Unfortunately there is no option to limit the taxonomy to a given number of levels because this is complex to do in SPARQL (#16).
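
Until such an option exists, one possible workaround is to limit the depth on the client side by requesting direct subclasses one level at a time. The sketch below is only an illustration: `taxonomyToDepth` and the `fetchChildren` callback are hypothetical names, not part of wdtaxonomy, and the callback is assumed to resolve to the direct subclass ids of an item (for example by running a `wdt:P279` query against the endpoint).

```javascript
// Sketch: collect a taxonomy down to maxDepth levels, breadth first.
// Assumption: fetchChildren(id) is a hypothetical async callback that
// resolves to an array of direct subclass ids of the given item.
async function taxonomyToDepth (rootId, maxDepth, fetchChildren) {
  const tree = new Map([[rootId, []]]) // item id -> direct subclass ids
  let frontier = [rootId]
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next = []
    for (const id of frontier) {
      const children = await fetchChildren(id)
      tree.set(id, children)
      for (const child of children) {
        // An item can have several parents; expand each item only once.
        if (!tree.has(child)) {
          tree.set(child, [])
          next.push(child)
        }
      }
    }
    frontier = next
  }
  return tree
}

// Usage with a stub instead of a real SPARQL request. The first two
// entries mirror the listing above; Q999999999 is a made-up placeholder.
const stub = {
  Q35120: ['Q15989253', 'Q5058355'],
  Q15989253: ['Q5058355'],
  Q5058355: ['Q999999999']
}
taxonomyToDepth('Q35120', 1, async id => stub[id] || [])
  .then(tree => console.log([...tree.keys()]))
```

Items on the last requested level are listed with an empty child array because they are never expanded. This sends one request per item, so for broad levels it may be worth batching several items into one query.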