fnielsen closed this issue 4 years ago
For some topics like Parkinson's disease, the graph now stalls completely.
I tried to curb that by modifying the query so that it only exposes the most salient co-authorship connections, but the current version times out:
#defaultView:Graph
SELECT ?author1 ?author1Label ?rgb ?author2 ?author2Label
WITH {
  # Find works with the topic
  SELECT ?work WHERE {
    ?work wdt:P921 / (wdt:P31* / wdt:P279* | wdt:P361+ | wdt:P1269+) wd:Q11085 .
  }
} AS %works
WITH {
  # Limit the number of authors
  SELECT (COUNT(?work) AS ?count1) ?author1 WHERE {
    INCLUDE %works
    ?work wdt:P50 ?author1 .
  }
  GROUP BY ?author1
  ORDER BY DESC(?count1)
  LIMIT 10
} AS %authors1
WITH {
  # Limit the number of coauthors
  SELECT (COUNT(?work) AS ?count2) ?author2 WHERE {
    INCLUDE %works
    INCLUDE %authors1
    ?work wdt:P50 ?author1 , ?author2 .
    FILTER (?author1 != ?author2)
  }
  GROUP BY ?author2
  ORDER BY DESC(?count2)
  LIMIT 10
} AS %authors2
WHERE {
  INCLUDE %works
  INCLUDE %authors1
  INCLUDE %authors2
  # Color the node by the gender of author1
  OPTIONAL { ?author1 wdt:P21 ?gender1 . }
  BIND( IF(?gender1 = wd:Q6581097, "3182BD", "E6550D") AS ?rgb)
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,fr,de,ru,es,zh,ja".
  }
}
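A likely contributor to the timeout, judging only from the query text: the three named subqueries project disjoint variables (?work; ?count1 and ?author1; ?count2 and ?author2), so the three INCLUDEs in the final WHERE have nothing to join on and yield a cross product. The Python sketch below just counts rows for that case. The function name and the figure of 50,000 works are assumptions for illustration, not measurements from Wikidata.

```python
from itertools import product


def final_join_rows(n_works, n_authors1, n_authors2):
    """Rows produced when three result sets with no shared variables are joined."""
    return n_works * n_authors1 * n_authors2


# Tiny explicit cross product to confirm the formula.
works = ["w1", "w2", "w3"]
authors1 = ["a", "b"]
authors2 = ["c", "d"]
rows = list(product(works, authors1, authors2))
assert len(rows) == final_join_rows(3, 2, 2)

# Hypothetical busy topic: 50,000 works (assumed), LIMIT 10 on both subqueries.
print(final_join_rows(50_000, 10, 10))  # 5000000
```

Every one of those rows would then also pass through the OPTIONAL, the BIND and the label service, and every top author would be paired with every top coauthor whether or not they share a work.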
I slightly modified the above query (mainly dropping the INCLUDE %works line from the final WHERE clause), and it now seems to work for busy topics as well. I played with the LIMITs a bit and think the combination of 25 for author1 and 250 for author2 works fine.
So here is the modified version:
#defaultView:Graph
SELECT ?author1 ?author1Label ?rgb ?author2 ?author2Label
WITH {
  # Find works with the topic
  SELECT ?work WHERE {
    ?work wdt:P921 / (wdt:P31* / wdt:P279* | wdt:P361+ | wdt:P1269+) wd:Q11085 .
  }
} AS %works
WITH {
  # Limit the number of authors
  SELECT (COUNT(?work) AS ?count1) ?author1 WHERE {
    INCLUDE %works
    ?work wdt:P50 ?author1 .
  }
  GROUP BY ?author1
  ORDER BY DESC(?count1)
  LIMIT 25
} AS %authors1
WITH {
  # Limit the number of coauthors
  SELECT DISTINCT ?author2 ?author1 (COUNT(?work) AS ?count2) WHERE {
    INCLUDE %works
    INCLUDE %authors1
    ?work wdt:P50 ?author1 , ?author2 .
    FILTER (?author1 != ?author2)
  }
  GROUP BY ?author2 ?author1
  ORDER BY DESC(?count2)
  LIMIT 250
} AS %authors2
WHERE {
  INCLUDE %authors2
  # Color the node by the gender of author1
  OPTIONAL { ?author1 wdt:P21 ?gender1 . }
  BIND( IF(?gender1 = wd:Q6581097, "3182BD", "E6550D") AS ?rgb)
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,fr,de,ru,es,zh,ja".
  }
}
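To make the effect of the two LIMITs concrete, here is a rough Python sketch of the same selection logic over a toy in-memory list of works. It does not touch Wikidata; the function name `top_coauthor_pairs`, its parameters and the sample data are all made up for illustration, and ties at the cut-off are broken by insertion order here, which need not match what the query service does.

```python
from collections import Counter
from itertools import permutations


def top_coauthor_pairs(works, limit_authors=25, limit_pairs=250):
    """works: iterable of author lists, one list per work on the topic."""
    # Mirrors %authors1: count works per author, keep the most prolific ones.
    per_author = Counter(a for authors in works for a in sorted(set(authors)))
    top = {a for a, _ in per_author.most_common(limit_authors)}

    # Mirrors %authors2: count shared works per (author1, author2) pair,
    # where author1 is a top author and author2 is any different author.
    pairs = Counter()
    for authors in works:
        for a1, a2 in permutations(sorted(set(authors)), 2):
            if a1 in top:
                pairs[(a1, a2)] += 1
    return pairs.most_common(limit_pairs)


# Invented sample data: each inner list is the author list of one work.
toy_works = [
    ["Ann", "Bob", "Cy"],
    ["Ann", "Bob"],
    ["Ann", "Dee"],
    ["Eve", "Cy"],
]
print(top_coauthor_pairs(toy_works, limit_authors=1, limit_pairs=1))
```

Because each result row already carries both ?author1 and ?author2, the final WHERE only needs to INCLUDE %authors2, and the graph gets at most `limit_pairs` edges.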
The above commit had an encoding problem, which was fixed with https://github.com/fnielsen/scholia/commit/bb2ee219a259c3dc3ce9d7b4c3acf572b047e600#diff-04a29c3a9ff21a4d023f710f69057842 . Looks good to me now.
Deployed now. Closing.
The co-author graph in the topic aspect is slow; see, e.g., https://tools.wmflabs.org/scholia/topic/Q311383
This problem has also occurred in other aspects; see, e.g., #533