Open chris-gassner opened 1 year ago
Hey Christoph, thanks for the good issue. Yeah, I think you're right about a timeout for some pages. The API plugin loops and fetches links 500 at a time.
I looked into the Python example: the getIncoming method only returns pages that are Wikipedia articles (namespace 0), not other Wikipedia-internal pages. I think the Python discrepancy comes from User talk pages. Haha, people are using this template on their profile pages.
Please let me know if you can track down other cases with missing articles. The Europe case needs some thinking. Maybe we could try lowering the limit from 500. The code is here if anyone is interested. Cheers
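For anyone who wants to experiment with the batch size, here is a rough sketch of the kind of loop described above. It is not the plugin's actual code. It assumes the public MediaWiki `prop=linkshere` query with its `lhlimit`, `lhnamespace` and `lhcontinue` parameters. `buildParams`, `fetchIncoming` and the injected `fetchJson` helper are hypothetical names made up for this example.

```javascript
// Sketch only: pages through MediaWiki's prop=linkshere, one batch per request.
// `fetchJson` is a hypothetical injected helper: (url) => Promise of parsed JSON.
const API = 'https://en.wikipedia.org/w/api.php';

// Build the query string for one batch. `cont` carries the API's continue tokens.
function buildParams(title, { limit = 500, namespace = 0 } = {}, cont = {}) {
  return new URLSearchParams({
    action: 'query',
    format: 'json',
    formatversion: '2',
    prop: 'linkshere',
    titles: title,
    lhprop: 'pageid|title',
    lhnamespace: String(namespace), // 0 = articles only
    lhlimit: String(limit),
    ...cont,
  });
}

// Keep requesting batches until the response no longer has a `continue` block.
async function fetchIncoming(title, fetchJson, options = {}) {
  const pages = [];
  let cont = {};
  while (cont) {
    const data = await fetchJson(`${API}?${buildParams(title, options, cont)}`);
    const page = data.query && data.query.pages && data.query.pages[0];
    pages.push(...((page && page.linkshere) || []));
    cont = data.continue || null; // absent on the last batch
  }
  return pages;
}
```

Something like `fetchIncoming('Europe', (url) => fetch(url).then((r) => r.json()), { limit: 100 })` would then try the Europe case with smaller batches. Whether that actually avoids the failure is untested.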
I'm trying to fetch incoming links for pages, and some docs cause a crash when calling getIncoming().
Trying to fetch incoming links for the article 'Europe' fails with:
while getIncoming() works for 'Javascript' or 'Briefcase', for example. I'm guessing this is related to the number of incoming links. The Europe article has 86,136 direct links according to https://linkcount.toolforge.org/?project=en.wikipedia.org&page=Europe&namespaces= The article Python (programming language) has 9,467 links according to https://linkcount.toolforge.org/?project=en.wikipedia.org&page=Python%20(programming%20language)&namespaces= but I get back only 3,718 pageids when calling getIncoming().
Not a big deal, just thought I'd let you know though.
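To put the link counts in perspective, here is a quick back-of-the-envelope calculation of how many sequential requests a full crawl would take at the batch size of 500 mentioned in this thread. `requestsNeeded` is a hypothetical helper. The counts cover all namespaces, so these are upper bounds if only articles are fetched.

```javascript
// Number of sequential requests needed to page through `total` links.
function requestsNeeded(total, batchSize = 500) {
  return Math.ceil(total / batchSize);
}

console.log(requestsNeeded(86136));      // Europe → 173
console.log(requestsNeeded(9467));       // Python (programming language) → 19
console.log(requestsNeeded(86136, 100)); // Europe with a batch size of 100 → 862
```

So Europe needs roughly nine times as many round trips as the Python article, which fits the guess that the failure is tied to the number of incoming links.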