Mattschillinger / wikiteam

Automatically exported from code.google.com/p/wikiteam

dumpgenerator.py loop listing page titles with API: use apcontinue #100

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2882 federico   1   0 1196m 1.1g 2460 S    0  0.8   8:29.76 python dumpgenerator.py --api=http://cunnan.sca.org.au/api.php --xml --images

This wiki has been stuck at 1.1 GB of memory usage for days now (stalled, now 
interrupted). Whatever that memory is being used for, it appears to be all RES 
(resident) memory; why isn't it being swapped?

In this specific case it must be some loop, because the wiki only has 13k pages 
(http://cunnan.sca.org.au/wiki/Special:Statistics) but my screen was full of 
dots.

Original issue reported on code.google.com by nemow...@gmail.com on 6 Feb 2014 at 9:30

GoogleCodeExporter commented 8 years ago

Original comment by nemow...@gmail.com on 6 Feb 2014 at 9:30

GoogleCodeExporter commented 8 years ago
I'm getting a loop for http://wiki.ubuntu.org.cn/Special:Statistics too. It's 
not the first time, but I couldn't find a previous report.

Original comment by nemow...@gmail.com on 6 Feb 2014 at 9:35

GoogleCodeExporter commented 8 years ago
The list of wikis on which I identified this issue:

http://wiki.ubuntu.com.cn/api.php
http://cunnan.sca.org.au/api.php
http://www.isyp.org/wiki/api.php
http://telarapedia.gamepedia.com/api.php
http://s23.org/w/api.php
https://www.s23.org/w/api.php
http://www.cdwiki.de/api.php
http://www.haunted-memories.net/w/api.php

It doesn't depend on webserver, version or configuration, because it only 
happens on one Gamepedia wiki. On the last wiki in the list, it happens in 
namespace 100; the culprit, identified manually, is 
<http://www.haunted-memories.net/w/api.php?action=query&apfrom=PrP:_Fist_and_the_Eye_%28Forest_for_the_Tree%29&list=allpages&apnamespace=100&format=xml&aplimit=500>. 
As you can see, the continue value the API suggests is the same title we asked 
to start from, so the request repeats forever. It works if we use apcontinue 
instead (as we're supposed to, sure): 
<http://www.haunted-memories.net/w/api.php?action=query&apcontinue=PrP:_Fist_and_the_Eye_%28Forest_for_the_Tree%29&list=allpages&apnamespace=100&format=xml&aplimit=500>

https://www.mediawiki.org/wiki/API:Allpages doesn't say whether this is 
expected or not; we can only surrender and use apcontinue when the API tells 
us to. See 
https://www.mediawiki.org/wiki/API:Query#Continuing_queries and 
https://www.mediawiki.org/wiki/API:Legacy_Query_Continue
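
A rough sketch of what "use whatever continue parameter the API tells us to" 
could look like. This is not the actual dumpgenerator.py code: the function 
names and the `fetch` callable (something that takes a dict of query 
parameters and returns the parsed JSON response) are hypothetical, and the 
response shapes are assumed to follow the two continuation formats described 
on the pages linked above. The guard against a repeated continue value is an 
extra safety net, not something the API documentation prescribes.

```python
def next_continue_params(response):
    """Return the parameters the API asks us to send next, or None when done."""
    # Newer format: {"continue": {"apcontinue": "...", "continue": "-||"}}
    if "continue" in response:
        return dict(response["continue"])
    # Legacy format: {"query-continue": {"allpages": {"apcontinue": "..."}}}
    # Some wikis return "apfrom" here instead; we echo back whichever key we get.
    legacy = response.get("query-continue", {}).get("allpages")
    if legacy:
        return dict(legacy)
    return None


def list_titles(fetch, namespace, limit=500):
    """Collect all page titles in a namespace via list=allpages.

    `fetch` is a hypothetical callable: dict of params -> parsed JSON dict.
    """
    base = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": limit,
        "format": "json",
    }
    titles = []
    seen = set()  # continue values already used, to detect a loop
    cont = {}
    while True:
        response = fetch({**base, **cont})
        for page in response.get("query", {}).get("allpages", []):
            titles.append(page["title"])
        cont = next_continue_params(response)
        if cont is None:
            break
        key = tuple(sorted(cont.items()))
        if key in seen:
            # Same continue value twice: stop instead of looping forever.
            raise RuntimeError("API repeated a continue value: %r" % (cont,))
        seen.add(key)
    return titles
```

The point of the design is that the client never builds apfrom itself from 
the last title it saw; it only sends back the key and value the response 
contained, and it bails out if that value stops advancing.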

Original comment by nemow...@gmail.com on 16 Feb 2014 at 1:08