99moorem / wikiteam

Automatically exported from code.google.com/p/wikiteam

Sometimes pages in namespaces with strange aliases will not be found (wikiHow) #24

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
I'm downloading the English wikiHow and the script finds 0 revisions for every
talk page in the main namespace.
For instance, for http://www.wikihow.com/Discussion:Make-a-Robot-at-Home it
reports: «The page "Discussion:Make a Robot at Home" was missing in the wiki
(probably deleted)».
This is probably because they changed the name of the namespace in some weird
way, so they always use Discussion: instead of Talk: (although the latter
redirects to the former: http://www.wikihow.com/Talk:Make-a-Robot-at-Home ).
For some reason this doesn't work through the API, although
http://www.wikihow.com/index.php?title=Special:Export gives you the required
pages.
wikiHow is known to patch/hack MediaWiki in a lot of improper ways, but I don't
understand what could cause this problem.
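
To compare the two code paths side by side, the requests can be built by hand. This is only a sketch, not what the script does internally: the function names are made up for illustration, the `api.php` location is an assumption (wikiHow may serve it elsewhere or not at all), and some MediaWiki versions only honour the Special:Export `pages` parameter on a POST request.

```python
from urllib.parse import urlencode


def api_revisions_url(api_url, title):
    """Build a MediaWiki API query asking for the revisions of one title."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def special_export_url(index_url, title):
    """Build a Special:Export request for the current revision of one title."""
    params = {
        "title": "Special:Export",
        "pages": title,
        "curonly": "1",
    }
    return index_url + "?" + urlencode(params)


if __name__ == "__main__":
    title = "Discussion:Make a Robot at Home"
    # Assumed API location; adjust to wherever the wiki really exposes it.
    print(api_revisions_url("http://www.wikihow.com/api.php", title))
    print(special_export_url("http://www.wikihow.com/index.php", title))
```

Fetching both URLs for the same title should show whether the API reports the page as missing while Special:Export still returns its XML.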

Original issue reported on code.google.com by nemow...@gmail.com on 10 Jul 2011 at 8:30

GoogleCodeExporter commented 8 years ago
Actually, I've looked at the XML and it contains several talk pages, so the
problem doesn't depend on the namespace. Perhaps the error message is
misleading; it could even be a temporary problem.

Original comment by nemow...@gmail.com on 10 Jul 2011 at 8:54

GoogleCodeExporter commented 8 years ago
Strike that: it doesn't happen only with talk pages, but with all titles
containing a space. Perhaps because wikiHow uses a dash instead of an
underscore to replace spaces in URLs?

Original comment by nemow...@gmail.com on 10 Jul 2011 at 9:47

GoogleCodeExporter commented 8 years ago
I tried to archive wikiHow in the past (the Spanish one) and I saw some weird
behavior. Further research is required on this.

Original comment by emi...@gmail.com on 12 Jul 2011 at 4:15

GoogleCodeExporter commented 8 years ago

Original comment by nemow...@gmail.com on 29 Feb 2012 at 11:41

GoogleCodeExporter commented 8 years ago

Original comment by ad...@alphacorp.tk on 22 Jun 2012 at 10:02