Open GoogleCodeExporter opened 8 years ago
Yep. I have thought about this before, as a way to speed up the script and
decrease server requests (although it will need more kb/s, and it may not be
nice to the server).
But there is a chance of errors with the Special:Export revision limit. So
this is a dangerous option that I would like to test thoroughly before releasing.
Original comment by emi...@gmail.com
on 9 Jul 2011 at 7:01
I'm available for testing. :-)
This is probably best used with API only. In that case, I think it could even
decrease server load because it reduces requests: sometimes people arrive on
#wikimedia-tech to ask how to crawl WMF sites without problems and IIRC they're
even advised to download pages in batches of 50.
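
For illustration, a minimal sketch of what batching could look like on the client side. It only builds the request parameters and does not touch the network. The helper names `batches` and `build_query_params` are hypothetical, and the choice of `prop=revisions` with `rvprop=ids|timestamp` is an assumption made for the example, not what the script currently does. It relies on the MediaWiki API accepting several pipe-separated titles in one `titles` parameter.

```python
def batches(titles, size=50):
    """Yield consecutive slices of `titles`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def build_query_params(batch):
    """Build parameters for one API request covering a whole batch.

    The specific props requested here are an assumption for illustration.
    """
    return {
        "action": "query",
        "prop": "revisions",
        "rvprop": "ids|timestamp",
        "format": "json",
        # Multiple titles are joined with a pipe character.
        "titles": "|".join(batch),
    }


if __name__ == "__main__":
    page_titles = ["Page %d" % n for n in range(120)]
    for batch in batches(page_titles):
        params = build_query_params(batch)
        print(len(batch), params["titles"][:30])
```

With 120 titles this yields three requests instead of 120, which is where the reduction in server requests would come from.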
A bandwidth throttle may be useful but I don't know if it's possible to control
this aspect.
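
One possible way to get a throttle purely on the client side is to read the response in chunks and sleep whenever the transfer runs ahead of a target rate. The sketch below is an assumption about how that could be done, not something the script has today; the `Throttle` class and its parameters are hypothetical names. The clock and sleep functions are injectable so the logic can be checked without waiting.

```python
import time


class Throttle:
    """Cap average throughput by sleeping after chunks are consumed."""

    def __init__(self, max_bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(max_bytes_per_sec)
        self.clock = clock
        self.sleep = sleep
        self.start = None  # time of the first consume() call
        self.total = 0     # bytes recorded so far

    def consume(self, nbytes):
        """Record `nbytes` transferred and sleep if ahead of the allowed rate.

        Returns the delay (in seconds) that was requested.
        """
        now = self.clock()
        if self.start is None:
            self.start = now
        self.total += nbytes
        # Earliest moment at which `total` bytes are allowed to have passed.
        earliest = self.start + self.total / self.rate
        delay = max(0.0, earliest - now)
        if delay > 0:
            self.sleep(delay)
        return delay
```

In use, the download loop would call `throttle.consume(len(chunk))` after each chunk it reads, for example with `Throttle(100 * 1024)` for roughly 100 KiB/s. This limits the average rate only; short bursts up to one chunk are still possible.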
Original comment by nemow...@gmail.com
on 9 Jul 2011 at 7:29
This would also require Issue 8 (downloads are buffered completely to memory
before writing to disk) to be resolved, as this would definitely result in
larger downloads than before.
Original comment by griffin....@gmail.com
on 9 Jul 2011 at 8:04
I'm not sure; after all, memory consumption is almost negligible
right now.
If you put some limit on the number of revisions you shouldn't have any problem;
and again, I'm ready to test memory consumption and do crash-tests as well. :-p
Original comment by nemow...@gmail.com
on 9 Jul 2011 at 9:47
Original comment by nemow...@gmail.com
on 29 Feb 2012 at 11:27
Original issue reported on code.google.com by
nemow...@gmail.com
on 9 Jul 2011 at 3:50