infolab-csail / wikipedia-mirror

Makefiles that will download and set up a local Wikipedia instance.

See if wp-mirror can be made lighter #19

fakedrake closed 8 years ago

fakedrake commented 8 years ago

WP-MIRROR makes a mirror of a Wikimedia site. It is written in Common Lisp, which is refreshing; however:

The en wikipedia is the most demanding case. It should build in 1Ms (twelve days), occupy 3T of disk space, be served locally by a virtual host http://en.wikipedia.site/, and update automatically every month.

Dropping the images and revision history should make it much lighter (for comparison, wikipedia-mirror takes up ~250G).
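
For a sense of scale, here is a rough back-of-the-envelope check of the figures above. It is only a sketch: the helper names are made up for illustration, and it assumes decimal units (1T = 1000G), which the quoted text does not specify.

```python
# Rough sanity check of the quoted WP-MIRROR figures against wikipedia-mirror.
# Assumption: decimal units (1T = 1000G); the quote does not say which it uses.

SECONDS_PER_DAY = 24 * 60 * 60


def megaseconds_to_days(megaseconds: float) -> float:
    """Convert megaseconds (1 Ms = 1,000,000 s) to days."""
    return megaseconds * 1_000_000 / SECONDS_PER_DAY


def size_ratio(full_tb: float, light_gb: float) -> float:
    """How many times larger a size in T is than a size in G."""
    return full_tb * 1000 / light_gb


build_days = megaseconds_to_days(1)  # quoted build time: 1Ms
ratio = size_ratio(3, 250)           # quoted 3T vs. ~250G for wikipedia-mirror

print(f"build time: about {build_days:.1f} days")
print(f"disk usage: about {ratio:.0f}x wikipedia-mirror")
```

So the "twelve days" in the quote is a rounding of roughly 11.6 days, and the full mirror is on the order of twelve times the size of what wikipedia-mirror currently takes up.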

At the very least we could replace mwdumper with mediawiki-mwxml2sql here :(

NOTE: This is not the recommended method of importing XML dumps.

This probably renders wp-mirror unreliable.