speedydeletion / wikiproc

processing tools for the wiki

Extract articles from database dump #2

Open h4ck3rm1k3 opened 6 years ago

h4ck3rm1k3 commented 6 years ago

Given a list of articles and a URL to a compressed database dump, the tool will extract those articles from the stream without saving the entire file to disk.
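
A rough sketch of how this could work, not a finished implementation. It assumes the dump is a bz2-compressed MediaWiki XML export and that articles are matched by exact page title; `extract_articles` and `_local` are hypothetical names, not existing wikiproc functions.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def _local(tag):
    """Return the tag name without its XML namespace prefix."""
    return tag.rsplit("}", 1)[-1]


def extract_articles(compressed_stream, wanted_titles):
    """Yield (title, wikitext) for each wanted page in a bz2-compressed XML dump.

    compressed_stream is any binary file-like object with read(); it is
    decompressed and parsed incrementally, so the dump is never held in
    full in memory or written to disk.
    """
    wanted = set(wanted_titles)
    with bz2.BZ2File(compressed_stream) as xml_stream:
        root = None
        for event, elem in ET.iterparse(xml_stream, events=("start", "end")):
            if root is None:
                root = elem  # the first start event is the document root
            if event != "end" or _local(elem.tag) != "page":
                continue
            title, text = None, ""
            for child in elem.iter():
                name = _local(child.tag)
                if name == "title":
                    title = child.text
                elif name == "text":
                    text = child.text or ""
            if title in wanted:
                yield title, text
            root.clear()  # drop pages already processed to keep memory flat
```

In principle the response object returned by `urllib.request.urlopen(dump_url)` could be passed as `compressed_stream`, so the download and the extraction happen in one pass; `dump_url` is a placeholder here. Note that the whole stream still has to be read to reach pages near the end of the dump.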

h4ck3rm1k3 commented 6 years ago

There is an incremental backup of the English Wikipedia here: https://archive.org/details/incr-enwiki-20180412

h4ck3rm1k3 commented 6 years ago

Here is the latest dump: https://dumps.wikimedia.org/enwiki/20180401/