ipfs / distributed-wikipedia-mirror

Putting Wikipedia Snapshots on IPFS
https://github.com/ipfs/distributed-wikipedia-mirror#readme

Switch to zimdump from zim-tools #66

Closed: lidel closed 3 years ago

lidel commented 5 years ago

Motivation

Creating a new snapshot requires unpacking data from a ZIM archive.

The legacy process relied on a customized extract_zim tool, which unfortunately can no longer unpack the latest snapshots (https://github.com/ipfs/distributed-wikipedia-mirror/issues/60#issuecomment-546905445).

Good news: we now have the upstream openzim/zim-tools, which not only unpacks archives without a problem but also removes the maintenance burden from the mirror project.

Prerequisites

@kelson42 I took a look at the output of zimdump v1.0.5 and believe we could switch to this tool once the issues below are addressed:

Nice-to-haves

Not blockers, but things to consider in the future:

kelson42 commented 5 years ago

@lidel Thank you for explaining in detail the problems we face here. We will do our best to fix the 4 tickets you have reported before the end of next January.

I have created two tickets to get binaries for: