idio / json-wikipedia

Json Wikipedia contains code to convert the Wikipedia XML dump into a JSON dump. Questions? https://gitter.im/idio-opensource/Lobby

"Wikipedia:" namespace has wrong ids #18

Closed by keynmol 9 years ago

keynmol commented 9 years ago

This is not a bug in json-wikipedia, it's a bug in the XML data.

Pages in the "Wikipedia:" namespace are then treated as regular articles, and whatever happens after that is completely broken.

We should manually filter them out.
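A manual filter of this kind could look roughly like the following. This is a minimal Python sketch, not json-wikipedia's actual code: the `Page` record, its fields, and the helpers `is_project_page` and `filter_articles` are hypothetical names, and it assumes the affected pages can be recognized by a title starting with `Wikipedia:` even when the namespace id in the dump is wrong.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Assumption: project pages keep the "Wikipedia:" title prefix
# even when their namespace id in the XML dump is wrong.
PROJECT_PREFIX = "Wikipedia:"


@dataclass
class Page:
    """Hypothetical stand-in for a parsed dump page."""
    title: str
    ns: int  # namespace id as read from the XML dump (may be wrong)


def is_project_page(page: Page) -> bool:
    """True if the title marks the page as a "Wikipedia:" page."""
    return page.title.startswith(PROJECT_PREFIX)


def filter_articles(pages: Iterable[Page]) -> Iterator[Page]:
    """Yield only pages that are not "Wikipedia:" pages, ignoring ns."""
    for page in pages:
        if not is_project_page(page):
            yield page


# Both pages claim the article namespace (0); only one is a real article.
pages = [Page("Wikipedia:About", 0), Page("Anarchism", 0)]
kept = [p.title for p in filter_articles(pages)]
```

Filtering on the title rather than on `ns` is deliberate here, since the namespace id is exactly the field that cannot be trusted.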

keynmol commented 9 years ago

This seems to be an error in an old version of bliki, which is addressed in #11. I will close this after we confirm that #11 doesn't change the data too much.

keynmol commented 9 years ago

It's confirmed - #11 fixes it.