o0111 / ruralcafe

Automatically exported from code.google.com/p/ruralcafe

Wiki dump improvements #37

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
1. Searching for wiki pages that are in the dump does not seem to work.

2. For some pages, only a redirect page is found. For example, entering 
http://en.wikipedia.org/wiki/Germany finds "GerMany", which only redirects to 
"Germany", but "Germany" itself is not found.

3. The HTML for the dump pages is poor. There is no formatting, and pieces of 
code that clearly should not be visible are shown. MzReader (which builds upon 
BzReader) could be an option.
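
To illustrate item 2, here is a minimal Python sketch of following a redirect page to its target, so that a lookup landing on "GerMany" would end up at "Germany". It is not taken from RuralCafe or BzReader. The `pages` dictionary, the `resolve_redirect` name, and the `max_hops` limit are all hypothetical; the only assumption about the dump is that redirect pages use the standard MediaWiki `#REDIRECT [[Target]]` wikitext.

```python
import re

# Matches the start of a MediaWiki redirect page, e.g. "#REDIRECT [[Germany]]".
# The capture stops at "]", "|" or "#" so section anchors are not included.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*\[\[([^\]|#]+)", re.IGNORECASE)


def resolve_redirect(title, pages, max_hops=5):
    """Follow redirect pages until a non-redirect page is reached.

    `pages` is a hypothetical dict mapping page title -> wikitext; a real
    dump reader would look the title up in its index instead.
    Returns the final title, or None if a target is missing, the chain
    loops, or it is longer than `max_hops` redirects.
    """
    seen = set()
    for _ in range(max_hops + 1):
        if title in seen or title not in pages:
            return None
        seen.add(title)
        match = REDIRECT_RE.match(pages[title])
        if match is None:
            return title  # a regular page: this is the final target
        title = match.group(1).strip()
    return None


# Hypothetical miniature "dump" mirroring the example in item 2.
pages = {
    "GerMany": "#REDIRECT [[Germany]]",
    "Germany": "Germany is a country in Central Europe.",
}
final_title = resolve_redirect("GerMany", pages)
```

The hop limit and the `seen` set are there only to keep a malformed dump with circular redirects from looping forever.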

Original issue reported on code.google.com by satiaher...@gmx.de on 6 Jul 2013 at 10:17

GoogleCodeExporter commented 8 years ago
Items 1 and 2 are fixed.

4. We don't have content snippets for BzReader search results. Since BzReader 
itself does not support this, I don't know whether this can be changed.
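
If the page text for a search hit can be read from the dump, a snippet could in principle be cut out on the RuralCafe side rather than by BzReader. The Python sketch below only shows the idea; `make_snippet` and its `radius` parameter are hypothetical, and whether BzReader exposes the page text cheaply enough for this is an open question.

```python
def make_snippet(text, query, radius=40):
    """Return a short excerpt of `text` around the first match of `query`.

    Matching is case-insensitive. This assumes lowercasing does not change
    the string length, which holds for ASCII text.
    """
    pos = text.lower().find(query.lower())
    if pos < 0:
        # No match in the body: fall back to the start of the page.
        start, end = 0, min(len(text), 2 * radius)
    else:
        start = max(0, pos - radius)
        end = min(len(text), pos + len(query) + radius)
    # Mark the sides where text was cut off.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end].strip() + suffix


# Hypothetical page text, used only for illustration.
snippet = make_snippet("Germany is a country in Central Europe.", "country", radius=5)
```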

Original comment by satiaher...@gmx.de on 8 Jul 2013 at 8:59

GoogleCodeExporter commented 8 years ago
5. Offer to index multiple wiki dumps.

6. Offer to index wiki dumps via RC without requiring a separate BzReader installation.

Original comment by satiaher...@gmx.de on 8 Jul 2013 at 6:31