Open zimwikiuser opened 4 years ago
I second this issue; it is a very important one. I was going to create an issue about these faults in the Wikipedia ZIMs, but I found yours, so I am just seconding it here. I would like to emphasize that numbers 1 and 2 in your comment are especially important. For example, I once used a template to understand the branching of a major artery, which couldn't be found in any textbook. Similarly, a table of contents is a must-have: just imagine reading a textbook without one!
A note: make a separate issue for Wiktionary and keep this issue for Wikipedia only. It's better to omit the Wiktionary part from here; it just distracts from the major problem at hand.
@zimwikiuser If you want your ticket treated efficiently, better open one ticket per problem.
English Wikipedia dumps are stripped of too much data that is useful for accessing the article itself and related articles, robbing us end users of highly useful means to rapidly access information. The few GB the stripped metadata would add should not matter in a Maxi dump, nor should processing power.
Specifically:
1.) The article's Contents box, the index table for quickly jumping to any subsection, has been stripped for too long, since Kiwix 2016 or earlier.
2.) The article's templates are now also missing from the 2020 Wikipedia maxi dumps, for example https://en.wikipedia.org/wiki/Template:Black_holes. They were still present in the 2018-10 maxi dump, called novid back then. These category overview tables are highly useful for getting an overview of the available articles and the scope of a subject, and they often offered very surprising insights just by being present.
3.) The CSS currently served does not match the actual Wikipedia layout, which begins with 1.) the index and ends with 2.) the templates. In addition, the collapsible subsection display is odd and has display issues in desktop browsers: the headers swallow/cover the preceding text. This can be worked around, but is still suboptimal.
Please include at least the templates in future English Wikipedia dumps. Restoring an article's Contents would be equally welcome, but its absence can be worked around with a userscript. The same goes for the non-standard CSS.
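For anyone wanting to try the userscript workaround for the missing Contents, here is a rough sketch of the idea, not anything shipped by Kiwix. It assumes the section headings survive in the ZIM as ordinary `h2`/`h3`/`h4` elements, which may not hold for every build. The names `Heading`, `escapeHtml`, `buildTocHtml`, and the `toc-heading-` id prefix are made up for illustration.

```typescript
// Hypothetical sketch: rebuild a table of contents from an article's headings.
interface Heading {
  level: number; // 2 for <h2>, 3 for <h3>, ...
  text: string;  // visible heading text
  id: string;    // anchor id to link to
}

// Escape characters that are unsafe inside HTML text or attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a flat heading list as a single <ul>, indenting deeper levels.
function buildTocHtml(headings: Heading[]): string {
  if (headings.length === 0) return "";
  const base = Math.min(...headings.map((h) => h.level));
  const items = headings.map((h) => {
    const indent = (h.level - base) * 1.5;
    return `<li style="margin-left:${indent}em"><a href="#${escapeHtml(h.id)}">${escapeHtml(h.text)}</a></li>`;
  });
  return `<ul>${items.join("")}</ul>`;
}

// Browser-only part. The selector is an assumption about the page markup.
const doc: any = (globalThis as any).document;
if (doc) {
  const nodes: any[] = Array.from(doc.querySelectorAll("h2, h3, h4"));
  const headings: Heading[] = nodes.map((el: any, i: number) => {
    if (!el.id) el.id = `toc-heading-${i}`; // give anchorless headings an id
    return { level: Number(el.tagName[1]), text: el.textContent ?? "", id: el.id };
  });
  const nav = doc.createElement("nav");
  nav.innerHTML = buildTocHtml(headings);
  doc.body.prepend(nav);
}
```

The list is kept flat and indented with a margin rather than nested, which keeps the markup valid and the code short; a real userscript would probably want proper nesting and some styling.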
Wiktionary.
The Thesaurus: pages are missing. Considering that the English Wiktionary now reaches 7 GB, adding the 100 MB or so of thesaurus links should not matter and would improve the usability of Wiktionary dumps a lot.
Please consider omitting the translations instead, i.e., as an English Wiktionary user I have little use for Russian, Chinese, or really any non-English, non-Latin language definitions, which often even lack English translations.
Please also consider supplying a means to convert a Wikipedia dump to ZIM. The zimbuilder Docker images are subpar compared to freely choosable VM images runnable with anything from VMware to VirtualBox and so forth.
Man, I couldn't agree more! The lack of templates hurts so much! Thanks to your comment, I just found out that there actually used to be templates.
I always thought it had something to do with the way Wikipedia was built and that templates just weren't supported, but now that I know they actually are, why are there no templates in the latest English Wikipedia Maxi 2021-03 build?!
I mean, the point of this whole project is to make Wikipedia accessible to anyone around the world, from anywhere, without being connected to the internet (that idea alone deserves an Oscar), so we should make it accessible in the best way we can. Templates make it much easier to get a better understanding of what you're reading, they give you a perfect overview, as the guy explained above.
Anything that contributes to an objectively better use of Kiwix should be implemented. Templates are an important example. If you think about it, by implementing templates you are furthering the very purpose of this project: making it easier for people to use Kiwix in the best way possible. Children will also have an easier time using it.
To the scraper team: PLEASE consider including templates in the next English Wikipedia build!
@kelson42 @Popolechien what's the update on this request, please?
I don't remember seeing this issue (or anything related) flagged in 1.14, but @kelson42 should know more.