kiwix / libkiwix

Common code base for all Kiwix ports
https://download.kiwix.org/release/libkiwix/
GNU General Public License v3.0
117 stars 55 forks source link

Why are tags transformed? #899

Closed rgaudin closed 6 days ago

rgaudin commented 1 year ago

Sorry if the title is misleading, I am talking about the use of humanFriendlyTitle() for the display of tags in the book card.

https://github.com/kiwix/libkiwix/blob/bf80367b5a9f55758947cb481d61543b47bbeca6/static/skin/index.js#L69-L77

Screenshot_2023-02-22_at_10_17_20

I understand it may look better in some situation but given the spec allows ZIM creators anything but the ; character (that includes spaces and uppercase!), if the tags are not satisfying, we should fix them in the scrapers and not transform them on the reader.

Maybe that should raise a discussion about what Tags we should set on all the ZIM we create. We've been very lazy about this.

mgautierfr commented 1 year ago

I can't answer you why we are transforming the tag, there is a lot of people working on it but not me (https://github.com/kiwix/libkiwix/blame/bf80367b5a9f55758947cb481d61543b47bbeca6/static/skin/index.js#L69-L77)

But I can say you why we may want to transform the tag: We may want to store tag in a simple, standardized, searchable form (as lower case, no space, sentence of words separated by _) and in the same time displaying it nicely to the user.

I don't know if it is a good thing. We may (or not) indeed have better to state that tags must be stored as they will be displayed and then have something else to do search and filtering.

rgaudin commented 1 year ago

I can't answer you why we are transforming the tag, there is a lot of people working on it but not me (https://github.com/kiwix/libkiwix/blame/bf80367b5a9f55758947cb481d61543b47bbeca6/static/skin/index.js#L69-L77)

Really like that github blame view!

But I can say you why we may want to transform the tag: We may want to store tag in a simple, standardized, searchable form (as lower case, no space, sentence of words separated by _) and in the same time displaying it nicely to the user.

Kind of what we should do about Categories but categories and tags are semantically different. I don't know of any system that treats tags like this; because it doesn't make much sense. It's not possible to present it nice to the user in a generic way. Even in the context of US English, it's not. Let's say ZIM creator sets TED as tag because it's an acronym. Ted is clearly a bad improvement in this case.

I don't know if it is a good thing. We may (or not) indeed have better to state that tags must be stored as they will be displayed and then have something else to do search and filtering.

I don't think both are incompatible. If you have My little Pony 🦄 ✨ as tag, you should still filter by it and should get the list of ZIMs sharing this tag. That's how all tagging system works.


libkiwix's JS reader currently doesn't support Tags with spaces and there might even be room for injection there but I'll wait for answers on this before opening a ticket.

<span class="tag__link" aria-label="Filter by tag &quot;My little pony 🦄 ✨&quot;" title="Filter by tag &quot;My little pony 🦄 ✨&quot;" data-tag="my" little="" pony="" 🦄="" ✨="" click-listener="true">My little pony 🦄 ✨</span>
kelson42 commented 4 months ago

I requested to update the tags to make them more user friendly. Sounds good at a first place, but @rgaudin might be right. I will revert this piece of code.

kelson42 commented 1 month ago

It seems the bug has not been fixed properly, see https://dev.library.kiwix.org/#lang=eng&q=usa

veloman-yunkan commented 1 week ago

libkiwix's JS reader currently doesn't support Tags with spaces and there might even be room for injection there but I'll wait for answers on this before opening a ticket.

<span class="tag__link" aria-label="Filter by tag &quot;My little pony 🦄 ✨&quot;" title="Filter by tag &quot;My little pony 🦄 ✨&quot;" data-tag="my" little="" pony="" 🦄="" ✨="" click-listener="true">My little pony 🦄 ✨</span>

This has also been addressed in #1121