tongwentang / tongwentang-extension

MIT License
132 stars 15 forks source link

Update HTML lang attr on convert #60

Closed maxmellen closed 1 year ago

maxmellen commented 2 years ago

Hello 新同文堂 maintainers ✨

I've been focusing on learning traditional Chinese characters lately, and I've had a great time using the automatic conversion to view sites that don't support 正體中文 in traditional characters.

Issue 🪲

The only issue I've encountered is that the html tag's lang attribute doesn't get updated which in a lot of cases causes the browser to dislpay the characters following the 大陸新字型.

Proposed solution 🔧

This PR adds a function called in convertNode to update the html element's lang attribute according to the target LangType. (only if lang is already a valid Chinese language tag)

I'm happy to answer any question or move any code to a better location if needed 🙇‍♀️

Screenshots 🖼️

Here's an example on cn.vuejs.org:

Please pay attention to the rendering of 進's 走之底 (「⻌」比「⻎」) and 架's lower half (「木」比「朩」), as well as the centered punctuation.

lang="zh-Hans" 简体:

cn vuejs org zh-Hans 简体

lang="zh-Hans" 繁體 (差不多正體) without this PR:

cn vuejs org zh-Hans 繁體

lang="zh-Hant" 繁體 (確切正體) with this PR:

cn vuejs org zh-Hant 繁體
t7yang commented 2 years ago

hi @maxmellen

this is a nice improvement 👍 I didn't notice this before, probably because I forced all Chinese characters to be displayed in Taiwanese font.

I think we should change the lang attribute more carefully some website may rely on the attribute to do something also, you can force the extension to convert a non Chinese website and the extension changes the lang attribute to zh-* but that's not what the website is supposed to represent.

I think we can introduce a new option to let the user choose whether or not to update the lang attribute on convert. also, we need to ensure changing the lang attribute only if

  1. the website already has a lang attribute and the attribute is one of the valid zh-* value.
  2. the lang attribute is not set, and the browser recognizes the website as a Chinese website.

anyway, I really appreciate it. feel free to submit more code if you like to implement this feature. otherwise, I will open an issue to keep track it down and do it when I am free.

maxmellen commented 2 years ago

Hi @t7yang!

Thanks a lot for the feedback 🙇

Regarding this point:

  1. the website already has a lang attribute and the attribute is one of the valid zh-* value.

There is already a check to only update the lang attribute of pages with a valid Chinese language tag: see this file.

I've just pushed a commit to only update the lang attribute if the user has enabled this in their options (this is disabled by default).

Regarding this point:

  1. the lang attribute is not set, and the browser recognizes the website as a Chinese website.

I'm not sure how to do this. Maybe you can give me some guidance? I think it would also be fine to skip this for now and just not set a lang attribute if one is not already set, as is the current behavior of this branch.

Let me know what you think :)

maxmellen commented 2 years ago

Hello again!

Any update on this? Happy to update my code if my solution is inadequate. Would appreciate any feedback.

Thanks :)

t7yang commented 2 years ago

sorry, I was busy last week, I will review your code this week, maybe weekend.

maxmellen commented 1 year ago

Hi @t7yang,

Thanks a lot for your comments! 🙇‍♀️ I’ve updated my branch to address all of your points (see: 44841ddb49191f607bc8a5d2e9018a7e1f67a16a)

Let me know if there’s anything else! Hope this can be merged soon 🤞

maxmellen commented 1 year ago

Hi @t7yang 👋

I was wondering if there's any chance to see this released any time soon? Or are you waiting for anything in particular before releasing?

Thanks in advance!

t7yang commented 1 year ago

yeah, I am busy on other OSS projects but I just published the new version on all three browsers so, stay tuned

maxmellen commented 1 year ago

Just got the update on Firefox, great to see this published 💫 Thanks again 👏

t7yang commented 1 year ago

@maxmellen btw, I published it even I found a issue, but I know you really want this feature. I found the Wikipedia doesn't applied the correct font, even the extension changed the lang attribute. if you have time, you can take a look at this issue, PR or further discussion is always welcome :)

maxmellen commented 1 year ago

Hey @t7yang 👋

So I had some time to look into this, and on Wikipedia, the html elements's lang attribute is indeed updated properly, but as you can see in the screenshot below, basically every element that holds text such as the main content, but also the nav bars and menus have their own lang attributes which don't get updated currently.

I've opened a PR that updates the lang attribute on every element that already has a Chinese lang attribute → https://github.com/tongwentang/tongwentang-extension/pull/64

Let me know what you think :)

image