notofonts / thai

Noto Thai
SIL Open Font License 1.1

Thai Sample Text on fonts.google.com/noto is Incorrect #20

Closed by kagiura 2 years ago

kagiura commented 2 years ago

Title

Thai Sample Text on fonts.google.com/noto is Incorrect

Font

Noto Sans Thai, Noto Sans Thai Looped, and Noto Serif Thai

OS name and version / Application name and version

Inapplicable

Issue

Simply put, some of the sample text seems to have been entered improperly, which makes the Thai incorrect. As a native Thai speaker, I find the text really confusing, especially with the bug in notofonts/thai#8 making it even less readable. I'm not sure whether this is the right repository, but I've listed the mistakes below:

  • มนุษย์ -> มนษย
  • ทุกคนมีสิทธิที่จะได้ -> ทุกคนมีสิทธิที่จะได
  • ความสัมพันธ์ฉันมิตรระหว่างชาติต่างๆ -> ความสัมพันธ์ฉันมิตรระหว่างชาติต่าง ๆ (less of a typo and more of a punctuation mistake)
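For anyone who wants to check pairs like the first two without squinting at the rendering, here is a minimal sketch that compares a correct string against a sample string and names the combining marks that were dropped. The function name `dropped_marks` is made up for illustration, and it assumes the two strings differ only by missing characters.

```python
import unicodedata


def dropped_marks(correct: str, sample: str) -> list[str]:
    """Return Unicode names of combining marks present in `correct` but missing from `sample`."""
    remaining = list(sample)  # characters of the sample not yet matched
    missing = []
    for ch in correct:
        if ch in remaining:
            remaining.remove(ch)  # matched one occurrence
        elif unicodedata.category(ch) in ("Mn", "Mc"):
            # Unmatched combining mark (Thai vowel signs and tone marks are Mn)
            missing.append(unicodedata.name(ch))
    return missing


# The first two pairs from the list above: (correct, as shown in the sample)
pairs = [
    ("มนุษย์", "มนษย"),
    ("ทุกคนมีสิทธิที่จะได้", "ทุกคนมีสิทธิที่จะได"),
]
for correct, sample in pairs:
    print(sample, "is missing:", dropped_marks(correct, sample))
```

This only reports dropped marks; it does not try to detect reordered or substituted characters.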

Screenshot

image image

This one also shows notofonts/thai#8 very clearly, and it makes the font really hard to read, so I hope that gets fixed as well. It doesn't happen with the other Thai fonts, though.

simoncozens commented 2 years ago

I'll work on notofonts/thai#8.

As far as the sample text is concerned, the first two parts of the text you mention ("มนุษย์", "ทุกคนมีสิทธิที่จะได้") are both taken from the Unicode Standard's UDHR documents (https://unicode.org/udhr/d/udhr_tha.html). If those are incorrect, I would prefer to get them fixed there first.

I don't know where "ความสัมพันธ์ฉันมิตรระหว่างชาติต่างๆ" came from. @twardoch, any ideas?

r12a commented 2 years ago

> As far as the sample text is concerned, the first two parts of the text you mention ("มนุษย์", "ทุกคนมีสิทธิที่จะได้") are both taken from the Unicode Standard's UDHR documents (https://unicode.org/udhr/d/udhr_tha.html). If those are incorrect, I would prefer to get them fixed there first.

Note that the second phrase occurs multiple times in the UDHR page.

kagiura commented 2 years ago

> I'll work on notofonts/thai#8.
>
> As far as the sample text is concerned, the first two parts of the text you mention ("มนุษย์", "ทุกคนมีสิทธิที่จะได้") are both taken from the Unicode Standard's UDHR documents (https://unicode.org/udhr/d/udhr_tha.html). If those are incorrect, I would prefer to get them fixed there first.
>
> I don't know where "ความสัมพันธ์ฉันมิตรระหว่างชาติต่างๆ" came from. @twardoch, any ideas?

@simoncozens They are indeed from there, but the sample is somehow missing the tone marks/vowels. I know this can happen with improper input, e.g. sometimes copying Thai text doesn't copy the marks (not sure why it happens, but it does 🤔). The text on the site is definitely different from the text in the UDHR itself. Here are more comparisons:

image image image image

The last one is from the 16px example, although now that I look at it, I also have no idea where the last sentence is from. The one on Google Fonts goes like this:

โดยที่เป็นการจำเป็นที่จะส่งเสริมพัฒนาการแห่งความสัมพันธ์ฉันมิตรระหว่างชาติต่างๆ

Whereas the UDHR goes like this:

โดยที่ประชากรแห่งสหประชาชาติได้ยืนยันไว้ในกฎบัตรถึงความเชื่อมั่นในสิทธิมนุษยชนอันเป็นหลักมูล

I also can't seem to find the Google Fonts text anywhere in the UDHR. Maybe it got updated?

simoncozens commented 2 years ago

Oh, I'm sorry, I read your report backwards - so มนษย is wrong (and what is in the sample) and มนุษย์ is correct, and so on. (I was confused because มนุษย์ is also in the sample...) In that case, this is an easy fix.

simoncozens commented 2 years ago

I think the last one is a word breaking failure in the browser. The text in the sample files is fine.

kagiura commented 2 years ago

Oh, the last one is only a punctuation mistake: the ๆ symbol requires a space in front of it as well.
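If it helps, the spacing rule as I've described it could be applied mechanically. This is only a rough sketch: the helper name `space_before_mai_yamok` is hypothetical, and it assumes the convention above (a space before every ๆ, U+0E46) is the one wanted.

```python
import re

MAI_YAMOK = "\u0e46"  # ๆ, the Thai repetition mark


def space_before_mai_yamok(text: str) -> str:
    """Insert a space before each ๆ that directly follows a non-space character."""
    # The lookbehind only matches when some non-whitespace character precedes ๆ,
    # so already-spaced marks and a mark at the very start are left alone.
    return re.sub(rf"(?<=\S){MAI_YAMOK}", f" {MAI_YAMOK}", text)


print(space_before_mai_yamok("ความสัมพันธ์ฉันมิตรระหว่างชาติต่างๆ"))
```

It deliberately leaves text that already has the space untouched, so running it twice gives the same result.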

Also, I found where the text seems to come from. The UDHR apparently has two Thai versions, and the site uses the (presumably) newer one, Thai-2.

https://www.unicode.org/udhr/d/udhr_tha2.html

The source also seems to have this mistake, so I'll have to look at fixing that as well. It's only a minor punctuation issue (a missing space), so I think this can be fixed on Google Fonts for now, and there's no rush. Thank you again!