webcompat / web-bugs

A place to report bugs on websites.
https://webcompat.com
Mozilla Public License 2.0

treasuregreen.com - see bug description #121737

Open ksy36 opened 1 year ago

ksy36 commented 1 year ago

URL: https://treasuregreen.com/

Browser / Version: Firefox 114.0
Operating System: Mac OS X 10.15
Tested Another Browser: Yes Chrome

Problem type: Something else
Description: (OBJ) characters instead of spaces
Steps to Reproduce: Scroll down to the bottom menu and observe the text in the About section. An (OBJ) character is displayed instead of a space.

妙品馨茶莊 溫哥華第一家 1981 年

Browser Configuration
  • gfx.webrender.all: true
  • gfx.webrender.blob-images: true
  • gfx.webrender.enabled: false
  • image.mem.shared: true
  • buildID: 20230501093846
  • channel: nightly
  • hasTouchScreen: false
  • mixed active content blocked: false
  • mixed passive content blocked: false
  • tracking content blocked: false

From webcompat.com with ❤️

ksy36 commented 1 year ago

Affected area:

(Screenshot attached: Screen Shot 2023-05-02 at 2 13 33 PM)

wisniewskit commented 1 year ago

The [obj] character is being rendered in Arial by Firefox, and in Menlo by Chrome/Safari. So this is probably just Firefox picking a different fallback font which doesn't have the character defined.

wisniewskit commented 1 year ago

@jfkthame, is this something that rings a bell for you?

jfkthame commented 1 year ago

Not specifically, but clearly what's happening is that the site has a spurious U+FFFC "OBJECT REPLACEMENT CHARACTER" in the content there. Probably residue from some kind of word-processing software that was being used to author the text (or maybe the tool a translator was using).
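
For anyone wanting to confirm the diagnosis, here is a minimal sketch of how one might scan a piece of text for U+FFFC. The `find_spurious` helper and the sample string are hypothetical; the actual positions of the character in the site's markup are not known from this thread.

```python
import unicodedata

OBJECT_REPLACEMENT = "\ufffc"  # U+FFFC OBJECT REPLACEMENT CHARACTER


def find_spurious(text: str, suspects: str = OBJECT_REPLACEMENT) -> list[tuple[int, str, str]]:
    """Return (index, code point, Unicode name) for each suspect character in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ch in suspects
    ]


# Hypothetical sample: U+FFFC sitting where a space was presumably intended.
sample = "1981\ufffc年"
print(find_spurious(sample))
```

Running the same helper over text copied from the page's About section would show whether, and where, the character is present.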

That character isn't really meaningful in the context of web content like this -- that's not how "embedded objects" on a web page are represented. But it's there in the content, and so it gets rendered. What it looks like will depend on what glyph happens to be in the font that gets used -- which will likely be a fallback of some kind, as most fonts don't support this character code at all. (Why should they? It's not intended to be rendered, generally.) In Menlo it has a blank glyph; in Arial it has an [OBJ] graphic.
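
To illustrate why the same content can look different per browser, here is a toy model of font fallback. It is only a sketch: the `GLYPHS` table, the `SiteFont` name, and both fallback orders are assumptions made up for illustration. Only the Menlo and Arial glyph behaviour comes from the description above; real browsers use far more involved fallback logic.

```python
# Toy coverage table. Hypothetical data that mirrors what this thread reports:
# Menlo maps U+FFFC to a blank glyph, Arial maps it to an [OBJ] graphic,
# and the (made-up) site font does not cover the character at all.
GLYPHS: dict[str, dict[str, str]] = {
    "SiteFont": {},
    "Menlo": {"\ufffc": "<blank>"},
    "Arial": {"\ufffc": "[OBJ]"},
}


def render(ch: str, fallback_chain: list[str]) -> tuple[str, str]:
    """Return (font, glyph) from the first font in the chain that covers ch."""
    for font in fallback_chain:
        glyph = GLYPHS.get(font, {}).get(ch)
        if glyph is not None:
            return font, glyph
    return "<none>", "<missing glyph>"


# The chain orders below are assumptions, not any browser's real fallback list.
print(render("\ufffc", ["SiteFont", "Arial", "Menlo"]))
print(render("\ufffc", ["SiteFont", "Menlo", "Arial"]))
```

The point of the model is that the content is identical in both calls; only the order in which fonts are consulted changes what the reader sees.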

So, not a browser bug -- it's a content error on the site. That character isn't a space, and it's rather misleading for Menlo to render it as one.

wisniewskit commented 1 year ago

@jfkthame I see, thanks! Might it still be better for Firefox to select Menlo as well, for webcompat's sake? Or do you think selecting Arial instead is the better thing to do here?

jfkthame commented 1 year ago

No, I don't think there's any reason to make a change because of this. We might (indeed, we do) make adjustments to fallback behavior from time to time, aiming to improve the behavior for "real" content, but this is not a case that justifies trying to customize it.

This is similar to the case of other "random" control characters in content, which sometimes appear as an artifact of poor text preparation workflows or transcoding errors. The CSS WG has agreed in past discussions that such things should be made visible, because hiding them does users a disservice -- it encourages the propagation of content that doesn't actually work as intended.
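
The "make it visible" idea can be sketched roughly as follows. This is an illustration of the principle only, not what the CSS specifications or any browser actually implement; the `reveal` helper and its bracketed output format are hypothetical. U+FFFC is checked explicitly because its Unicode general category is So (other symbol), not Cc or Cf.

```python
import unicodedata


def reveal(text: str) -> str:
    """Replace control/format characters and U+FFFC with a visible code point tag."""
    out = []
    for ch in text:
        is_control = unicodedata.category(ch) in {"Cc", "Cf"} and ch not in "\n\t\r"
        if ch == "\ufffc" or is_control:
            out.append(f"[U+{ord(ch):04X}]")  # visible stand-in for the hidden character
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical sample with one control character and one U+FFFC.
print(reveal("a\x01b\ufffcc"))
```

With the characters exposed like this, an author proofreading the page would notice the problem regardless of which fallback font happens to be used.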

E.g. searching for 妙品馨茶莊 溫哥華第一家 1981 年 on the treasuregreen site, which is what Chrome makes it look like the text says, or for 妙品馨茶莊 溫哥華第一家 1981 年, which is what Safari makes it look like, will fail to find anything because of the spurious control characters.
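
The search failure is easy to reproduce in miniature. In this sketch the `page_text` string is hypothetical: it assumes U+FFFC sits where the blanks appear, which has not been checked against the site's markup. Replacing the character with a space is likewise just one plausible site-side fix.

```python
# Hypothetical page content: U+FFFC where a reader sees (or expects) a blank.
page_text = "第一家\ufffc1981\ufffc年"

# What a user would type after reading the text in a browser that shows blanks.
query = "第一家 1981 年"

found_before = query in page_text

# One possible site-side cleanup: swap the stray character for a real space.
cleaned = page_text.replace("\ufffc", " ")
found_after = query in cleaned

print(found_before, found_after)
```

The query only matches once the stray characters are replaced, which is the practical cost of content that merely looks right.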

Unfortunately, implementation of making-spurious-chars-visible is inconsistent/incomplete, so things like this still happen, where content "looks right" in a browser where the garbage happens to be invisible, and "looks broken" in a browser where it's visible.

wisniewskit commented 1 year ago

Ok, that's very useful context, thanks again. Let's see if we can make a sitepatch to prefer Menlo as a fallback font, and contact the site.