mpv-player / mpv

🎥 Command line video player
https://mpv.io

Rendering fails for characters with multiple code points. #13322

Closed ahaoboy closed 5 months ago

ahaoboy commented 9 months ago

Important Information

[osd/libass] fontselect: failed to find any fallback with glyph 0xF0411 for font: (FiraCode Nerd Font Reg, 400, 0)

mpv v0.37.0-94-g3250f6e4 Copyright © 2000-2023 mpv/MPlayer/mplayer2 projects
 built on Dec 14 2023 00:31:21
libplacebo version: v6.338.0-62-g52314e0-dirty
FFmpeg version: N-112969-g5475f665f
FFmpeg library versions:
   libavutil       58.33.100
   libavcodec      60.35.100
   libavformat     60.18.100
   libswscale      7.6.100
   libavfilter     9.14.100
   libswresample   4.13.100

Reproduction steps

var ovl = mp.create_osd_overlay("ass-events")
var Random = ""
var PlaylistPlay = "󰐑"

print(Random.length)  // 1
print(PlaylistPlay.length) // 2

ovl.data = "{\\fnFiraCode Nerd Font Reg}" + Random + PlaylistPlay
ovl.hidden = false
ovl.update()

image

Using https://opentype-js.vercel.app/docs/glyph-inspector.html, I confirmed that both icon characters are present in the font. The Random char has a length of 1 and PlaylistPlay has a length of 2. Random renders correctly, but PlaylistPlay fails to render. I'm not sure whether this is an issue with mpv or with libass. image

Place the font files into the "fonts" directory and create a file named "test.js" in the "scripts" directory with the content from the code above. Run the player.

image

Expected behavior

Both chars render correctly.

Actual behavior

Only Random rendered correctly.

Sample files

FiraCodeNerdFont-Regular.zip log.txt

avih commented 9 months ago

Please post a log file like the issue template requires.

[osd/libass] fontselect: failed to find any fallback with glyph 0xF0411 for font: (FiraCode Nerd Font Reg, 400, 0)

The codepoint U+F0411 is correct for "󰐑", so libass got the value correctly, so it's not an issue on the JS side, and probably also not at the mpv side.

No idea for now why libass doesn't find that glyph. Maybe it didn't pick up the font file correctly, maybe it's because it's in the private glyphs area, or some other reason.

To narrow it down, you can:

Try that also in lua just to rule out JS.

Try with a more common "length 2" icon, like "😀" which is more likely to be found in other fonts (length 2 is because JS strings are stored as UTF-16, and codepoints above U+FFFF are composed of two UTF-16 values) to rule out a specific problematic glyph.
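The "length 2" remark above can be checked by hand. The following is a minimal sketch of the standard UTF-16 surrogate-pair calculation, assuming a spec-conforming engine such as Node.js (mujs inside mpv may report a different length). `utf16Units` and `hex` are hypothetical helper names used only for illustration.

```javascript
// Split a Unicode code point into its UTF-16 code units.
// Code points above U+FFFF need two units (a surrogate pair).
function utf16Units(codePoint) {
  if (codePoint <= 0xFFFF) return [codePoint];
  const v = codePoint - 0x10000;              // 20-bit value
  return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)];
}

const hex = (units) => units.map((u) => u.toString(16).toUpperCase());

console.log(hex(utf16Units(0xF0411)));  // [ 'DB81', 'DC11' ]
console.log(hex(utf16Units(0x1F600)));  // [ 'D83D', 'DE00' ]

// In a spec-conforming engine, .length counts UTF-16 code units,
// while codePointAt(0) still returns the whole code point.
const s = String.fromCodePoint(0xF0411);
console.log(s.length);                        // 2
console.log(s.codePointAt(0).toString(16));   // f0411
```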

avih commented 9 months ago

The Random char with a length of 1 and PlaylistPlay with a length of 2

Are you 100% sure that this file printed length 1 and then length 2?

When I copied your sample script to a UTF-8 JS file and ran it in mpv, the length was 1 for both glyphs. (There's a hack in mujs concerning UTF-8 source and non-BMP string length: it should be 2, but it's actually 1 in mujs because it stores codepoints and not UTF-16. However, the string should still be printed and sent to libass correctly.)

Try with a more common "length 2" icon, like "😀"

I added a variable with this smiley icon (U+1F600, which is "length 2" in other JS engines, but 1 in mujs), printed its length, and added it to the overlay. It prints "1" three times (for the lengths), and then shows two "not-found-glyph" boxes (because I didn't set up the nerd font), followed by the smiley, which renders just fine.

So currently my guess is that either the font file was not picked correctly, or the glyph in that font file is in color - which libass doesn't support currently.

Please also try your example in lua.

I'm attaching my source test file, feel free to report what it prints and what it shows on screen: mpv-non-bmp.zip

ahaoboy commented 9 months ago

The execution results of Lua and JavaScript are the same; 😀 can be rendered correctly.

local ovl = mp.create_osd_overlay("ass-events")
local Random = "😀"
local PlaylistPlay = "󰐑"

ovl.data = "{\\fnFiraCode Nerd Font Reg}" .. Random .. PlaylistPlay
ovl.hidden = false
ovl:update()

image

ahaoboy commented 9 months ago

Are you 100% sure that this file printed length 1 and then length 2?

It seems to be an issue with mujs's UTF-8 handling. The character length is obtained through node. I have tried your code and other double-byte characters, and in both mujs and mpv, the length is 1. However, libass retrieves the correct Unicode encoding, but it doesn't render correctly. This is a bit strange.

avih commented 9 months ago

It seems to be an issue with mujs's UTF-8 handling.

No, it's an issue with your bug report.

If you report that you ran this file (in mpv, because it doesn't run elsewhere due to the mpv calls in it) and it prints 1 and then 2, but in practice it prints 1 and then another 1, then your report is wrong.

Anyway, I explained why it's 1 and not 2 in mujs/mpv, but also noted that it shouldn't be an issue. Indeed, the fact that it's the same in lua, and that it does work (in JS) with another "length 2" icon (the smiley), means that it's not a JS issue and also not a "length 2 icon" issue.

So my previous guess remains:

So currently my guess is that either the font file was not picked correctly, or the glyph in that font file is in color - which libass doesn't support currently.

There might be other reasons why libass is not able to find that glyph. The full log file might help.

ahaoboy commented 9 months ago

😀 is not in the Fira font; instead, it is rendered using the fallback font.

[   0.114][v][osd/libass] fontselect: (sans-serif, 400, 0) -> ArialMT, 0, ArialMT
[   0.114][v][osd/libass] fontselect: (FiraCode Nerd Font Reg, 400, 0) -> FiraCodeNF-Reg, 0, FiraCodeNF-Reg
[   0.114][v][osd/libass] Glyph 0x1F600 not found, selecting one more font for (FiraCode Nerd Font Reg, 400, 0)
[   0.115][v][osd/libass] fontselect: (FiraCode Nerd Font Reg, 400, 0) -> SegoeUIEmoji, 0, SegoeUIEmoji
[   0.118][i][osd/libass] Glyph 0x1F600 not found, broken font? Trying all charmaps

ahaoboy commented 9 months ago

There might be other reasons why libass is not able to find that glyph. The full log file might help.

Characters should be included in the font file; otherwise, the online website may not be able to search for them, and it could also be a font picking problem.

var ovl = mp.create_osd_overlay("ass-events")
var c = "󰐑"
ovl.data = "{\\fnFiraCode Nerd Font Reg}" + c
ovl.hidden = false
ovl.update()

log.txt

avih commented 9 months ago

@astiob any thoughts on this?

I think the font file does get loaded, according to the log:

[   0.006][v][osd/libass] Loading font file 'C:/app/mpv-test/portable_config/fonts\FiraCodeNerdFont-Regular.ttf'
[   0.007][v][osd/libass] Using font provider directwrite (with GDI)
[   0.007][v][osd/libass] Done.

But then

[   0.110][v][osd/libass] fontselect: (FiraCode Nerd Font Reg, 400, 0) -> FiraCodeNF-Reg, 0, FiraCodeNF-Reg
[   0.110][v][osd/libass] Glyph 0xF0411 not found, selecting one more font for (FiraCode Nerd Font Reg, 400, 0)
[   0.111][i][osd/libass] fontselect: failed to find any fallback with glyph 0xF0411 for font: (FiraCode Nerd Font Reg, 400, 0)

And OP claims that U+F0411 really exists at this file...

avih commented 9 months ago

😀 is not in the Fira font

That's the most common smiley codepoint. Are you 100% sure this glyph (U+1F600) is not at this font file?

Or are you saying that in the mpv log it's not found at the font?

Can you please check "😀" using the same method by which you determined that the other glyph (which is missing in mpv) does exist in this font?

ahaoboy commented 9 months ago

Are you 100% sure this glyph (U+1F600) is not at this font file?

FiraCodeNerdFont-Regular.zip I can only confirm that 😀 is not in Fira, and the OpenType tool can correctly locate both of these glyphs. image

[...font.glyphs].filter(i=>i.unicode  === "😀".codePointAt(0))[0]
[...font.glyphs].filter(i=>i.unicode  === "󰐑".codePointAt(0))[0]
[...font.glyphs].filter(i=>i.unicode  === "".codePointAt(0))[0]

VsCode is good! image

astiob commented 9 months ago

Intro about terminology: “characters with multiple code points” doesn’t make sense. What is here being called a “character” is a code point, that is, a single member of the Unicode character set: U+F0411. I suspect that what’s meant is “code points (characters) that, when encoded in UTF-16, use up multiple code units”. But this is very verbose, and seeing as UTF-16 isn’t even used here at all, the more common and appropriate term for this is “supplementary-plane code points (characters)”. (You’ll note that codePointAt(0) returns the whole, single code point.)

(Side note: I don’t know anything about mujs or mpv’s JavaScript support, but if it reports a length of 1, it violates the ECMAScript spec and effectively fails to be JavaScript, which I can’t in good faith imagine being desirable. This very report illustrates this.)

Anyway: I suspect that libass wrongly selects the base-plane-only cmap in the font and that this has the same cause as https://github.com/libass/libass/issues/634. I might have a quick fix for this.
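To illustrate what "selects the base-plane-only cmap" can mean, here is a minimal sketch of ranking a font's Unicode `cmap` subtables so that a full-repertoire one wins over a BMP-only one. This is not libass code, and `pickUnicodeCmap` is a hypothetical helper. The platform/encoding ID pairs come from the OpenType `cmap` specification; the subtable list in the example is an assumption, not read from the actual FiraCode Nerd Font file.

```javascript
// OpenType cmap (platformID/encodingID) pairs:
//   3/10      Windows, Unicode full repertoire
//   0/4, 0/6  Unicode platform, full repertoire
//   3/1       Windows, Unicode BMP only
//   0/3       Unicode platform, BMP only
const FULL_UNICODE = new Set(["3/10", "0/4", "0/6"]);
const BMP_ONLY = new Set(["3/1", "0/3"]);

// Higher rank is better; 0 means "not a Unicode subtable we recognise".
function rank(sub) {
  const key = `${sub.platformID}/${sub.encodingID}`;
  if (FULL_UNICODE.has(key)) return 2;
  if (BMP_ONLY.has(key)) return 1;
  return 0;
}

// Pick the best Unicode subtable, or null if there is none.
function pickUnicodeCmap(subtables) {
  let best = null;
  for (const sub of subtables) {
    if (rank(sub) > 0 && (best === null || rank(sub) > rank(best))) best = sub;
  }
  return best;
}

// Hypothetical font that lists its BMP-only subtables first.
const subtables = [
  { platformID: 0, encodingID: 3 },
  { platformID: 3, encodingID: 1 },
  { platformID: 3, encodingID: 10 },
];
console.log(pickUnicodeCmap(subtables)); // { platformID: 3, encodingID: 10 }
```

A selector that simply stopped at the first Unicode subtable would pick 0/3 here, and a lookup of U+F0411 through it could not succeed because that subtable only maps the BMP.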

avih commented 9 months ago

(Side note: I don’t know anything about mujs or mpv’s JavaScript support, but if it reports a length of 1, it violates the ECMAScript spec and effectively fails to be JavaScript

Yes, this is the hack I mentioned earlier. There's some dichotomy with mujs regarding how a string is stored and how its length is counted. For the most part it behaves as expected, but indeed, if you try to address individual UTF-16 values for a non-BMP codepoint, things could get a bit wonky.

However, that's beside the main issue here and doesn't affect it, other than reporting an unexpected length of 1 where standard ES5 should have reported 2.

Anyway: I suspect that libass wrongly selects the base-plane-only cmap in the font and that this has the same cause as libass/libass#634. I might have a quick fix for this.

Not sure I follow, but notice that other non-BMP codepoints, such as U+1F600 😀, do work correctly.

astiob commented 9 months ago

notice that other non-BMP codepoints, such as U+1F600 😀, do work correctly.

I don’t see this: U+1F600 doesn’t exist in this font.

avih commented 9 months ago

notice that other non-BMP codepoints, such as U+1F600 😀, do work correctly.

I don’t see this: U+1F600 doesn’t exist in this font.

I meant that non-BMP codepoints pass correctly from JS to libass, so it's not an issue of libass not being able to handle them (to be honest, I couldn't quite figure out what that linked libass issue means, but just in case it meant that libass has issues with non-BMP codepoints, I demonstrated that other codepoints do work).

astiob commented 9 months ago

Well, JS passes UTF-8 to libass. You’ve demonstrated that this UTF-8 encoding is correct (which is, of course, good to know), but not that any other non-BMP codepoint is displayed in the same font, hence my confusion. If the issue is what I think it is, then no supplementary-plane content is currently rendered in this font.

@ahaoboy If you can recompile your mpv’s libass, please try the code in libass/libass#729 (astiob/libass@full-unicode-cmap).
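For reference, the UTF-8 byte sequence that reaches libass for these code points can be worked out by hand. The following is a minimal sketch of the standard UTF-8 encoding rules; `utf8Bytes` and `toHex` are hypothetical helper names, and the sketch does not reject surrogate code points, which a strict encoder would.

```javascript
// Encode a single Unicode code point as UTF-8 bytes.
function utf8Bytes(cp) {
  if (cp < 0x80) return [cp];
  if (cp < 0x800) return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)];
  if (cp < 0x10000) {
    return [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)];
  }
  // Supplementary planes (U+10000..U+10FFFF) take four bytes.
  return [
    0xF0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3F),
    0x80 | ((cp >> 6) & 0x3F),
    0x80 | (cp & 0x3F),
  ];
}

const toHex = (bytes) =>
  bytes.map((b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");

console.log(toHex(utf8Bytes(0xF0411))); // F3 B0 90 91
console.log(toHex(utf8Bytes(0x1F600))); // F0 9F 98 80
```

Both code points encode to four bytes, so at the UTF-8 level there is nothing that distinguishes the glyph that renders from the one that does not.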

avih commented 9 months ago

Well, JS passes UTF-8 to libass. You’ve demonstrated that this UTF-8 encoding is correct (which is, of course, good to know) but not that any other non-BMP codepoint is displayed in the same font

Indeed. I didn't realize this could be an issue. My point was only regarding non-BMP codepoints in general from JS to libass, and for libass to be able to read and eventually render a non-BMP codepoint value correctly, as demonstrated by some other non-BMP codepoint.

So what does https://github.com/libass/libass/pull/729 mean in plain English? that non-BMP don't always work with explicit font unless it needs to find a fallback? (this is a guess only from the circumstances of this issue)

astiob commented 9 months ago

So what does libass/libass#729 mean in plain English? that non-BMP don't always work with explicit font unless it needs to find a fallback? (this is a guess only from the circumstances of this issue)

It means that non-BMP can’t be displayed using some fonts. It depends on the particular font’s details. If the font is affected, the character will be displayed in a fallback font. Sometimes, if the font is installed system-wide, it will itself be selected as the fallback font (which normally shouldn’t happen but is possible here due to this same bug) and the glyph will be displayed correctly. In other cases, no fallback font will be found and no valid glyph will be displayed at all.

avih commented 9 months ago

It means that non-BMP can’t be displayed using some fonts. It depends on the particular font’s details...

Much appreciated.

ahaoboy commented 9 months ago

@ahaoboy If you can recompile your mpv’s libass

It works! image

avih commented 9 months ago

It works!

It seems to have got the glyph OK, but I don't think this is how it should be rendered.

It looks to me like the rendering overlaps the glyphs, while it shouldn't (when displaying an ass overlay which has these two codepoints one after the other at the same string).

But that's probably a different (libass) issue.

ahaoboy commented 9 months ago

This may be due to different rendering methods of various fonts. If a monospaced font is used, it might look better, and this overlapping issue also occurs in VSCode.

FiraCodeNerdFontMono-Regular.zip

image FiraCode Nerd Font Reg image FiraCode Nerd Font Mono Reg image

avih commented 9 months ago

This may be due to different rendering methods of various fonts

Perhaps, but it doesn't make it right, unless the font itself is broken (in that the glyph width is not communicated correctly, or not decided correctly, before rendering).

But again, this is a different (libass) subject.

astiob commented 9 months ago

The fix has been merged into libass.


The overlapping glyphs are part of the font. It’s not a good font. These two glyphs also have opposite winding orders, so libass is already being too kind; other rendering engines such as VSFilter show this instead:

The areas where the two glyphs overlap appear as holes, because the glyphs cancel out.
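The cancelling effect can be sketched with the nonzero winding rule, assuming a renderer that accumulates both glyph outlines into one fill (the thread only says the glyphs cancel out; this model is an assumption). The two squares below are hypothetical stand-ins for the overlapping glyph outlines, with opposite winding orders.

```javascript
// Shoelace formula: positive for counter-clockwise, negative for clockwise
// (with the y axis pointing up).
function signedArea(poly) {
  let sum = 0;
  for (let i = 0; i < poly.length; i++) {
    const [x1, y1] = poly[i];
    const [x2, y2] = poly[(i + 1) % poly.length];
    sum += x1 * y2 - x2 * y1;
  }
  return sum / 2;
}

// > 0 if p is left of the directed line a->b, < 0 if right.
function isLeft(a, b, p) {
  return (b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1]);
}

// Winding number of a closed polygon around point p.
function windingNumber(p, poly) {
  let wn = 0;
  for (let i = 0; i < poly.length; i++) {
    const a = poly[i];
    const b = poly[(i + 1) % poly.length];
    if (a[1] <= p[1]) {
      if (b[1] > p[1] && isLeft(a, b, p) > 0) wn++;   // upward crossing
    } else if (b[1] <= p[1] && isLeft(a, b, p) < 0) {
      wn--;                                           // downward crossing
    }
  }
  return wn;
}

const glyphA = [[0, 0], [4, 0], [4, 4], [0, 4]]; // counter-clockwise
const glyphB = [[2, 2], [2, 6], [6, 6], [6, 2]]; // clockwise

console.log(signedArea(glyphA), signedArea(glyphB)); // 16 -16

// Nonzero rule: a point is filled when the summed winding number is not 0.
const total = (p) => windingNumber(p, glyphA) + windingNumber(p, glyphB);
console.log(total([1, 1])); // 1  -> filled (only inside A)
console.log(total([3, 3])); // 0  -> hole (inside both, windings cancel)
console.log(total([5, 5])); // -1 -> filled (only inside B)
```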

Whenever something is weird with a font, check the font before suspecting the renderer.

ahaoboy commented 5 months ago

Already fixed in mpv v0.38.0-340-g7923a633.