sile-typesetter / sile

The SILE Typesetter — Simon’s Improved Layout Engine
https://sile-typesetter.org
MIT License
1.66k stars 98 forks source link

shaper:measureChar can return the width of a .notdef character #1706

Open Omikhleia opened 1 year ago

Omikhleia commented 1 year ago

I had some surprise getting a width from calling SILE.shaper:measureChar() with a character not existing in my font. (It existed in another font I previously used.)

Investigating, the measured character is a .notdef one in that font, i.e. the function does shape something and silently returns a value, but is it not what one may expect (but, likely, the width of a default square or of a :skull_and_crossbones: or whatever the font returns here).

See:

https://github.com/sile-typesetter/sile/blob/607dcf7b3d8a83547779758cb70f6285a07c4848/shapers/base.lua#L67-L69

Shouldn't this be:

  if #items > 0 then
    if items[1].gid == 0 or items[1].name == ".null" or items[1].name == ".notdef" then
      SU.error("Unable to measure character", char)
    end
    return { height = items[1].height, width = items[1].width }
  else

N.B. The proposed extra check is what the fallback shaper does...

https://github.com/sile-typesetter/sile/blob/607dcf7b3d8a83547779758cb70f6285a07c4848/shapers/fallback.lua#L104

... to actually ensure the shaped token(s) lead to valid glyph(s).

(EDIT: cleaned up the proposed code, I had left some stuff from earlier debugging)

Omikhleia commented 1 year ago

N.B. The proposed extra check is what the fallback shaper does...

... Which might be a harfbuzz-specific approach, however (i.e. I am not sure other shapers are expected to return a "gid' or "name" in the shaped items, so it might not be that appropriate to have it in the "base" class)?

khaledhosny commented 1 year ago

I'm not familiar with this API, but it doesn't feel like a good idea to error here unless other SILE shaping code also errors for missing characters. Also it depends on what this API is supposed to do, if .notdef is what is going to be displayed, then it seems appropriate that measuring gives its dimensions.

Omikhleia commented 1 year ago

@khaledhosny It doesn't feel good either to return the width of a .notdef for measurement of something that might not make it to ouput (depending on why things are measured, and depending too on whether there are font-fallbacks that apply, but these are not checked here either...)

An example use is here #1705 where I'd prefer the explicit warning handling if the font doesn't have the "numeric / figure space" character[^1] ... The zenkaku "zw" unit also has the same type of problem (i.e. it didn't warn, but measured .notdef too).

[^1]: EDIT I intend using such a measure unit for e.g. leaving "decently" enough room for a certain number of digits in some environments (footnotes, table cells, &c). The displayed content will be measured (via an hbox, whatever it contains, digits or not, font-fallbacks or not, possibly some font changes such as +onum...) and if it overlaps the "reserved" space, I can still take decision on what to do at that time.

Omikhleia commented 1 year ago

Re: "The zenkaku unit has the same issue"

units["zw"] = {
  relative = true,
  definition = function (v)
    local zenkakuchar = SILE.settings:get("document.zenkakuchar")
    local measureable, zenkaku = pcall(SILE.shaper.measureChar, SILE.shaper, zenkakuchar)
    if not measureable then
      SU.warn(string.format([[Zenkaku width (全角幅) unit zw is falling back to 1em == 1zw as we
  cannot measure %s. Either change this char to one suitable for your
  language, or load a font that has it.]], zenkakuchar))
    end
    local width = measureable and zenkaku.width or SILE.settings:get("font.size")
    return v * width
  end
}

From the warning text it seems that the intent was to warn if the specified character couldn't be measured (rather than a measure succeeded, but actually on something else). :question:

Your point is sound too, though. scratches head

Omikhleia commented 1 year ago

Interestingly, the shaper:measureChar() method is only used in units.lua at this date (for measuring an x, a space, the zenkaku char). (EDIT: unless mistaken, and only taking into account the main SILE distribution... E.g. I have used it in 3rd party packages (for measuring a 0, predating my attempt with a figure space), perhaps other package/class designers used it too.

ctrlcctrlv commented 1 year ago

Maybe we want another function: measureString vs. measureChar. I agree with Khaled that measureChar should measure .notdef when char is not in font.

Omikhleia commented 1 year ago

... Or it could return, besides the width/height, something so that the caller knows what was measured, perhaps?

ctrlcctrlv commented 1 year ago

That would work too. You could return an object having glyphname and codepoint if available.

Omikhleia commented 2 weeks ago

I recently touched that measureChar function (to also return the depth of the measured character, in #2141) - So self-assigning this issue before I forget again about it....