Closed jean-airoldie closed 1 year ago
More generally, is there a way to retrieve a unicode character range from a glyph range? Such as when multiple glyphs are selected on screen using the mouse.
rustybuzz is a very low-level library and it doesn't provide anything like that. That's the job of a text layout library.
Also, I don't think it's possible to convert glyph ID back to Unicode. Shaping is a one way process. You can match the original string characters using clusters, but that's about it. I think we had a similar question: #51
I am trying to tell apart new line characters \n from other characters without a glyph ID.
I'm not sure, but I think you should split the input string into lines before passing it to the shaper. rustybuzz operates on a single line of text.
I personally use rustybuzz only for static text layout, therefore I cannot comment on the interactive use case.
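The line-splitting suggestion above could look roughly like the sketch below. It is a minimal illustration only: `lines_with_offsets` is a hypothetical helper, not part of rustybuzz, and it assumes that `\n` is the only line separator. It keeps the byte offset of each line so that positions reported for a line shaped on its own can be mapped back to the full string by adding that offset.

```rust
/// Hypothetical helper (not a rustybuzz API): splits `text` on '\n' and
/// returns each line together with the byte offset at which it starts in
/// the original string.
fn lines_with_offsets(text: &str) -> Vec<(usize, &str)> {
    let mut out = Vec::new();
    let mut start = 0;
    for line in text.split('\n') {
        out.push((start, line));
        // Skip over the line itself plus the one-byte '\n' separator.
        start += line.len() + 1;
    }
    out
}
```

Each returned line would then be shaped separately. An empty entry corresponds to a blank line, so forced line breaks remain visible to the layout code without ever reaching the shaper.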
Ok, that's what I thought.
Also, I don't think it's possible to convert glyph ID back to Unicode. Shaping is a one way process.
It would be possible if rustybuzz kept track of the original unicode range associated to each glyph, and returned it in the GlyphInfo struct. In that case I would be able to refer back to the original string and detect that an unknown glyph is actually a \n for instance. However I understand that's probably out of scope of this project since you are aiming to follow HarfBuzz's design.
I'm not sure, but I think you should split the input string into lines before passing it to the shaper. rustybuzz operates on a single line of text.
That would solve my new line character issue, but would still be a pain to deal with (and slow). And I still wouldn't be able to retrieve the unicode range anyway.
It would be possible if rustybuzz kept track of the original unicode range associated to each glyph, and returned it in the GlyphInfo struct. In that case I would be able to refer back to the original string and detect that an unknown glyph is actually a \n for instance. However I understand that's probably out of scope of this project since you are aiming to follow HarfBuzz's design.
HarfBuzz does this, in its cluster member. I believe rustybuzz does the same.
This is honestly out of my area of expertise. rustybuzz is harfbuzz in Rust. If you want to do something unusual with it - try doing it with harfbuzz first. If it's not possible in harfbuzz then it will not be possible in rustybuzz either. There are no plans on having any additional features beyond what harfbuzz already provides.
There are no plans on having any additional features beyond what harfbuzz already provides.
Ok.
HarfBuzz does this, in its cluster member. I believe rustybuzz does the same.
Indeed rustybuzz does have a cluster member, but I wasn't aware that it referred to the unicode grapheme cluster, although that makes sense thinking back. I'll try it out to see if it fixes my issues.
Indeed rustybuzz does have a cluster member, but I wasn't aware that it referred to the unicode grapheme cluster, although that makes sense thinking back.
It points back to the index in the original text string corresponding to the start of the current cluster. In your case, it should point to the location of the \n.
It points back to the index in the original text string corresponding to the start of the current cluster. In your case, it should point to the location of the \n.
Yes, that should work then. In the case of ligatures and complex clusters, I can deduce the unicode range by looking at the cluster index of the next glyph, or the end of the string if there is no next glyph.
Correct.
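The approach confirmed above can be sketched as follows. This is an illustration under stated assumptions, not rustybuzz code: `glyph_text_ranges` and `is_newline_glyph` are hypothetical helpers, the cluster values are assumed to be UTF-8 byte offsets into the shaped string (one per glyph, as read from each glyph's `cluster` member), and the text is assumed to be left-to-right so that cluster values never decrease.

```rust
use std::ops::Range;

/// Hypothetical helper: for each glyph, returns the byte range of the source
/// text covered by its cluster. `clusters` holds one cluster value per glyph
/// in output order; left-to-right text is assumed.
fn glyph_text_ranges(clusters: &[u32], text_len: usize) -> Vec<Range<usize>> {
    let mut ranges = Vec::with_capacity(clusters.len());
    for (i, &cluster) in clusters.iter().enumerate() {
        let start = cluster as usize;
        // The range ends where the next glyph with a *different* cluster
        // value starts, or at the end of the string if there is none.
        // Glyphs sharing a cluster value therefore share the same range.
        let end = clusters[i + 1..]
            .iter()
            .map(|&c| c as usize)
            .find(|&c| c != start)
            .unwrap_or(text_len);
        ranges.push(start..end);
    }
    ranges
}

/// Hypothetical helper: true if the glyph's source range is exactly "\n".
fn is_newline_glyph(text: &str, range: &Range<usize>) -> bool {
    text.get(range.clone()) == Some("\n")
}
```

For example, if "fi" is shaped into a single ligature glyph in the text "fi\nx", the cluster values would be 0, 2 and 3, and the second glyph maps back to the byte range holding the \n. Right-to-left runs, where cluster values decrease, would need the comparison reversed.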
Thanks a lot!
I'll submit a PR later to make this clearer in the doc.
Yeah, I meant the rustybuzz doc.
Hi,
I am trying to tell apart new line characters \n from other characters without a glyph ID. Because of potential ligatures, I can't really map the glyph index in the buffer (or the glyph cluster) to a unicode character. Is there a way to do this that I am missing?
My use case is that I do font shaping once and then layout the text. The new line characters are used to force a line break.