Closed laurmaedje closed 3 years ago
Wow! Just wow! And the timing is perfect, since I've just finished tiny-skia
, which means that the next release of resvg
can be pure Rust (I will think what to do with AAT).
Harfbuzz did GDEF blocklisting for some fonts. Since I use ttf_parser's APIs for this, I couldn't port that.
You can do this using Face::table_data
. Or not?
This PR depends on a few changes in ttf_parser.
I will merge the next
branch to the master now and you could send a pull request with your changes.
I apologize for the PR being this huge
No problem. That was expected.
Everything seems ok.
You are using a few Rust tricks that I'm not usually using (like impl T
and Self::new
constructors), but this just a matter of preference. I don't mind. Overall, the code is pretty consistent and idiomatic. In the end, Rust is a pretty simple language, imo.
I guess we should decide what parts should be moved to ttf_parser
and what should we keep here. Looks like you only need the parser
module and most of the stuff from the ggg
module is in the ot/common
now. Am I right? Do you think there is a point or even a possibility to move GSUB/GPOS completely to ttf_parser
, maybe in the future?
PS: how much are you familiar with text rendering/layout? Because you did a pretty serious work on rb and ttf_parser and it's a very specific domain. To put in perspective, the current rb + ttf_parser implementation took me like 6 month. Mainly because I had no idea about shaping at all.
I've made the parser
module public. Let me know if you need any other code from the next
branch. If not, just make the pull request with all your changes.
Also, update readme and changelog.
Does this mean the the only things left is hb-ot-shape.cc
and AAT?
Yes the ggg
stuff in now in ot/layout/common
. With regards to the directory structure, I'm not sure whether the extra nesting level ot/layout
is really helpful or whether we should just move the files in layout
directly into ot
.
Moving GSUB/GPOS into ttf_parser
is hard. The top-level code (that you already had implemented) could of course live in ttf_parser
(at least once we can use NormalizedCoordinate
in rustybuzz). The parsing code could also be moved, but I don't really see how we could move the code that lives in the Apply/WouldApply
traits. That code is very specific to harfbuzz and needs tons of things we can't possible abstract over. So moving the parsing and the data structures would be okay, but that would also mean exposing all of these enums directly, which is not so nice and also probably not so great for compatibility reasons. Or we could design a separate small API for each kind of lookup and hide the enums in structs. But I'm not sure that's really worth it.
What is left now is
Font
and Face
and storing GSUB and GPOS in therehb-ot-shape
(plus rb_shape_plan_t
and rb_ot_complex_shaper_t
)hb-ot-map
To your other question: I also had no idea about shaping not too long ago. But I'm currently trying to develop a typesetting project that I want to run in WASM and this endeavour somehow led me here. And I had a lot of time on my hands for the past couple weeks. Porting the individual GSUB/GPOS tables was pretty frustrating at times, but tying everything together in the last few days all the more rewarding.
I need two other things from the next
branch:
Face::gdef_variation_delta
Offsets16
+ OffsetsIter16
with a public get
method (and OffsetsIter16
should probably return Clone and Copy, which it doesn't in next
)I'm not sure whether the extra nesting level ot/layout is really helpful
In my kerx branch, I have a tables
root module with tables parsing. So you can do the same and move the logic itself to the ot
directly.
Moving GSUB/GPOS into ttf_parser is hard.
Yeah... That why I was insisting on doing it on the rb side, since I've came to the same conclusion when I was trying to port GSUB/GPOS myself. This is not a priority, just a curiosity.
Moreover, I plan to move most of the kern
to the rb too. And leave only the basic OT kerning on the ttf_parser
side.
What is left now is
Yes, sounds about right. AAT is the only tricky one, but still way simpler than OT.
But it don't really get how the whole kerning thing works with OpenType and AAT.
It should work out of the box in the kerx
branch. At least kern
is fully finished. Including AAT one.
but tying everything together in the last few days all the more rewarding
Well, I just glad that I don't have to do this anymore :smile:
I need two other things from the next branch
Done.
Offsets16 + OffsetsIter16
Since it's not used by ttf_parser, I think we should implement it in rb directly. It doesn't rely on any ttf_parser internals.
If we put Offsets16
(and DynArray
) into rustybuzz, then we should probably have a StreamExt
trait so we can read them with normal methods on Stream, right?
And to the directory structure: Would you put the Apply
and WouldApply
implementations for the GSUB and GPOS tables alongside the parsing code into the tables module or somewhere in ot
? I'm not sure whether it's good to split it up - for the same reason its in rustybuzz and not ttf-parser - because it's tightly coupled.
And I'm wondering whether it makes sense to keep GDEF in ttf-parser or whether it should also live in rustybuzz.
StreamExt
Yep. That's the idea.
And to the directory structure
I would prefer to have parsing and shaping separated. Most of the parsing is already moved not only into a separate module, but into a separate crate. This was one of the main ideas behind this port. If a user doesn't need shaping, it can use just ttf_parser. It's not possible with hb.
Basically, the tables
module should not import Buffer
and shaping code should not import Stream
.
I've looked at the code and it seems that Apply
doesn't contain any low-level parsing code, so I think it should be split.
As for GDEF, it has a very simple query API, so it can stay in ttf_parser
. Maybe someone would find it useful.Basically, rb should do only one job - shaping. Everything else should be moved into separate crates.
Okay. Makes sense.
I addressed the feedback. Splitting the parsing into a tables
module was a good idea I think, it worked out well!
The only thing that's still not addressed is the GDEF blocklisting. It could probably somehow be done with table_data, but it's a bit complicated because while harfbuzz simply removes the table from its face, I would have to check for blocklisting at each access to a ttf-parser function that uses GDEF. There aren't too many of these calls, but it's not as nice as simply setting the table to None
(which I obviously can't since it's private to ttf-parser).
Just for the record, this does not need any changes to ttf-parser anymore, now.
I guess we can open an issue for GDEF blocklisting, to implement it later.
Is it fine by you if I squash commits?
Overall, you did an incredible work. And I still cannot believe that someone decided to dive into the nightmare called text shaping.
I think there are a few big chunks which would fit together nicely. So you could maybe squash into:
I've decided to merge it as is. Your commit history is too good to remove it.
Looks like there is a memory leak. You can test it yourself using: env RUSTFLAGS="-Z sanitizer=address" cargo +nightly test
No idea why CI didn't caught it.
Short question because I want to prevent duplicate work here. Are you currently/soon working on finishing the port? Because I could imagine doing some more work here, but I don't want either of us to waste their time.
No plans for the next few months.
Okay, then I'll maybe continue.
This ports all of GSUB and GPOS and uses
ttf_parser
's APIs for GDEF.Notable changes:
rb_ot_face_t
no longer knows about them.hb-ot-layout.cc
, now insrc/ot/layout/mod.rs
).hb-kern.hh
) because it depended onrb_ot_apply_context
andrb_ot_apply_context::skipping_iterator_t
which are stack-allocated and fully in Rust now, so FFI would have been complicated.hb_set_t
. This was apparently only used for lookup acceleration (basically collecting the coverage of lookups into sets), which I removed for now.ttf_parser
's APIs for this, I couldn't port that.Note that a lot of the original GSUB/GPOS code was unused from the beginning (originally used for harfbuzz's OpenType API). I removed all of that in the first commit.
This PR depends on a few changes in
ttf_parser
. Basically, two helpers to read data at an offset, making theget
method onOffsets16
public andDynArray
. These changes can be found here. For now, I rewiredttf_parser
in rustybuzz'sCargo.toml
to that fork. I didn't open a PR there for now because I'm not sure on how you plan to integrate withttf_parser
since thecrates.io
version doesn't have a publicparser
module.I apologize for the PR being this huge :/