Open yusdacra opened 2 years ago
I did a similar experiment in https://github.com/emilk/egui/pull/698. The main difference is that I opted for &'static str
because A) I wanted to avoid lifetimes and B) the galley cache used to store the text (but now it just stores the hash, so this is no longer a problem).
It did improve performance by quite a bit, but ergonomics became worse.
Adding lifetime parameters to everything is not very appealing to me. Especially it is nice to be able to store a RichText
between frames, so we then need some way to convert RichText<'a>
to RichText<'static>
via some .into_owned()
method.
There are some other alternatives to consider, such as using string-interning with ref-counting (e.g. using https://docs.rs/refcount-interner/latest/refcount_interner/). That would avoid lifetimes, but at the cost of an extra hashing.
String interning would be nice, I was also thinking utilizing types like SmolStr
maybe also could help (eg. store up to 22 bytes inline, otherwise use a Rc<str>
, which <22 bytes sounds like it would be very common considering all the buttons etc.).
I also agree that ergonomics would be hurt with lifetimes, my main problem with it is new users of Rust may find it complex. But glad to see that it was already considered.
It'd be nice if there was a convenient way to opt in to something like Cow<'static, str>
- not sure how large the interface is for hiding this behind a feature flag. Native platforms may be able to cope fine with the performance hit, but this hurts hard in WASM, especially if there's other computationally heavy work to be done (like in an emulator). GC pauses really eat into the frame time. It's not as huge of a deal for purely UI apps, but if you try and add realtime audio, it's quite problematic.
The only alternative right now is to pause any playing audio while the UI is visible or let the audio crackle from time to time.
I think there's a strong argument for at least using Cow<'static, str>
rather than String
, because it allows the flexibility for users to (when necessary for performance) manage lifetimes themselves. For instance, I have a use case where I have long strings of input text which need styling done throughout, and cutting them up into many newly allocated String
s will be detrimental.
My use case may, in the future, include those strings dynamically changing as well (additions/deletions at arbitrary locations). In that case I would ideally use something like https://crates.io/crates/ropey, however that wouldn't work here without using something like Box<dyn Iterator<Item = char>>
. Of course that would necessitate a bunch of allocations for the trait objects (so perhaps only a bit better than a bunch of String
allocations), however you could create a type which is essentially a stack-allocated trait object up to a certain size. Note that in this particular example, one could also get the contiguous str
chunks of each RopeSlice
and create multiple identical LayoutJob
s for them instead, which I suppose is another way to go. Maybe Vec<Cow<'static, str>>
is an okay compromise of allocation use cases...
Another approach is to potentially have a "string manager" that is used across all layout calls, and that manager can yield identifiers which are stored in RichText
/LayoutJob
(so those don't need any allocated objects stored). There could be a default manager supporting String
. Such an interface would allow a lot of flexibility for users to bring their own string types, allocation schemes, etc.
I've just quickly implemented a string manager scheme in epaint
as a proof of concept, and tests are passing too. The current design is a bit rough, and some changes could be made:
&impl StringManager
argument), but those could be changed to &dyn StringManager
with some rearrangement, if the generic functions are offensive for whatever reason. The trick there is that there's an associated type, so that'd need to likely change to a trait object too. I'm not immediately sure how much performance impact that would have.StringManager
through all APIs, it could live in Fonts
(with Fonts
appropriately renamed to something like TextContext
). In this case it wouldn't be useful for it to be anything other than a trait object, so the above changes would have to occur.StringId
s from a previous frame (specifically tied to when Fonts::begin_frame
is called, which flushes the GalleyCache
). If Fonts
becomes the thing that stores the StringManager
, the trait could be extended to be informed of this event more closely.I don't have the repo forked to push my changes up, but if this API is of interest I will definitely do so.
Another approach is to potentially have a "string manager" that is used across all layout calls, and that manager can yield identifiers which are stored in
RichText
/LayoutJob
(so those don't need any allocated objects stored).
This is a great idea, similar to managing textures, given the sheer number of small allocations large UIs can require for strings. Hopefully the design could be generic enough to allow using a per frame bump arena allocator - as strings can change a lot more frequently than textures and a bulk allocation per frame is preferable to thousands of individual ones strewn throughout.
I'd love to see that POC!
It's a fairly straightforward change! https://github.com/emilk/egui/compare/master...afranchuk:egui:string-manager
It's a fairly straightforward change! master...afranchuk:egui:string-manager
Indeed it is! Looks great and seems like a design could be compatible with a typed arena for strings.
Why the char iterator design instead of a method to return &str? It precludes using byte range which I assume is important for some features like horizontal scrolling.
Edit: oh because of the ropey use case? How would that work with scrolling byte offsets?
Yeah the main reason was the ropey (or other string type) use case, though it's also the minimum interface necessary to abstract the call site (i.e., the layout code loops over char
). I've not looked at the scrolling code so I can't speak to that yet. You might have noticed that I found I had to move the text: StringId
into the Vec<LayoutSection>
rather than use byte ranges (because I didn't have the text length readily available). That was done without looking at how that might be used externally, so it's possible that'll have to change.
Layout code should iterate over a higher layer of abstraction than char
. For instance, grapheme clusters or word boundaries via unicode_segmentation
.
The trait it provides is probably not the right interface for being generic over strings, though. Deref<Target = str>
might be better suited.
I'm open to suggestions, I was just making the change that abstracts the existing code. But it seems like that concern might be (at this time) unrelated to this change, as the layout code doesn't seem to care about the str
other than iterating over chars()
.
That is correct at present. I don't want to derail this thread, but that has been a concern for quite some time. #2532 has details. Providing an interface for &str
completely avoids it, ergo my suggestion.
That is correct at present. I don't want to derail this thread, but that has been a concern for quite some time. #2532 has details. Providing an interface for
&str
completely avoids it, ergo my suggestion.
Haha, I am also working on a terminal-emulator-like project (per your comment in #2532) :)
Currently, a
RichText
(and alsoLayoutJob
) holds aString
to store the data. I propose we change these to take a new enum type, withSingle(&str)
,Multiple(&[&str])
and anOwned(String)
variants. The lifetimes should essentially be generic; so users can pass any string they want, without creating a new string. This would get rid of creating a new string 60 times per second for all labels / buttons etc. which could add up a lot if there is a lot of different widgets that have text.RichText
would essentially use theSingle
variant. TheMultiple
variant is for the macro mentioned inLayoutJob
docs, it should allow the user to style text without needing to create a new string.The downside is that this requires putting lifetimes everywhere, which can be a PITA to look at / work with sometimes. The lifetimes wouldn't be visible to a user most of the time, since widgets are created and used immediately and not stored. To make sure this is worth the effort, this should be implemented first and then benchmarked in different scenarios. I may try to implement this later, but I'm putting this out there so that I don't forget it.