rparrett closed this issue 4 months ago
I'm surprised by this too tbh.
The default line breaking logic is provided by the `xi-unicode` crate, and it is simply telling us that `"test .test"` has no line break opportunities before the end of the string.
```rust
// [dependencies]
// xi-unicode = "0.3"

fn main() {
    println!("==> test test");
    for line_break in xi_unicode::LineBreakIterator::new("test test") {
        println!("{line_break:?}");
    }

    println!("==> test .test");
    for line_break in xi_unicode::LineBreakIterator::new("test .test") {
        println!("{line_break:?}");
    }
}
```
Output:

```
==> test test
(5, false)
(9, false)
==> test .test
(10, false)
```
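To make the consequence for layout concrete, here is a minimal sketch of a greedy wrapper that can only end a line at the offsets a breaker reports. It is not how `glyph_brush_layout` is implemented. The function name `wrap_greedy` is hypothetical, and it assumes every character is one unit wide and that the last offset is the end of the text, as in the output above.

```rust
/// Greedy wrapping over precomputed break offsets (byte offsets into `text`).
/// Assumptions: each character is one unit wide, offsets are strictly
/// increasing, and the final offset equals `text.len()`.
fn wrap_greedy<'a>(text: &'a str, break_offsets: &[usize], max_width: usize) -> Vec<&'a str> {
    let mut lines = Vec::new();
    let mut line_start = 0;
    // Latest break opportunity seen on the current line, if any.
    let mut last_break: Option<usize> = None;

    for &offset in break_offsets {
        // Trailing spaces do not count towards the visible width.
        let width = text[line_start..offset].trim_end().chars().count();
        if width > max_width {
            if let Some(fit) = last_break {
                // End the line at the previous opportunity.
                lines.push(text[line_start..fit].trim_end());
                line_start = fit;
            }
            // With no earlier opportunity, the segment overflows on one line.
        }
        last_break = Some(offset);
    }

    lines.push(text[line_start..].trim_end());
    lines
}

fn main() {
    // Offsets taken from the xi-unicode output above.
    println!("{:?}", wrap_greedy("test test", &[5, 9], 5));
    println!("{:?}", wrap_greedy("test .test", &[10], 5));
}
```

With a width of 5, the first call yields two lines, while the second has nowhere to break and keeps `test .test` on a single overflowing line.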
Perhaps you should raise this issue upstream.
Thanks for the pointer. I'll look into whether this is a bug on their end or expected behavior under https://unicode.org/reports/tr14/, which they are following.
I think it's likely that this is expected under UAX14. It would seem to be covered by LB13, with `.` being classified as an infix separator (`IS`), which prevents a break before it, even after spaces. The standard is not super readable, though, so I could be mistaken.
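As a rough illustration of that reading of LB13, the sketch below allows a break after a run of spaces unless the next character is an infix separator. This is a toy, not an implementation of UAX14: the helper names are hypothetical, and only the four common ASCII members of the `IS` class are listed.

```rust
/// Toy stand-in for the UAX #14 `IS` (infix separator) class.
/// Assumption: only the common ASCII members are covered here.
fn is_infix_separator(c: char) -> bool {
    matches!(c, '.' | ',' | ':' | ';')
}

/// Returns byte offsets at which a line may end: one after each run of
/// spaces, plus the end of the text. A break after spaces is suppressed
/// when the following character is an infix separator.
fn toy_break_offsets(text: &str) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut chars = text.char_indices().peekable();

    while let Some((_, c)) = chars.next() {
        if c != ' ' {
            continue;
        }
        match chars.peek() {
            // Still inside a run of spaces, or at the end of the text.
            Some(&(_, ' ')) | None => {}
            Some(&(next_idx, next)) => {
                if !is_infix_separator(next) {
                    offsets.push(next_idx);
                }
            }
        }
    }

    // The end of the text always ends a line.
    offsets.push(text.len());
    offsets
}

fn main() {
    println!("{:?}", toy_break_offsets("test test"));
    println!("{:?}", toy_break_offsets("test .test"));
}
```

For the two strings in this thread the toy reports offsets `[5, 9]` and `[10]`, the same offsets the real crate printed.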
Out of curiosity I checked `unicode-linebreak` (used by `cosmic-text` and `lapce`), and it behaves the same way, so maybe I am not mistaken.
```rust
// [dependencies]
// unicode-linebreak = "0.1.5"

fn main() {
    println!("==> test test");
    for line_break in unicode_linebreak::linebreaks("test test") {
        println!("{line_break:?}");
    }

    println!("==> test .test");
    for line_break in unicode_linebreak::linebreaks("test .test") {
        println!("{line_break:?}");
    }
}
```
Output:

```
==> test test
(5, Allowed)
(9, Mandatory)
==> test .test
(10, Mandatory)
```
I also checked `icu::segmenter` (even with `LineBreakStrictness::Loose`) and https://github.com/foliojs/linebreak just in case.
Some relevant/helpful text from a Go implementation:
> The goal of matching user perceptions cannot always be met exactly because the text alone does not always contain enough information to unambiguously decide boundaries. For example, the period (U+002E FULL STOP) is used ambiguously, sometimes for end‐of‐sentence purposes, sometimes for abbreviations, and sometimes for numbers.
None of the browsers I can test exhibit this behavior, though I believe browsers are also doing some fancy stuff like switching break mode based on container width.
Thanks for the help.
Testing with 74c12ade4466b08b68d7703ec9a0081da39d9eac.
Reported in https://github.com/bevyengine/bevy/issues/12098.
## What I did

Lay out `test .test` in a space where only one of the two words will fit on a line.

## What I expected

Two lines

## What I got
Output

```
"t" @ -60,12
"e" @ -54,12
"s" @ -46,12
"t" @ -39,12
" " @ -34,12
"." @ -30,12
"t" @ -26,12
"e" @ -20,12
"s" @ -12,12
"t" @ -5,12
```

Removing the period results in:
Output

```
"t" @ 0,12
"e" @ 5,12
"s" @ 13,12
"t" @ 20,12
" " @ 26,12
"t" @ 0,28
"e" @ 5,28
"s" @ 13,28
"t" @ 20,28
```

## Repro
```rust
use glyph_brush_layout::{ab_glyph::*, *};

fn main() {
    let dejavu = FontRef::try_from_slice(include_bytes!("../../fonts/DejaVuSans.ttf")).unwrap();
    let fonts = &[dejavu];
    // As posted, this is the variant without the period.
    // Use "test .test" to get the single-line output shown first above.
    let text = "test test";

    let glyphs = Layout::default().calculate_glyphs(
        fonts,
        &SectionGeometry {
            screen_position: (0.0, 0.0),
            bounds: (50.0, 100.),
        },
        &[SectionText {
            text,
            scale: PxScale::from(15.5),
            font_id: FontId(0),
        }],
    );

    // Print each glyph's character and rounded position.
    for glyph in glyphs {
        let character = &text[glyph.byte_index..glyph.byte_index + 1];
        println!(
            "{:?} @ {},{}",
            character,
            glyph.glyph.position.x.round(),
            glyph.glyph.position.y.round(),
        );
    }
}
```

## Discussion

It's totally possible I'm just not understanding all the nuances of text wrapping again. I appreciate any insight you're able to provide.