love2d / love

LÖVE is an awesome 2D game framework for Lua.
https://love2d.org

Word wrap for CJK should not put punctuation at the front of a line #1159

Closed slime73 closed 8 years ago

slime73 commented 8 years ago

Original report by David Frank (Bitbucket: bitinn, GitHub: bitinn).


Now that we do have proper support for UTF-8 and Latin word wrap, I wonder if we can avoid wrapping CJK punctuation to the front of a line.

Screen Shot 2016-04-10 at 4.42.17 PM.png

This was discussed 5 years ago in #300, but that issue mainly concerned supporting word wrap itself, which is supported as of v0.10.

slime73 commented 8 years ago

Original comment by David Frank (Bitbucket: bitinn, GitHub: bitinn).


Just a note: the CJK word wrap rules for . , ! ? are pretty much the same as in English; " and ' are of course allowed to start a line.

I will link to some references when my network is not damn slow...

slime73 commented 8 years ago

Original comment by Bart van Strien (Bitbucket: bartbes, GitHub: bartbes).


Latin word wrap? Is this a feature someone sneakily put in? As far as I know we only do hard wrapping.

slime73 commented 8 years ago

Original comment by David Frank (Bitbucket: bitinn, GitHub: bitinn).


I mostly mean wrapping on whitespace and forcing a break inside a word when it exceeds the wrap limit, obviously no auto-hyphenation or anything... I am not able to reproduce a case where Latin punctuation starts a line, but maybe there are edge cases where a simple rule doesn't work...

I think the difficulty of implementing CJK wrap comes from not having UTF-8 pattern matching support in Lua? If you can reliably identify a UTF-8 punctuation character, writing a word-wrap strategy shouldn't be too difficult, right?

The good news is that the set of punctuation characters isn't that large, and the rules are relatively simple:

https://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages

It's not game-breaking for me, but it would be nice to have so we don't have to bug script writers about technical issues.
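
To make the "small vocabulary, simple rules" point concrete, here is a minimal sketch of the lookup such a strategy would need. It is written in Python rather than Lua only because Python strings are already sequences of code points; in Lua the text would first have to be decoded from UTF-8. The character set is a small illustrative subset, not the full list from the article above, and the names `NO_LINE_START` and `can_start_line` are hypothetical.

```python
# Illustrative subset of characters that should not begin a line.
# Assumption: a real implementation would take the complete list from
# the line-breaking rules referenced above.
NO_LINE_START = set("、。，．！？：；）」』】〉》" + ",.!?:;)")


def can_start_line(ch: str) -> bool:
    """Return True if a single character is allowed at the start of a line."""
    return ch not in NO_LINE_START


# Closing punctuation is rejected, opening brackets and ordinary text pass.
for ch in "。「あ,":
    print(repr(ch), can_start_line(ch))
```

Opening brackets and quotes have the mirror-image rule (they should not end a line); that would be a second set handled the same way.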

slime73 commented 8 years ago

Original comment by Alex Szpakowski (Bitbucket: slime73, GitHub: slime73).


I'm not sure I understand. The hard wrapping that only occurs when there's a line with no spaces in it that exceeds the wrap limit doesn't discriminate at all when splitting the line in two. Do you want it to?

slime73 commented 8 years ago

Original comment by David Frank (Bitbucket: bitinn, GitHub: bitinn).


Yes. Browsers know that CJK text has no whitespace word boundaries, so they break at UTF-8 character boundaries and check punctuation rule sets to decide whether a break is allowed at a given position.

Basically this will add some complexity to your hard wrapping logic, as you can't simply wrap based on glyph width and the printf wrap limit; you have to know whether you are wrapping on punctuation and act accordingly.

Given your explanation, I would guess LOVE will also wrap a Latin comma to the start of a line if it happens to exceed the hard wrap limit? That might be undesirable for text-heavy games, like visual novels or text adventures.
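
As a rough illustration of the extra logic described here, the following Python sketch hard-wraps text by width and, when the character that would start the next line is forbidden there, carries the tail of the current line down with it. This is not LÖVE's wrapping code. The names `wrap`, `NO_LINE_START`, and the `width` callback are hypothetical, the character set is an illustrative subset, and every character is assumed to be one unit wide unless a `width` function says otherwise.

```python
NO_LINE_START = set("、。，．！？：；）」』】〉》" + ",.!?:;)")


def wrap(text, limit, width=lambda ch: 1):
    """Hard-wrap text to `limit` units without starting a line on punctuation."""
    lines = []
    current = ""
    current_w = 0
    for ch in text:
        w = width(ch)
        if current and current_w + w > limit:
            split = len(current)
            if ch in NO_LINE_START:
                # Walk back to the last character that may start a line,
                # so it can be carried down together with `ch`.
                split = len(current) - 1
                while split > 0 and current[split] in NO_LINE_START:
                    split -= 1
                if split == 0:
                    # Nothing usable to carry: fall back to a plain break.
                    split = len(current)
            lines.append(current[:split])
            current = current[split:]
            current_w = sum(width(c) for c in current)
        current += ch
        current_w += w
    if current:
        lines.append(current)
    return lines


text = "今日は晴れです。明日は雨です。"
for line in wrap(text, 7):
    print(line)
```

With a limit of 7, a plain width-only wrap would put each 。 at the start of a line; this version moves the preceding す down instead. With variable glyph widths a carried line can slightly exceed the limit, which a real implementation would need to handle.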

slime73 commented 8 years ago

Original comment by Alex Szpakowski (Bitbucket: slime73, GitHub: slime73).


Ideally I'd like to keep LÖVE's wrapping code as simple as possible; it's already fairly complex for what it is. I don't have any desire to change it further myself, but I'd accept pull requests that make it better without making it much more complex.

slime73 commented 8 years ago

Original comment by Gabe Stilez (Bitbucket: z0r8).


An alternative solution would of course be to either code it yourself, or use a lib (if one exists; if not, why not make one?)

By the way, from a quick google, not even Ren'Py, which is an engine specialized for visual novels, has specialized word wrapping implemented by default; it only wraps at spaces/newlines.

slime73 commented 8 years ago

Original comment by David Frank (Bitbucket: bitinn, GitHub: bitinn).


For the record, I think Ren'Py does support that. It would be pretty crazy if it didn't (most engines made for VNs do support it).

https://www.renpy.org/doc/html/style_properties.html#style-property-language

I don't think this will be simple, but it's certainly doable; you can see how browsers do it here:

https://drafts.csswg.org/css-text-3/#line-breaking

I am not that fluent in C, but can this be a part of LOVE?

https://luapower.com/libunibreak

slime73 commented 8 years ago

Original comment by David Frank (Bitbucket: bitinn, GitHub: bitinn).


I just realized that if I do printf wrapping with English text and create a typing effect similar to those found in visual novels, a longer word at the end of a line is first displayed on that line and then jumps to the next line once it no longer fits.

I guess those visual novel engines spend quite a bit more effort on text layout than I originally expected...
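
One common way around the jumping word is to wrap the complete text once and then reveal characters from the already-wrapped lines, so the line breaks never move. The sketch below shows the idea in Python, using the standard `textwrap` module as a stand-in for the engine's wrapping. The `reveal` helper is hypothetical and it counts characters, not pixels.

```python
import textwrap


def reveal(lines, n):
    """Return the first n characters of pre-wrapped lines, keeping breaks fixed."""
    shown = []
    remaining = n
    for line in lines:
        if remaining <= 0:
            break
        shown.append(line[:remaining])
        remaining -= len(line)
    return shown


# Wrap the full sentence up front, then type it out step by step.
lines = textwrap.wrap("The quick brown fox jumps", width=10)
for n in (9, 12, 23):
    print(reveal(lines, n))
```

Because "brown" is already placed on the second line before any of it is shown, it starts there from its first letter instead of appearing on the first line and moving later.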

slime73 commented 8 years ago

Original comment by Bart van Strien (Bitbucket: bartbes, GitHub: bartbes).


Currently word wrapping is consistently and language-agnosticly (that's a word now) poor. I don't see a reason to specifically support any specific languages at this point, but as @slime73 said, pull requests are welcome.

slime73 commented 8 years ago

Original comment by airstruck (Bitbucket: airstruck, GitHub: airstruck).


Here's 0.9.x-style text wrapping in Lua if anyone needs it. There's no hard wrapping; each line gets its own 'length' field. It does not condense spaces (like 0.10.x behavior; 0.9.x condensed spaces).

https://gist.github.com/airstruck/d84855ced90bc96c7c31b95bf0fa833a