steven0lisa / flying-saucer

Automatically exported from code.google.com/p/flying-saucer

Line breaking in non-white space scripts like CJK [R9 deferred] #36

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
*** This issue was imported from http://java.net/jira/browse/XHTMLRENDERER-141

It was reported by andrei3k on 10.11.2006 11:51:49 +0100 and last updated in 
the previous bug tracker on 27.01.2008 18:51:18 +0100

Found in
Operating System: All
Platform: All

The priority for this issue at migration was Blocker.

Original description: 
Overview:
"Chinese and Japanese scripts do not delimit words with spaces, and wrap on a
character-by-character basis. There are, however, some rules (called kinsoku
rules in Japan) that forbid certain characters (mostly final punctuation) from
appearing at the beginning of a line, and others that forbid certain characters
appearing at the end of a line."
cite:
http://www.w3.org/International/tutorials/css3-text/
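
To make the quoted rule concrete, here is a rough sketch of a kinsoku-style check between two adjacent characters. It is not Flying Saucer code: the class name, the method name and the two character sets are assumptions for illustration, and the sets are a small subset of what real kinsoku tables list. The first set holds closing punctuation such as 、 。 ） 」 that should not start a line; the second holds opening brackets such as （ 「 that should not end one.

```java
public class KinsokuSketch {

    // Illustrative subset only: characters that should not start a line.
    // Ideographic comma and full stop, fullwidth comma and full stop,
    // fullwidth right parenthesis, closing corner/lenticular brackets,
    // fullwidth exclamation and question marks.
    private static final String NOT_AT_LINE_START =
            "\u3001\u3002\uFF0C\uFF0E\uFF09\u300D\u300F\u3011\uFF01\uFF1F";

    // Illustrative subset only: characters that should not end a line.
    // Fullwidth left parenthesis and opening corner/lenticular brackets.
    private static final String NOT_AT_LINE_END =
            "\uFF08\u300C\u300E\u3010";

    /**
     * Returns true if a line break may be placed between prev and next,
     * i.e. prev may end a line and next may start one.
     */
    public static boolean canBreakBetween(char prev, char next) {
        if (NOT_AT_LINE_END.indexOf(prev) >= 0) {
            return false; // e.g. an opening bracket would be stranded at line end
        }
        if (NOT_AT_LINE_START.indexOf(next) >= 0) {
            return false; // e.g. a full stop would be pushed to the next line
        }
        return true;
    }
}
```

A wrapper built on this would treat every position between two CJK characters as a candidate break and skip those where `canBreakBetween` returns false.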

Text with no whitespace doesn't offer any line-breaking opportunities to the
current FS text-breaking implementation.

For testing:
http://www.icareus.net/temp/xhtml/simple_template_example.xhtml
The test document contains both Latin and Chinese text.

http://www.bbc.co.uk/worldservice/languages/
A source of foreign-language text.

Results in FS:
Chinese text is not wrapped except where some Latin text with whitespace
comes in between. The same goes for overly long Latin text without any whitespace.

Expected Results:
Chinese text is wrapped according to the kinsoku rules. Overly long Latin text
without any whitespace is still not wrapped: it is kept on one line and
overflows when it exceeds the parent box width.

Version:
R7 pre-1

Original issue reported on code.google.com by pdoubl...@gmail.com on 16 Feb 2011 at 9:47

GoogleCodeExporter commented 9 years ago
pdoubleya wrote on 27.01.2008 18:51:18 +0100:
This probably can be solved by using the line-break tools packaged in the JDK
which already encode the line-break rules for different character sets.
Deferring to R9.

Original comment by pdoubl...@gmail.com on 16 Feb 2011 at 9:47
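
The "line-break tools packaged in the JDK" most likely refers to `java.text.BreakIterator`, though the comment does not name a class. The following is a minimal sketch of how its line instance could drive a greedy wrap. The class `LineBreakSketch`, its method names and the character-count width are assumptions for illustration; a renderer like FS would measure widths with font metrics, and this is not the attached Breaker.java.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LineBreakSketch {

    /** Returns every offset (excluding 0) where the JDK allows a line break. */
    public static List<Integer> breakOffsets(String text, Locale locale) {
        BreakIterator it = BreakIterator.getLineInstance(locale);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        it.first();
        for (int pos = it.next(); pos != BreakIterator.DONE; pos = it.next()) {
            offsets.add(pos);
        }
        return offsets;
    }

    /**
     * Greedy wrap: each line takes as many break-delimited segments as fit
     * in maxChars. Width is counted in chars here (an assumption).
     * A segment with no internal break opportunity is never split, so it
     * overflows, as the issue expects for long unbroken Latin text.
     */
    public static List<String> wrap(String text, int maxChars, Locale locale) {
        List<String> lines = new ArrayList<>();
        int lineStart = 0;
        int lastFit = 0; // last break offset seen so far
        for (int pos : breakOffsets(text, locale)) {
            if (pos - lineStart > maxChars && lastFit > lineStart) {
                lines.add(text.substring(lineStart, lastFit));
                lineStart = lastFit;
            }
            lastFit = pos;
        }
        if (lineStart < text.length()) {
            lines.add(text.substring(lineStart));
        }
        return lines;
    }
}
```

For example, `LineBreakSketch.wrap("hello world foo", 7, Locale.ENGLISH)` yields the lines `"hello "`, `"world "` and `"foo"`. The iterator keeps trailing spaces with the preceding word. Which break opportunities it reports inside CJK text depends on the JDK's rule tables, so it would need checking against the test documents above.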

GoogleCodeExporter commented 9 years ago
Replacing Breaker.java with the attached modified version and recompiling
fixes the CJK line-break problem.

Original comment by rae.w...@gmail.com on 4 Aug 2012 at 10:35

Attachments: