tlwg / libthai

GNU Lesser General Public License v2.1

About Line Break of Unicode 11 support #6

Open epico opened 5 years ago

epico commented 5 years ago

I believe libthai supports Thai word breaking, but after updating Pango to Unicode 11, line breaking fails the Unicode 11 test cases for Thai.

I have posted a patch to fix the Thai issue. Could you comment on or review it? URL: https://gitlab.gnome.org/GNOME/pango/merge_requests/20

thep commented 5 years ago

What's the problem exactly, please? I can't find that information in the merge request you mentioned.

epico commented 5 years ago

Please first merge the code from https://gitlab.gnome.org/GNOME/pango/merge_requests/15

Then download LineBreakTest.txt from the Unicode website: https://www.unicode.org/Public/11.0.0/ucd/auxiliary/LineBreakTest.txt

Save LineBreakTest.txt to the pango/tests subdirectory, then run ./testboundaries_ucd

It will report the failing test cases.

epico commented 5 years ago

I updated the patch again, any comments?

sirn commented 5 years ago

@epico @thep It looks like this patch completely breaks Thai word breaking. Please see the screenshots below. They are from Firefox, but I could reproduce the same behavior with pango-view as well.

https://gitlab.gnome.org/GNOME/pango/issues/413

With the Unicode 11 changes

Pango 1.44.5 on Arch Linux. Notice that lines break only at whitespace instead of at word boundaries:

[Screenshot, 2019-08-21: Thai Wikipedia page (วิกิพีเดีย สารานุกรมเสรี), with the Unicode 11 changes]

With the Unicode 11 changes reversed

I reverse-applied the GitLab patch (patch -R -p1) against the 1.44.5 source code, and lines now break properly again.

[Screenshot, 2019-08-21: Thai Wikipedia page (วิกิพีเดีย สารานุกรมเสรี), with the Unicode 11 changes reversed]

thep commented 3 years ago

@epico What's up with this issue? Do I have to change something on libthai side?

epico commented 3 years ago

Sorry, I misunderstood this issue in the past.

It seems that word breaking in Thai requires a word dictionary.

But the official Unicode test cases do not use a word dictionary.

After libthai word breaks were inserted into Pango's line breaks, the Unicode line break test cases fail.

I don't know whether it is possible to make libthai's word breaks comply with the Unicode line break test cases in LineBreakTest.txt.

Anyway, we disabled the LineBreakTest.txt test in Pango, because the libthai result seems better.

sirn commented 3 years ago

I think the issue here is that Unicode does not define Thai letters as potential line break positions, only as character breaks for word wrapping. Per https://www.unicode.org/Public/11.0.0/ucd/auxiliary/LineBreakTest.html I think it might make more sense for SA_AL (Thai) to behave like CJ_NS (Hiragana) rather than AL (Latin/Numbers). I'm not sure how one would raise this issue with the Unicode Consortium, though.

thep commented 3 years ago

AFAIK, having SA_AL (Thai) behave like AL is OK, as long as morphological analysis (like what libthai provides) is recommended, as stated in UAX#14. At the very least, not all Thai characters are non-starters.

I don't know what exactly the problem is, or whether the test still fails in the current version if it is added back. And if it does, which cases fail?
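
For readers following the class labels: in the UAX#14 default (no dictionary analysis), SA characters are resolved to AL, except combining marks (General_Category Mn or Mc), which resolve to CM; rule LB28 then forbids a break between two AL characters. A minimal Python sketch of that fallback, assuming a tiny hand-written class table (RAW_CLASS is hypothetical and covers only three code points; the real data lives in LineBreak.txt):

```python
import unicodedata

# Hypothetical, hand-written excerpt of line break classes (illustration only).
RAW_CLASS = {
    0x0023: "AL",  # NUMBER SIGN
    0x0E01: "SA",  # THAI CHARACTER KO KAI
    0x0E31: "SA",  # THAI CHARACTER MAI HAN-AKAT (a nonspacing mark)
}


def resolve_class(cp, raw):
    """Apply the UAX#14 default resolution for SA when no dictionary is used."""
    if raw == "SA":
        if unicodedata.category(chr(cp)) in ("Mn", "Mc"):
            return "CM"
        return "AL"
    return raw


def lb28_forbids_break(before, after):
    """LB28 restricted to AL: no break between two alphabetic characters."""
    return before == "AL" and after == "AL"


resolved = {cp: resolve_class(cp, raw) for cp, raw in RAW_CLASS.items()}
print(resolved[0x0E01])                                        # AL
print(lb28_forbids_break(resolved[0x0023], resolved[0x0E01]))  # True
```

Under this fallback, '#' followed by a Thai consonant is an AL AL pair, so no break is allowed between them. A dictionary-based analysis replaces that fallback for SA text and can therefore reach a different answer.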

thep commented 3 years ago

Using Unicode 13 LineBreakTest.txt, here are the failed cases in Pango's testboundaries_ucd:

# Parsing line: × 0023 × 0E01 ÷ #  × [0.3] NUMBER SIGN (AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 193 failed
#    expected: × 0023 × 0E01 ÷  
#    returned: × 0023 ÷ 0E01 ÷ 
#    comments:   × [0.3] NUMBER SIGN (AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0023 × 0308 × 0E01 ÷  #  × [0.3] NUMBER SIGN (AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 195 failed
#    expected: × 0023 × 0308 × 0E01 ÷   
#    returned: × 0023 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] NUMBER SIGN (AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 00B4 × 0E01 ÷ #  × [0.3] ACUTE ACCENT (BB) × [21.04] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 709 failed
#    expected: × 00B4 × 0E01 ÷  
#    returned: × 00B4 ÷ 0E01 ÷ 
#    comments:   × [0.3] ACUTE ACCENT (BB) × [21.04] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 00B4 × 0308 × 0E01 ÷  #  × [0.3] ACUTE ACCENT (BB) × [9.0] COMBINING DIAERESIS (CM1_CM) × [21.04] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 711 failed
#    expected: × 00B4 × 0308 × 0E01 ÷   
#    returned: × 00B4 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] ACUTE ACCENT (BB) × [9.0] COMBINING DIAERESIS (CM1_CM) × [21.04] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 000B ÷ 0308 × 0E01 ÷  #  × [0.3] <LINE TABULATION> (BK) ÷ [4.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 883 failed
#    expected: × 000B ÷ 0308 × 0E01 ÷   
#    returned: × 000B ÷ 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] <LINE TABULATION> (BK) ÷ [4.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 000D ÷ 0308 × 0E01 ÷  #  × [0.3] <CARRIAGE RETURN (CR)> (CR) ÷ [5.02] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 1399 failed
#    expected: × 000D ÷ 0308 × 0E01 ÷   
#    returned: × 000D ÷ 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] <CARRIAGE RETURN (CR)> (CR) ÷ [5.02] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 00A0 × 0E01 ÷ #  × [0.3] NO-BREAK SPACE (GL) × [12.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 1741 failed
#    expected: × 00A0 × 0E01 ÷  
#    returned: × 00A0 ÷ 0E01 ÷ 
#    comments:   × [0.3] NO-BREAK SPACE (GL) × [12.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 00A0 × 0308 × 0E01 ÷  #  × [0.3] NO-BREAK SPACE (GL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [12.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 1743 failed
#    expected: × 00A0 × 0308 × 0E01 ÷   
#    returned: × 00A0 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] NO-BREAK SPACE (GL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [12.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 002C × 0E01 ÷ #  × [0.3] COMMA (IS) × [29.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 2945 failed
#    expected: × 002C × 0E01 ÷  
#    returned: × 002C ÷ 0E01 ÷ 
#    comments:   × [0.3] COMMA (IS) × [29.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 002C × 0308 × 0E01 ÷  #  × [0.3] COMMA (IS) × [9.0] COMBINING DIAERESIS (CM1_CM) × [29.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 2947 failed
#    expected: × 002C × 0308 × 0E01 ÷   
#    returned: × 002C × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] COMMA (IS) × [9.0] COMBINING DIAERESIS (CM1_CM) × [29.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 000A ÷ 0308 × 0E01 ÷  #  × [0.3] <LINE FEED (LF)> (LF) ÷ [5.03] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 3635 failed
#    expected: × 000A ÷ 0308 × 0E01 ÷   
#    returned: × 000A ÷ 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] <LINE FEED (LF)> (LF) ÷ [5.03] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0085 ÷ 0308 × 0E01 ÷  #  × [0.3] <NEXT LINE (NEL)> (NL) ÷ [5.04] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 3807 failed
#    expected: × 0085 ÷ 0308 × 0E01 ÷   
#    returned: × 0085 ÷ 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] <NEXT LINE (NEL)> (NL) ÷ [5.04] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0030 × 0E01 ÷ #  × [0.3] DIGIT ZERO (NU) × [23.03] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4149 failed
#    expected: × 0030 × 0E01 ÷  
#    returned: × 0030 ÷ 0E01 ÷ 
#    comments:   × [0.3] DIGIT ZERO (NU) × [23.03] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0030 × 0308 × 0E01 ÷  #  × [0.3] DIGIT ZERO (NU) × [9.0] COMBINING DIAERESIS (CM1_CM) × [23.03] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4151 failed
#    expected: × 0030 × 0308 × 0E01 ÷   
#    returned: × 0030 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] DIGIT ZERO (NU) × [9.0] COMBINING DIAERESIS (CM1_CM) × [23.03] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 2329 × 0E01 ÷ #  × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4321 failed
#    expected: × 2329 × 0E01 ÷  
#    returned: × 2329 ÷ 0E01 ÷ 
#    comments:   × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 2329 × 0020 × 0E01 ÷  #  × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [7.01] SPACE (SP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4322 failed
#    expected: × 2329 × 0020 × 0E01 ÷   
#    returned: × 2329 × 0020 ÷ 0E01 ÷ 
#    comments:   × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [7.01] SPACE (SP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 2329 × 0308 × 0E01 ÷  #  × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [9.0] COMBINING DIAERESIS (CM1_CM) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4323 failed
#    expected: × 2329 × 0308 × 0E01 ÷   
#    returned: × 2329 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [9.0] COMBINING DIAERESIS (CM1_CM) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 2329 × 0308 × 0020 × 0E01 ÷   #  × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [9.0] COMBINING DIAERESIS (CM1_CM) × [7.01] SPACE (SP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4324 failed
#    expected: × 2329 × 0308 × 0020 × 0E01 ÷    
#    returned: × 2329 × 0308 × 0020 ÷ 0E01 ÷ 
#    comments:   × [0.3] LEFT-POINTING ANGLE BRACKET (OP) × [9.0] COMBINING DIAERESIS (CM1_CM) × [7.01] SPACE (SP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0025 × 0E01 ÷ #  × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4493 failed
#    expected: × 0025 × 0E01 ÷  
#    returned: × 0025 ÷ 0E01 ÷ 
#    comments:   × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0025 × 0308 × 0E01 ÷  #  × [0.3] PERCENT SIGN (PO) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4495 failed
#    expected: × 0025 × 0308 × 0E01 ÷   
#    returned: × 0025 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] PERCENT SIGN (PO) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0024 × 0E01 ÷ #  × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4665 failed
#    expected: × 0024 × 0E01 ÷  
#    returned: × 0024 ÷ 0E01 ÷ 
#    comments:   × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0024 × 0308 × 0E01 ÷  #  × [0.3] DOLLAR SIGN (PR) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4667 failed
#    expected: × 0024 × 0308 × 0E01 ÷   
#    returned: × 0024 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] DOLLAR SIGN (PR) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0022 × 0308 × 0E01 ÷  #  × [0.3] QUOTATION MARK (QU) × [9.0] COMBINING DIAERESIS (CM1_CM) × [19.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4839 failed
#    expected: × 0022 × 0308 × 0E01 ÷   
#    returned: × 0022 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] QUOTATION MARK (QU) × [9.0] COMBINING DIAERESIS (CM1_CM) × [19.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0020 ÷ 0308 × 0E01 ÷  #  × [0.3] SPACE (SP) ÷ [18.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 5011 failed
#    expected: × 0020 ÷ 0308 × 0E01 ÷   
#    returned: × 0020 ÷ 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] SPACE (SP) ÷ [18.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 2060 × 0E01 ÷ #  × [0.3] WORD JOINER (WJ) × [11.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 5353 failed
#    expected: × 2060 × 0E01 ÷  
#    returned: × 2060 ÷ 0E01 ÷ 
#    comments:   × [0.3] WORD JOINER (WJ) × [11.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 2060 × 0308 × 0E01 ÷  #  × [0.3] WORD JOINER (WJ) × [9.0] COMBINING DIAERESIS (CM1_CM) × [11.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 5355 failed
#    expected: × 2060 × 0308 × 0E01 ÷   
#    returned: × 2060 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] WORD JOINER (WJ) × [9.0] COMBINING DIAERESIS (CM1_CM) × [11.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 200B ÷ 0308 × 0E01 ÷  #  × [0.3] ZERO WIDTH SPACE (ZW) ÷ [8.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 5527 failed
#    expected: × 200B ÷ 0308 × 0E01 ÷   
#    returned: × 200B ÷ 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] ZERO WIDTH SPACE (ZW) ÷ [8.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0029 × 0E01 ÷ #  × [0.3] RIGHT PARENTHESIS (CP_CP30) × [30.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6213 failed
#    expected: × 0029 × 0E01 ÷  
#    returned: × 0029 ÷ 0E01 ÷ 
#    comments:   × [0.3] RIGHT PARENTHESIS (CP_CP30) × [30.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0029 × 0308 × 0E01 ÷  #  × [0.3] RIGHT PARENTHESIS (CP_CP30) × [9.0] COMBINING DIAERESIS (CM1_CM) × [30.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6215 failed
#    expected: × 0029 × 0308 × 0E01 ÷   
#    returned: × 0029 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] RIGHT PARENTHESIS (CP_CP30) × [9.0] COMBINING DIAERESIS (CM1_CM) × [30.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0028 × 0308 × 0E01 ÷  #  × [0.3] LEFT PARENTHESIS (OP_OP30) × [9.0] COMBINING DIAERESIS (CM1_CM) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6387 failed
#    expected: × 0028 × 0308 × 0E01 ÷   
#    returned: × 0028 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] LEFT PARENTHESIS (OP_OP30) × [9.0] COMBINING DIAERESIS (CM1_CM) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0028 × 0308 × 0020 × 0E01 ÷   #  × [0.3] LEFT PARENTHESIS (OP_OP30) × [9.0] COMBINING DIAERESIS (CM1_CM) × [7.01] SPACE (SP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6388 failed
#    expected: × 0028 × 0308 × 0020 × 0E01 ÷    
#    returned: × 0028 × 0308 × 0020 ÷ 0E01 ÷ 
#    comments:   × [0.3] LEFT PARENTHESIS (OP_OP30) × [9.0] COMBINING DIAERESIS (CM1_CM) × [7.01] SPACE (SP) × [14.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0001 × 0E01 ÷ #  × [0.3] <START OF HEADING> (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6557 failed
#    expected: × 0001 × 0E01 ÷  
#    returned: × 0001 ÷ 0E01 ÷ 
#    comments:   × [0.3] <START OF HEADING> (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0001 × 0308 × 0E01 ÷  #  × [0.3] <START OF HEADING> (CM1_CM) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6559 failed
#    expected: × 0001 × 0308 × 0E01 ÷   
#    returned: × 0001 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] <START OF HEADING> (CM1_CM) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 200D × 0E01 ÷ #  × [0.3] ZERO WIDTH JOINER (ZWJ_O_ZWJ_CM) × [8.1] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6729 failed
#    expected: × 200D × 0E01 ÷  
#    returned: × 200D ÷ 0E01 ÷ 
#    comments:   × [0.3] ZERO WIDTH JOINER (ZWJ_O_ZWJ_CM) × [8.1] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 200D × 0308 × 0E01 ÷  #  × [0.3] ZERO WIDTH JOINER (ZWJ_O_ZWJ_CM) × [8.1] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6731 failed
#    expected: × 200D × 0308 × 0E01 ÷   
#    returned: × 200D × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] ZERO WIDTH JOINER (ZWJ_O_ZWJ_CM) × [8.1] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 00A7 × 0E01 ÷ #  × [0.3] SECTION SIGN (AI_AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6901 failed
#    expected: × 00A7 × 0E01 ÷  
#    returned: × 00A7 ÷ 0E01 ÷ 
#    comments:   × [0.3] SECTION SIGN (AI_AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 00A7 × 0308 × 0E01 ÷  #  × [0.3] SECTION SIGN (AI_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6903 failed
#    expected: × 00A7 × 0308 × 0E01 ÷   
#    returned: × 00A7 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] SECTION SIGN (AI_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 50005 × 0E01 ÷    #  × [0.3] <reserved-50005> (XX_AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7073 failed
#    expected: × 50005 × 0E01 ÷ 
#    returned: × 50005 ÷ 0E01 ÷ 
#    comments:   × [0.3] <reserved-50005> (XX_AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 50005 × 0308 × 0E01 ÷ #  × [0.3] <reserved-50005> (XX_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7075 failed
#    expected: × 50005 × 0308 × 0E01 ÷  
#    returned: × 50005 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] <reserved-50005> (XX_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0E01 × 0023 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [28.0] NUMBER SIGN (AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7081 failed
#    expected: × 0E01 × 0023 ÷  
#    returned: × 0E01 ÷ 0023 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [28.0] NUMBER SIGN (AL) ÷ [0.3]

# Parsing line: × 0E01 × 0308 × 0023 ÷  #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] NUMBER SIGN (AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7083 failed
#    expected: × 0E01 × 0308 × 0023 ÷   
#    returned: × 0E01 × 0308 ÷ 0023 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] NUMBER SIGN (AL) ÷ [0.3]

# Parsing line: × 0E01 × 0030 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7173 failed
#    expected: × 0E01 × 0030 ÷  
#    returned: × 0E01 ÷ 0030 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]

# Parsing line: × 0E01 × 0308 × 0030 ÷  #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [23.02] DIGIT ZERO (NU) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7175 failed
#    expected: × 0E01 × 0308 × 0030 ÷   
#    returned: × 0E01 × 0308 ÷ 0030 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [23.02] DIGIT ZERO (NU) ÷ [0.3]

# Parsing line: × 0E01 × 0025 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] PERCENT SIGN (PO) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7181 failed
#    expected: × 0E01 × 0025 ÷  
#    returned: × 0E01 ÷ 0025 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] PERCENT SIGN (PO) ÷ [0.3]

# Parsing line: × 0E01 × 0308 × 0025 ÷  #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.03] PERCENT SIGN (PO) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7183 failed
#    expected: × 0E01 × 0308 × 0025 ÷   
#    returned: × 0E01 × 0308 ÷ 0025 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.03] PERCENT SIGN (PO) ÷ [0.3]

# Parsing line: × 0E01 × 0024 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] DOLLAR SIGN (PR) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7185 failed
#    expected: × 0E01 × 0024 ÷  
#    returned: × 0E01 ÷ 0024 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] DOLLAR SIGN (PR) ÷ [0.3]

# Parsing line: × 0E01 × 0308 × 0024 ÷  #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.03] DOLLAR SIGN (PR) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7187 failed
#    expected: × 0E01 × 0308 × 0024 ÷   
#    returned: × 0E01 × 0308 ÷ 0024 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [24.03] DOLLAR SIGN (PR) ÷ [0.3]

# Parsing line: × 0E01 × 0028 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [30.01] LEFT PARENTHESIS (OP_OP30) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7225 failed
#    expected: × 0E01 × 0028 ÷  
#    returned: × 0E01 ÷ 0028 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [30.01] LEFT PARENTHESIS (OP_OP30) ÷ [0.3]

# Parsing line: × 0E01 × 0308 × 0028 ÷  #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [30.01] LEFT PARENTHESIS (OP_OP30) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7227 failed
#    expected: × 0E01 × 0308 × 0028 ÷   
#    returned: × 0E01 × 0308 ÷ 0028 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [30.01] LEFT PARENTHESIS (OP_OP30) ÷ [0.3]

# Parsing line: × 0E01 × 0308 × 0E01 ÷  #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7247 failed
#    expected: × 0E01 × 0308 × 0E01 ÷   
#    returned: × 0E01 × 0308 ÷ 0E01 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [9.0] COMBINING DIAERESIS (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0E01 × 0030 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7447 failed
#    expected: × 0E01 × 0030 ÷  
#    returned: × 0E01 ÷ 0030 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]

# Parsing line: × 0024 × 0E01 ÷ #  × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7449 failed
#    expected: × 0024 × 0E01 ÷  
#    returned: × 0024 ÷ 0E01 ÷ 
#    comments:   × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0025 × 0E01 ÷ #  × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7450 failed
#    expected: × 0025 × 0E01 ÷  
#    returned: × 0025 ÷ 0E01 ÷ 
#    comments:   × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
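
To make the pattern in these failures easier to see, here is a small Python sketch that parses the `expected` and `returned` sequences from the log and reports where they disagree. The function names are hypothetical and not part of Pango's test harness; the input is assumed to use the same `× 0023 × 0E01 ÷` notation as the log.

```python
BREAK = "÷"     # break opportunity
NO_BREAK = "×"  # break prohibited


def parse_break_line(line):
    """Parse '× 0023 × 0E01 ÷' into ([0x23, 0xE01], [False, False, True]).

    flags[i] says whether a break is allowed before code point i;
    the final flag describes the end of the text.
    """
    tokens = line.split("#", 1)[0].split()  # drop any trailing comment
    markers, hexes = tokens[0::2], tokens[1::2]
    if len(markers) != len(hexes) + 1:
        raise ValueError("malformed test line: %r" % line)
    if any(m not in (BREAK, NO_BREAK) for m in markers):
        raise ValueError("unexpected marker in: %r" % line)
    return [int(h, 16) for h in hexes], [m == BREAK for m in markers]


def mismatches(expected, returned):
    """Return the positions where the two break sequences differ."""
    exp_cps, exp_flags = parse_break_line(expected)
    ret_cps, ret_flags = parse_break_line(returned)
    if exp_cps != ret_cps:
        raise ValueError("sequences cover different code points")
    return [i for i, (e, r) in enumerate(zip(exp_flags, ret_flags)) if e != r]


print(mismatches("× 0023 × 0E01 ÷", "× 0023 ÷ 0E01 ÷"))  # [1]
```

For the first failure in the log this reports position 1, meaning the extra break opportunity sits directly before U+0E01.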

thep commented 3 years ago

With the current TIS-620-based model of LibThai, it's impossible to handle cases that involve Unicode code points outside {U+0000..U+007F, U+0E01..U+0E5B}. So, the cases we can handle for now are:

# Parsing line: × 0023 × 0E01 ÷ #  × [0.3] NUMBER SIGN (AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 193 failed
#    expected: × 0023 × 0E01 ÷  
#    returned: × 0023 ÷ 0E01 ÷ 
#    comments:   × [0.3] NUMBER SIGN (AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 002C × 0E01 ÷ #  × [0.3] COMMA (IS) × [29.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 2945 failed
#    expected: × 002C × 0E01 ÷  
#    returned: × 002C ÷ 0E01 ÷ 
#    comments:   × [0.3] COMMA (IS) × [29.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0030 × 0E01 ÷ #  × [0.3] DIGIT ZERO (NU) × [23.03] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4149 failed
#    expected: × 0030 × 0E01 ÷  
#    returned: × 0030 ÷ 0E01 ÷ 
#    comments:   × [0.3] DIGIT ZERO (NU) × [23.03] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0025 × 0E01 ÷ #  × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4493 failed
#    expected: × 0025 × 0E01 ÷  
#    returned: × 0025 ÷ 0E01 ÷ 
#    comments:   × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0024 × 0E01 ÷ #  × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 4665 failed
#    expected: × 0024 × 0E01 ÷  
#    returned: × 0024 ÷ 0E01 ÷ 
#    comments:   × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0029 × 0E01 ÷ #  × [0.3] RIGHT PARENTHESIS (CP_CP30) × [30.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6213 failed
#    expected: × 0029 × 0E01 ÷  
#    returned: × 0029 ÷ 0E01 ÷ 
#    comments:   × [0.3] RIGHT PARENTHESIS (CP_CP30) × [30.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0001 × 0E01 ÷ #  × [0.3] <START OF HEADING> (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 6557 failed
#    expected: × 0001 × 0E01 ÷  
#    returned: × 0001 ÷ 0E01 ÷ 
#    comments:   × [0.3] <START OF HEADING> (CM1_CM) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0E01 × 0023 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [28.0] NUMBER SIGN (AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7081 failed
#    expected: × 0E01 × 0023 ÷  
#    returned: × 0E01 ÷ 0023 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [28.0] NUMBER SIGN (AL) ÷ [0.3]

# Parsing line: × 0E01 × 0030 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7173 failed
#    expected: × 0E01 × 0030 ÷  
#    returned: × 0E01 ÷ 0030 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]

# Parsing line: × 0E01 × 0025 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] PERCENT SIGN (PO) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7181 failed
#    expected: × 0E01 × 0025 ÷  
#    returned: × 0E01 ÷ 0025 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] PERCENT SIGN (PO) ÷ [0.3]

# Parsing line: × 0E01 × 0024 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] DOLLAR SIGN (PR) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7185 failed
#    expected: × 0E01 × 0024 ÷  
#    returned: × 0E01 ÷ 0024 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [24.03] DOLLAR SIGN (PR) ÷ [0.3]

# Parsing line: × 0E01 × 0028 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [30.01] LEFT PARENTHESIS (OP_OP30) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7225 failed
#    expected: × 0E01 × 0028 ÷  
#    returned: × 0E01 ÷ 0028 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [30.01] LEFT PARENTHESIS (OP_OP30) ÷ [0.3]

# Parsing line: × 0E01 × 0030 ÷ #  × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7447 failed
#    expected: × 0E01 × 0030 ÷  
#    returned: × 0E01 ÷ 0030 ÷ 
#    comments:   × [0.3] THAI CHARACTER KO KAI (SA_AL) × [23.02] DIGIT ZERO (NU) ÷ [0.3]

# Parsing line: × 0024 × 0E01 ÷ #  × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7449 failed
#    expected: × 0024 × 0E01 ÷  
#    returned: × 0024 ÷ 0E01 ÷ 
#    comments:   × [0.3] DOLLAR SIGN (PR) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

# Parsing line: × 0025 × 0E01 ÷ #  × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 7450 failed
#    expected: × 0025 × 0E01 ÷  
#    returned: × 0025 ÷ 0E01 ÷ 
#    comments:   × [0.3] PERCENT SIGN (PO) × [24.02] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
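
The selection above follows from the stated code point range. A rough Python sketch of that filter, assuming the range {U+0000..U+007F, U+0E01..U+0E5B} exactly as given in this comment (the helper names are made up for illustration and are not LibThai API):

```python
def in_tis620_model(cp):
    """True if the code point lies in the range quoted in this comment."""
    return 0x0000 <= cp <= 0x007F or 0x0E01 <= cp <= 0x0E5B


def handleable(sequence):
    """Check a '× 0023 × 0E01 ÷' style sequence against that range."""
    cps = [int(tok, 16) for tok in sequence.split() if tok not in ("×", "÷")]
    return all(in_tis620_model(cp) for cp in cps)


print(handleable("× 0023 × 0E01 ÷"))         # True
print(handleable("× 0023 × 0308 × 0E01 ÷"))  # False: U+0308 is out of range
print(handleable("× 00A0 × 0E01 ÷"))         # False: U+00A0 is out of range
```

This matches the list above: every case containing U+0308 or another code point outside the two ranges drops out.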

thep commented 3 years ago

# Parsing line: × 0023 × 0E01 ÷ #  × [0.3] NUMBER SIGN (AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]
# /home/thep/vcs/gnome_gitlab/pango/tests/LineBreakTest.txt: line 193 failed
#    expected: × 0023 × 0E01 ÷  
#    returned: × 0023 ÷ 0E01 ÷ 
#    comments:   × [0.3] NUMBER SIGN (AL) × [28.0] THAI CHARACTER KO KAI (SA_AL) ÷ [0.3]

While I can imagine a reason for prohibiting a break between '#' and a Thai letter, probably for hashtags, I think allowing a break between AL and SA_AL makes sense in general, as a result of morphological analysis for the SA class. For example:

Helloสวัสดีชาวโลก

could be analyzed as:

Hello|สวัสดี|ชาว|โลก

As stated in UAX#14:

Therefore complex context analysis, often involving dictionary lookup of some form, is required to determine non-emergency line breaks. If such analysis is not available, it is recommended to treat them as AL.

If morphological analysis is not available, treating Thai letters as AL might be OK. But when LibThai analysis is available to Pango, I think the added break opportunities should not be considered a violation of UAX#14 either.

Unless '#' is classified differently from AL, I think fixing this case would result in undesirable side-effects.
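
The Hello example can be reproduced mechanically once break positions are known. In this Python sketch the positions are hard-coded assumptions (code point offsets 5, 11 and 14), standing in for what a dictionary-based segmenter would report; no LibThai call is made:

```python
def mark_breaks(text, breaks, sep="|"):
    """Insert sep at each break offset (offsets count code points)."""
    pieces = []
    start = 0
    for pos in sorted(breaks):
        pieces.append(text[start:pos])
        start = pos
    pieces.append(text[start:])
    return sep.join(pieces)


text = "Helloสวัสดีชาวโลก"
assumed_breaks = [5, 11, 14]  # assumption: offsets a segmenter might return
print(mark_breaks(text, assumed_breaks))  # Hello|สวัสดี|ชาว|โลก
```

The break at offset 5 is the one at issue here: it falls between a Latin letter (AL) and a Thai letter (SA), which is exactly the kind of boundary the failing test cases exercise.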

epico commented 3 years ago

I think the failing test cases occur at the boundaries between ASCII characters and Thai characters.

The reference test cases are produced without dictionary lookup. Maybe UAX#14 allows different implementations to produce different test results.

Anyway, I am not a native speaker. If you think the current libthai behavior is correct, I am okay with disabling the line break test cases for now.

thep commented 3 years ago

I've been checking UAX#14 compliance and fixing incompatibilities in the uax14 branch. In the end, I don't think all of the failing cases will be fixed; I'm just trying to fix as many as I find appropriate.

epico commented 3 years ago

Thanks, I think it is good to reduce the number of failed test cases.