naiithink / latex-th

Containerized (Xe)LaTeX development environment
https://hub.docker.com/r/naiithink/latex-th
GNU General Public License v3.0

Line spacing and line breaking #1

Open psherman42 opened 9 months ago

psherman42 commented 9 months ago

Glad to see you use the polyglossia and fontspec packages, as well as this line: https://github.com/naiithink/latex-th/blob/6d67962a3b02b0c27a851a295965757677a6e4d0/_test/hello.tex#L25 Would the following help produce better line breaking? What might be a better choice for the parameters?

\XeTeXlinebreakskip = 0em minus 0.25em

How do you handle full justification, or do you prefer an un-justified look with no hyphenation?

\usepackage{ragged2e}
\setlength\RaggedRightParindent{\parindent} % default `0pt`
\RaggedRight % raggedright, raggedleft, centering

\tolerance=9999
\emergencystretch=10pt
\hyphenpenalty=10000
\exhyphenpenalty=100

What is your preference for whitespaces before ๆ and ()?

คืนค่า (Return) or คืนค่า(Return)
อื่น ๆ or อื่นๆ

If spaces are used, should they be non-breaking, like ~, to prevent orphans? I am thinking of this style guide. Other very good test cases are in the W3C Gap Analysis.

naiithink commented 9 months ago

Hi, thank you for asking.

Please give me some more time to review the links you provided before answering some of your questions in-depth.

Would this help better line breaking? What might be better choice for the parameters?

\XeTeXlinebreakskip = 0em minus 0.25em

I just tried this with several Thai documents. On its own, it does not seem to break lines within a sentence, so long sentences overflow the page. I then read the XeTeX notes, which state that Thai has additional line-breaking rules that apply only when the locale macro is used (please correct me if I am wrong), so that may be the reason to use \XeTeXlinebreaklocale. I do see the advantage of \XeTeXlinebreakskip, though, as it helps keep spacing consistent across different languages.
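
As a rough illustration of how the two settings relate, here is a minimal sketch (not the repository's hello.tex). It assumes XeLaTeX and an installed Thai-capable font; the font name and the glue values are placeholders. As far as I understand, \XeTeXlinebreakskip only sets the glue placed at the break points that \XeTeXlinebreaklocale finds, so it would have no effect unless a locale is set.

```latex
% Compile with xelatex. A minimal sketch under the assumptions stated above.
\documentclass{article}
\usepackage{fontspec}

% Assumption: any installed font with Thai glyphs; this name is a placeholder.
\setmainfont{Noto Serif Thai}

% Ask XeTeX to locate break opportunities using Thai locale rules.
\XeTeXlinebreaklocale "th"

% Glue inserted at each of those break opportunities (illustrative values).
\XeTeXlinebreakskip = 0pt plus 1pt minus 0.25pt

\begin{document}
% Thai is written without spaces between words, so without the locale
% setting TeX sees no place to break inside this sentence.
ภาษาไทยไม่เว้นวรรคระหว่างคำ ภาษาไทยไม่เว้นวรรคระหว่างคำ
ภาษาไทยไม่เว้นวรรคระหว่างคำ ภาษาไทยไม่เว้นวรรคระหว่างคำ
\end{document}
```

Commenting out the \XeTeXlinebreaklocale line while keeping \XeTeXlinebreakskip should make it easy to compare the two behaviours on the same text.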

How do you handle full justification, or do you prefer an un-justified look with no hyphenation?

\usepackage{ragged2e}
\setlength\RaggedRightParindent{\parindent} % default `0pt`
\RaggedRight % raggedright, raggedleft, centering

\tolerance=9999
\emergencystretch=10pt
\hyphenpenalty=10000
\exhyphenpenalty=100

To fully justify paragraphs:

\sloppy
\justifying

I prefer left-aligned text with no hyphenation, since it looks more natural. Formal documents, however, usually require justification on both sides, as well as indentation at the beginning of every paragraph.
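
The two looks could be kept side by side as switchable macros. This is only a sketch: \casuallayout and \formallayout are hypothetical names, and the lengths are illustrative. It relies on ragged2e's \RaggedRight, \justifying, \RaggedRightParindent, and \JustifyingParindent.

```latex
\documentclass{article}
\usepackage{ragged2e}

% Hypothetical switch: left-aligned, no paragraph indentation, no hyphenation.
\newcommand\casuallayout{%
  \setlength\RaggedRightParindent{0pt}%
  \RaggedRight
  \hyphenpenalty=10000\relax
}

% Hypothetical switch: justified on both sides, with indented paragraphs.
\newcommand\formallayout{%
  \setlength\JustifyingParindent{1.5em}% illustrative value
  \justifying
  \emergencystretch=1em\relax % extra stretch to reduce overfull lines
}

\begin{document}
\formallayout
A paragraph set in the formal layout: justified, with a first-line indent.

\casuallayout
A paragraph set in the casual layout: ragged right, without hyphenation.
\end{document}
```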

What is your preference for whitespaces before ๆ and ()?

คืนค่า (Return) or คืนค่า(Return)
อื่น ๆ or อื่นๆ

I prefer the following, where _ marks a space:

คืนค่า_(Return)_ -- always
อื่นๆ_ -- when writing casually; otherwise อื่น_ๆ_

อื่น_ๆ_ is the formal, grammatically correct way to write this mark according to orst.go.th (see this and this -- all in Thai). There has been a lot of controversy over it. The orst.go.th site provides a guideline describing the spacing format, and it recommends _ๆ_, but I do not see people adopting it.
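
One way to keep the choice in a single place is a small macro. This is a sketch, not something from the repository: \maiyamok is a hypothetical name, the font is a placeholder, and it assumes the usual tie ~ still prevents a break when \XeTeXlinebreaklocale is active.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Noto Serif Thai} % assumption: placeholder Thai-capable font
\XeTeXlinebreaklocale "th"

% Hypothetical helper: removes any space typed before it, then typesets
% the repetition mark with a non-breaking space in front (formal style).
\newcommand*\maiyamok{\unskip~ๆ}

% Casual variant with no space in front:
% \renewcommand*\maiyamok{\unskip ๆ}

\begin{document}
% The {} keeps the following space from being swallowed by the macro name.
อื่น\maiyamok{} คืนค่า~(Return)
\end{document}
```

Switching between the formal and casual forms then only needs the one \renewcommand line, rather than a search through the source.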

I saw that apple.com is currently using อื่นๆ_.


I use these GAWK scripts I wrote to format some Thai and English punctuation marks before compiling the final draft:

literals.gawk.txt thai-literals.gawk.txt

naiithink commented 9 months ago

I think the ๆ spacing problem could be solved if we could somehow alter the glyph itself so that it includes space on both sides, perhaps by making a new font face or by defining an entirely new standard.
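
Short of changing the font, a similar effect might be approximated inside XeTeX with inter-character tokens. This is a different technique from the new-font idea, and only a sketch: \maiyamokclass and the 0.25em width are assumptions, it assumes ordinary Thai letters are in character class 0, and I have not checked how it interacts with \XeTeXlinebreaklocale.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Noto Serif Thai} % assumption: placeholder Thai-capable font

% Enable XeTeX's inter-character token mechanism.
\XeTeXinterchartokenstate=1

% Put U+0E46 (ๆ) into its own character class (hypothetical class name).
\newXeTeXintercharclass\maiyamokclass
\XeTeXcharclass "0E46 = \maiyamokclass

% When ๆ directly follows a class-0 character, insert unbreakable space.
\XeTeXinterchartoks 0 \maiyamokclass = {\nobreak\hskip 0.25em\relax}
% When a class-0 character directly follows ๆ, insert ordinary space.
\XeTeXinterchartoks \maiyamokclass 0 = {\hskip 0.25em\relax}

\begin{document}
% Typed without spaces; the space around ๆ comes from the tokens above.
อื่นๆอีก
\end{document}
```

Because the tokens fire only between directly adjacent characters, text that already has a typed space next to ๆ should not receive a second one on that side.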