Closed: lyndondrake closed this issue 1 year ago
Works for me, with FreeSerif (although this font misplaces some cantillation marks) and Arial. Please provide a minimal example. Which version are you using?
As to auto-detecting the paragraph direction, it's not usually a good idea when there is explicit markup, only with plain text (the case of Emacs, whose bidi algorithm I studied for babel). Not that I reject the idea, but it's not trivial. See for example Additional Requirements for Bidi in HTML & CSS.
Sorry about that - turns out if I copy and paste from Logos, it works. Copy/paste from Accordance doesn't. So there must be some extra invisible characters in the Accordance export :-( Apologies for the non-issue, but your confirmation that it worked for you made me try something different. I'm super impressed with the overall workability of the automatic switching!
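One way to check a pasted string for such invisible characters is to list every code point in the Unicode format (Cf) or control (Cc) categories. This is only a minimal sketch: the helper name `find_invisible` and the sample string are made up for illustration, and which characters the Accordance export actually adds is not known from this thread.

```python
import unicodedata


def find_invisible(text):
    """Return (index, code point, name) for each format/control character.

    Hypothetical helper for illustration; newline and tab are ignored.
    """
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) in ("Cf", "Cc") and ch not in "\n\t":
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits


# Invented sample: a Hebrew word wrapped in directional marks.
sample = "\u200f\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd\u200e"
for hit in find_invisible(sample):
    print(hit)
```

Running it on the text copied from each application and comparing the output should show whether the two exports differ in anything other than the visible letters and marks.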
I'm generally working from a plain text file (either org-mode or Pandoc-flavoured Markdown). The reason is that I find it easier to write like that. With Pandoc, I can put a <div lang="he"> around the Hebrew paragraphs and they get transformed nicely.
For org-mode, it's not as obvious how to go about it. I can probably just put the LaTeX environment in and it should be carried over in the LaTeX export.
I don't know how to check the babel version, and I've attached my not-entirely-minimal test file (you could drop my font out). lualatex-hebrew-test.tex.txt lualatex-hebrew-test.pdf
I did just note that I have bidi=basic in my babel load. Is that what I should be using?
And having read through that document, could there potentially be an option provided to babel that (effectively) sets the equivalent of dir="auto" for paragraphs that don't have an explicit language environment? That way the default behaviour could be left alone, and lazy typists like me could try to avoid marking paragraphs directly.
You are welcome. I'm not sure if it's a task for babel or the converter from Org or Markdown, but it's worth investigating (not in the short term, I'm afraid).
The W3C still discourages the use of dir="auto", which is left as a last resort. With LaTeX we must know beforehand how to deal with boxes and the like, and once the node list has been created it cannot be “reversed”, which means the document must be preprocessed before typesetting it. This is out of the scope of babel.
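As a rough illustration of what such a preprocessing step could look like on the Markdown side, the sketch below wraps each paragraph whose first strong character is a Hebrew letter in the <div lang="he"> markup mentioned earlier in the thread. It is not part of babel or Pandoc. The function names are hypothetical, paragraphs are assumed to be separated by blank lines, and Markdown structures such as lists, tables and code blocks are not handled.

```python
import re
import unicodedata


def starts_with_hebrew(paragraph):
    """True if the first strong bidi character is a Hebrew code point."""
    for ch in paragraph:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return False
        if bidi in ("R", "AL"):
            # Other strong RTL characters (Arabic, RLM, ...) give False.
            return unicodedata.name(ch, "").startswith("HEBREW")
    return False


def wrap_hebrew_paragraphs(markdown):
    """Wrap Hebrew-initial paragraphs in <div lang="he"> ... </div>."""
    paragraphs = re.split(r"\n\s*\n", markdown.strip())
    wrapped = []
    for para in paragraphs:
        if starts_with_hebrew(para):
            wrapped.append(f'<div lang="he">\n\n{para}\n\n</div>')
        else:
            wrapped.append(para)
    return "\n\n".join(wrapped)


# Invented sample: an English paragraph, then a Hebrew one.
source = "Hello\n\n\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd\n\nBye"
print(wrap_hebrew_paragraphs(source))
```

Because the decision is made before the file reaches the typesetter, the direction of each paragraph is already explicit by the time any boxes are built.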
The automatic language switching feature is awesome. With Hebrew, I've run into a little problem though, to do with multiple combining marks. The following works fine, although the paragraph is still set as an LTR paragraph:
But this fails:
The more verbose form works perfectly, but is much harder to read:
A number of other applications can set the second example directly (e.g. Mellel, even Emacs strangely enough!).
I'm guessing it is something to do with detecting which code points are combining marks that mark the combined character as Hebrew?
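To see which code points in a string Unicode itself treats as combining marks, the general category and canonical combining class can be listed per character. This is only a diagnostic sketch. The helper name `describe` and the sample string (shin with qamats and shin dot) are invented for illustration, and it says nothing about how babel actually classifies characters.

```python
import unicodedata


def describe(text):
    """Return (code point, category, combining class, name) per character."""
    rows = []
    for ch in text:
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.category(ch),    # "Mn" = nonspacing mark
            unicodedata.combining(ch),   # 0 for base characters
            unicodedata.name(ch, "<unnamed>"),
        ))
    return rows


# Invented sample: one Hebrew letter carrying two combining marks.
for row in describe("\u05e9\u05b8\u05c1"):
    print(row)
```

Comparing this listing for the working and the failing input should show whether the difference lies in the marks themselves or in their order.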
Very happy to test this.
Also (maybe needs to be a separate issue), there's a bidi algorithm which e.g. Emacs uses to determine whether a paragraph in an LTR document is RTL. So that second example gets marked as an RTL paragraph. I'm assuming I could do something similar with a Unicode RTL marker, perhaps?
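The paragraph-level part of that algorithm is essentially a "first strong character" rule: scan the paragraph and take the direction of the first character with bidi class L, R or AL. The sketch below is a simplified version of that rule, not the code Emacs or babel uses. It ignores the special handling of directional isolates in the full Unicode Bidirectional Algorithm, and the function name `paragraph_direction` is hypothetical.

```python
import unicodedata


def paragraph_direction(text, default="ltr"):
    """Guess paragraph direction from the first strong bidi character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return "ltr"
        if bidi in ("R", "AL"):
            return "rtl"
    # No strong character found (e.g. only digits or punctuation).
    return default


print(paragraph_direction("\u05e9\u05dc\u05d5\u05dd world"))  # Hebrew first
print(paragraph_direction("Hello \u05e9\u05dc\u05d5\u05dd"))  # Latin first
print(paragraph_direction("\u200fabc"))                       # leading RLM
```

Since U+200F RIGHT-TO-LEFT MARK has bidi class R, a leading mark is enough to make this rule report RTL, which is why such a marker can steer tools that detect direction this way.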