bertfrees closed this 5 years ago
This pass2 rule in sbs-special.mod might have something to do with it:
# Kürzungsverbot entfernen
pass2 @a ?
@egli Could you find out why this rule is there, and if possible replace it with something else?
Ah, this is probably because we add this "Kürzungsverbot" in the XSLT in some places to inhibit contraction.
Oh I see. And in the XSLT that special sign is U+250A, right? In sbs-special.cti there is a rule that says
letter \x250A a
The problem is that in compileTranslationTable.c there is a rule that says
space \x00A0 a
and this is also what I rely on. Virtual dot "a" should really be reserved for NBSP, otherwise things break.
So it seems the solution is simply to find another virtual dot pattern for the Kürzungsverbot.
Yes, that probably makes sense. I seem to remember that Christian Waldvogel complained that there weren't enough virtual dots. I'll have to look at it with him or Mischa.
There are plenty of virtual dot patterns. With 6 virtual dots (9, a, b, c, d and e) and 8 regular dots, any pattern that contains at least one virtual dot qualifies, which means (2^6 - 1) * 2^8 = 16128 virtual dot patterns.
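As a quick sanity check of that figure, here is a small sketch that counts the patterns two ways: with the closed formula and by brute-force enumeration. It assumes, as the formula implies, that a "virtual dot pattern" is any combination of the 8 regular dots plus at least one of the 6 virtual dots; the function names are made up for illustration and are not part of liblouis.

```python
from itertools import product

VIRTUAL_DOTS = ["9", "a", "b", "c", "d", "e"]      # the 6 virtual dots named above
REGULAR_DOTS = ["1", "2", "3", "4", "5", "6", "7", "8"]


def count_by_formula(n_virtual=len(VIRTUAL_DOTS), n_regular=len(REGULAR_DOTS)):
    # Non-empty subsets of virtual dots times all subsets of regular dots.
    return (2 ** n_virtual - 1) * 2 ** n_regular


def count_by_enumeration():
    # Each dot is either present (1) or absent (0) in a pattern.
    count = 0
    n = len(REGULAR_DOTS) + len(VIRTUAL_DOTS)
    for bits in product((0, 1), repeat=n):
        virtual_bits = bits[len(REGULAR_DOTS):]
        if any(virtual_bits):  # keep only patterns with at least one virtual dot
            count += 1
    return count


if __name__ == "__main__":
    print(count_by_formula(), count_by_enumeration())
```

Both counts agree, so picking a pattern other than plain "a" for the Kürzungsverbot should not be a problem of scarcity.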
This should be fixed in the pipeline2 branch.
I couldn't build it and had to add a fixup (81e955f); I hope I got it right. The white space problem seems to be solved though. Thanks! Will you release a new version soon?
@egli The problem is the following. The mechanism I use for preserving white space is based on segmentation of the input, replacing significant white space segments with a NBSP character, tracing of these NBSP segments in the output back to the input, and restoring the white space segments if needed. But the accuracy of this mechanism relies on the liblouis table. The table should preserve NBSP characters (unless it has a good reason to delete them), and in addition it needs to support the segmentation (i.e. the input/output position mapping should be accurate).
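To make the mechanism concrete, here is a rough sketch of the idea in Python. It is not the actual pipeline code: `toy_translate` and `lossy_translate` are hypothetical stand-ins for the liblouis translation (they just upper-case the text and return an output-to-input position mapping), and the sketch treats every white space run as significant.

```python
import re

NBSP = "\u00a0"


def toy_translate(text):
    """Hypothetical translator: upper-cases text and maps each output
    character back to the index of the input character it came from."""
    out, out_to_in = [], []
    for i, ch in enumerate(text):
        for c in ch.upper():
            out.append(c)
            out_to_in.append(i)
    return "".join(out), out_to_in


def lossy_translate(text):
    """Hypothetical translator whose table deletes NBSP characters."""
    return toy_translate(text.replace(NBSP, ""))


def translate_preserving_whitespace(text, translate=toy_translate):
    # 1. Segment the input into white space and non-white-space runs.
    segments = [s for s in re.split(r"(\s+)", text) if s]
    # 2. Replace each white space segment with one NBSP and remember
    #    the original segment by its position in the translator input.
    pieces, saved, pos = [], {}, 0
    for seg in segments:
        if seg.isspace():
            saved[pos] = seg
            pieces.append(NBSP)
            pos += 1
        else:
            pieces.append(seg)
            pos += len(seg)
    out, out_to_in = translate("".join(pieces))
    # 3. Trace NBSPs in the output back to the input and restore them.
    restored = []
    for ch, in_pos in zip(out, out_to_in):
        if ch == NBSP and in_pos in saved:
            restored.append(saved[in_pos])
        else:
            restored.append(ch)
    return "".join(restored)


if __name__ == "__main__":
    print(repr(translate_preserving_whitespace("3.  die\tMitte")))
    print(repr(translate_preserving_whitespace("3.  die\tMitte", lossy_translate)))
```

With the well-behaved stand-in the double space and the tab survive; with the lossy one they disappear, which is the kind of failure the table has to avoid.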
I have a test where I want to translate the string
3. die Mittelsenkrechte auf der Strecke ⠷⠘⠉⠙⠾ im Punkt Q.
The space after the "3." and the spaces before and after the Unicode braille string need to be preserved (they are NBSP characters, but you can't see that in this GitHub issue). As you can see in the following warning message, all of the NBSP segments were lost: