liblouis / liblouisutdml

An open-source library providing complete braille transcription services for xml, html and text documents
http://liblouis.io
GNU General Public License v3.0
24 stars 16 forks source link

Migrate to liblouis 3.20 #76

Closed egli closed 2 years ago

egli commented 2 years ago

Basically replace all uplow opcodes with lowercase and base

egli commented 2 years ago

I fixed the uplow in all the pagenum.cti and that seems to have helped.

But now I have a regression in some nemeth math tests where the input contains capital letters. I could just mark them as xfail, but I'd rather correct them. If I compile with VERBOSE=1 make check I get

FAIL: mathml_nemeth/mover_09
============================

../../lbu_files/nemeth.sem:34: Action or style or macro 'matrix' in column 1 not recognized
Testing MathML to Nemeth braille.
Diff: 
--- /home/egli/src/liblouisutdml/tests/mathml_nemeth/mover_09.test/expected.txt 2020-11-02 11:00:28.262405605 +0100
+++ /tmp/fooQhqVnU/output.txt   2022-01-10 16:02:45.251482429 +0100
@@ -1 +1 @@
-",A,B<$A]
+",A,B⡘ "K  $A ⡘.)
FAIL mathml_nemeth/mover_09.test (exit status: 1)

FAIL: mathml_nemeth/mover_10
============================

../../lbu_files/nemeth.sem:34: Action or style or macro 'matrix' in column 1 not recognized
Testing MathML to Nemeth braille.
Diff: 
--- /home/egli/src/liblouisutdml/tests/mathml_nemeth/mover_10.test/expected.txt 2020-11-02 11:00:28.262405605 +0100
+++ /tmp/foooIFYw6/output.txt   2022-01-10 16:02:45.283482391 +0100
@@ -1 +1 @@
-",A,B<$O]
+",A,B⡘ "K  $O ⡘.)
FAIL mathml_nemeth/mover_10.test (exit status: 1)

@bertfrees any idea?

bertfrees commented 2 years ago

It seems rules in nemeth.ctb like always ^< 126 don't work when there are capitals. I don't know why.

egli commented 2 years ago

OK, thanks @bertfrees

I guess I'll just mark the tests as XFAIL then and leave this bug for someone else to find.

bertfrees commented 2 years ago

That makes little sense I think. XFAIL is not meant to ignore bugs.

egli commented 2 years ago

I understand that. But let's be honest: we are not maintaining liblouisutdml, we are just providing live support. I sort of made it work with the newest liblouis answering to issue #75 that @sthibaul reported.

What action would you suggest? That I go ahead and try to fix the problem with always ^< 126?

Which reminds me, I have an update for viewxml.ctb that I should push.

bertfrees commented 2 years ago

This could also indicate a problem in Liblouis. For that reason I've started to investigate.

bertfrees commented 2 years ago

This happens because of the checkEmphasisChange() function which is called when checking whether an always or word rule matches and which should return 1 when there is a change of emphasis or capitalisation somewhere inside the contraction. Apparently some change between Liblouis 3.17 and 3.20 makes this function return 1 where it didn't before.

bertfrees commented 2 years ago

The regression was caused by commit https://github.com/liblouis/liblouis/commit/beec4b105f7c44a972544785e710b3467b1d5eff: here I moved some code from resolveEmphasisAllSymbols() to insertEmphasisSymbol() but didn't think about checkEmphasisChange().

egli commented 2 years ago

Hm, interesting, so you found a bug in liblouis by persisting

sthibaul commented 2 years ago

Don't we also need to fix lbu_files/viewxml.ctb which still contains uplow?

egli commented 2 years ago

Don't we also need to fix lbu_files/viewxml.ctb which still contains uplow?

Yes probably. It doesn't seem to bother the test suite and I don't know what it is for, but yes this probably should be fixed.

sthibaul commented 2 years ago

Also, e.g. pagenum.cti is made to use only lowercase, but latinLetterDef8Dots.uti is made to use both lowercase and uppercase. Do we need to use the latter or the former approach?

egli commented 2 years ago

I basically just made the test work again. The pagenum stuff only seems to exercise lowercase chars. And the latinLetterDef8Dots.uti seems to be part of some Java binding test suite. I don't think that this is even run during the normal test suite. AFAIR I just copied the table from the liblouis sources.

So as far as I'm concerned the changes should be OK. They both are just made for the tests. So to answer your question I think it's OK to use different approaches for pagenum.cti and latinLetterDef8Dots.uti

sthibaul commented 2 years ago

Ok, so this should be fine for viewxml.ctb: patch.txt

egli commented 2 years ago

So as far as I'm concerned this patch could go in. Sure it masks the liblouis regression (liblouis/liblouis#1152) but utdml test suite now passes and we can create a release that will work with liblouis 3.20. Once this is fixed in liblouis there will be unexpected passes but that is OK.

So @bertfrees what do you say? Ready to merge?

bertfrees commented 2 years ago

@egli Hmm, I still don't really like the XFAILs. But if there is no other way...

egli commented 2 years ago

If we do not use XFAILS then there will be no utdml release that works with liblouis 3.20 as far as I know. The original complaint was that the test suite was failing. This was mostly due to the uplow incompatibility, but also because of the regression. The regression will be fixed by applying a patch to liblouis or waiting for liblouis 3.21. But thb I feel a bit weird releasing a utdml release that only works with a patched liblouis.

OTOH I guess it is also weird to release utdml with silent regressions. But hopefully this is a temporary situation.