Closed u-fischer closed 4 years ago
Thank you for catching the arguments issue. As for /S, it is there because of an issue with screen readers: otherwise, the ActualText is not read.
Can you try instead: /S MP/Span << .... which is valid PDF. (The MP operator designates a "marked-content point".) Does this trigger the screen reader for you?
Furthermore, since the content is mathematics, it would be better to use /Formula instead of /Span, and duplicate the /ActualText string also as /Alt text.
Inspecting the NVDA source code, it should be sufficient to use /Alt alone, once the validity of the PDF syntax has been sorted out. Would you please try this?
Hope this helps.
Ross Moore
(Director, TeX Users Group)
If Ross's suggestions work, it would be better to implement them without patching and changing internal commands of accsupp, as this can introduce incompatibilities with other uses of accsupp.
Dear @u-fischer, about your last comment: would this be ok?
\newcommand*{\BeginAxessible}[1]{%
\begingroup
\setkeys{ACCSUPP}{#1}%
\edef\ACCSUPP@span{%
/S/Formula<<%
\ifx\ACCSUPP@Alt\relax
\else
/Alt\ACCSUPP@Alt
\fi
\ifx\ACCSUPP@ActualText\relax
\else
/ActualText\ACCSUPP@ActualText
\fi
>>%
}%
\ACCSUPP@bdc
\ACCSUPP@space
\endgroup
}
and
\newcommand*{\EndAxessible}{%
\begingroup
\ACCSUPP@emc
\endgroup
}
The wrapper becomes:
\long\def\wrap#1{
\BeginAxessible{method=escape,ActualText=\detokenize\expandafter{#1}, Alt=\detokenize\expandafter{#1} }
#1
\EndAxessible%
}
If so, we will run some tests tomorrow and update the code accordingly.
Thanks for your suggestion!
PS. We also included @ozross's idea of having both /Alt and /ActualText. Unfortunately, the proposed suggestion /S MP/Span or /S MP/Formula does not trigger the screen reader.
That's certainly better than patching accsupp, but you are still producing invalid pdf.
I made a few short tests and can confirm the odd behaviour (also with the text-to-speech engine of adobe): the latex /Alt-Text is only read when you create a faulty pdf.
After reading the reference and running some more tests, I suspect that the /Alt text is used only if the document is fully tagged.
Dear @u-fischer, we are working on the matter. We will keep you posted. Our main goal was to make the PDF readable to visually impaired people, and that (even with the error) works. We will use this issue as a feature request (or even as the starting point of a brand new implementation). Thanks a lot!
I made a few tests with /Alt and /ActualText and text-to-speech software. A summary is here https://github.com/u-fischer/tagpdf/blob/master/source/examples/structure/ex-alt-actualtext.tex (it needs the new tagpdf.sty I uploaded yesterday to CTAN to compile, but the resulting pdfs from pdflatex and lualatex are also in the github folder).
Hi Ulrike,
On 8 Aug 2018, at 7:55 am, u-fischer wrote:
> I made a few tests with /Alt and /ActualText and text-to-speech software. A summary is here https://github.com/u-fischer/tagpdf/blob/master/source/examples/structure/ex-alt-actualtext.tex (it needs the new tagpdf.sty I uploaded yesterday to CTAN to compile, but the resulting pdfs from pdflatex and lualatex are also in the github folder).
One of the comments that I read in the source file(s) was that some math is not encoded properly. Try also using one or other (or both) of
\input glyphtounicode-cmr.tex (from the pdfx package)
and
\usepackage{mmap}
The latter package is quite old, but it provides CMap resources for OT1 and T1-encoded math fonts. I have a later version which I can send you if needed. I'm working on a further update, hopefully to be usable also with XeTeX.
These packages and files were developed specifically to overcome problems encountered when generating documents to be compliant with PDF/A, particularly for the PDF/A-2 and PDF/A-3 levels, using pdfTeX.
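As a rough sketch of how the two suggestions above could be combined in a pdfLaTeX preamble: the load order and the `\pdfgentounicode=1` line are assumptions of this sketch (the latter is the pdfTeX switch that makes glyph-name tables take effect), and loading the base `glyphtounicode.tex` table first is likewise an assumption rather than something stated above.

```latex
% compile with pdflatex
\documentclass{article}
\usepackage{mmap}              % CMap resources for OT1/T1 math fonts; loaded early (assumption)
\input glyphtounicode.tex      % base glyph-name table shipped with pdfTeX (assumption: loaded first)
\input glyphtounicode-cmr.tex  % additional entries, from the pdfx package
\pdfgentounicode=1             % let pdfTeX build ToUnicode maps from glyph names
\begin{document}
% Test formula: copy it from the resulting PDF to see what the viewer extracts.
\[\sum_{i=1}^n (2^x+2) = \sqrt{abc}\]
\end{document}
```

Either of the two mechanisms can be commented out to compare what each contributes to copy and paste.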
Hope this helps.
Ross
@ozross I know about mmap but didn't care about it for these tests. But besides this: one problem with mmap is that it doesn't work with other math fonts, e.g. newtxmath. The other is that it only improves copy&paste of symbols. E.g.
\[\sum_{i=1}^n (2^x+2) = \sqrt{abc}\]
copies as
\sum n
i=1
(2x + 2) =
\surd
abc
I don't see much use in getting it working with xelatex. One can simply use unicode-math. Then one can copy unicode symbols (but doesn't get structure either):
𝑛Σ
𝑖=1
(2𝑥 + 2) =
√
𝑎𝑏𝑐
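A minimal test document for the unicode-math case might look like the following. It is only a sketch for reproducing the copy-and-paste experiment; no font is selected explicitly, so the package's default math font is assumed.

```latex
% compile with xelatex or lualatex
\documentclass{article}
\usepackage{unicode-math}  % OpenType math fonts with Unicode-encoded glyphs
\begin{document}
% Same formula as in the mmap test; copy it from the PDF to compare.
\[\sum_{i=1}^n (2^x+2) = \sqrt{abc}\]
\end{document}
```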
Hi Ulrike,
On 08/08/2018, at 19:07, "u-fischer" wrote:
> @ozross I know about mmap but didn't care about it for these tests. But besides this: one problem with mmap is that it doesn't work with other math fonts, e.g. newtxmath.
I can produce the cmap files for this. Just as I am doing now for MathTimePro-2.
> The other is that it only improves copy&paste of symbols. E.g.
> \[\sum_{i=1}^n (2^x+2) = \sqrt{abc}\]
> copies as
> \sum n i=1 (2x + 2) = \surd abc
There are multiple modes. This one using macro names is just one of the modes.
> I don't see much use in getting it working with xelatex. One can simply use unicode-math.
People like mixing some of the features of XeTeX with legacy math fonts, such as Lucida and MathTime. There should be no reason to not do this, and such documents should be able to be made to satisfy PDF/A, and perhaps also PDF/UA.
> Then one can copy unicode symbols (but doesn't get structure either):
> 𝑛Σ 𝑖=1 (2𝑥 + 2) = √ 𝑎𝑏𝑐
Sure. Structure in math is much, much harder. It should be done using MathML tagging. But that is really hard, at present.
Cheers,
Ross
Hi again,
On 09/08/2018, at 0:09, "Ross Moore" wrote:
On 08/08/2018, at 19:07, "u-fischer" wrote:
> @ozross I know about mmap but didn't care about it for these tests. But besides this: one problem with mmap is that it doesn't work with other math fonts, e.g. newtxmath.
Presumably the glyphs are named correctly, so the automated methods of \pdfglyphtounicode should work for this, and other modern fonts, except for maybe a few characters not yet listed in glyphtounicode.tex . So mmap.sty doesn't have much to add for recent fonts.
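To illustrate the automated route, here is a hedged pdfLaTeX sketch. `\pdfglyphtounicode` and `\pdfgentounicode` are pdfTeX primitives; the glyph name `myglyphname` is purely hypothetical and stands in for any character not yet listed in glyphtounicode.tex, and loading newtxtext alongside newtxmath is an assumption of this sketch.

```latex
% compile with pdflatex
\documentclass{article}
\input{glyphtounicode}   % standard glyph-name to Unicode table
\pdfgentounicode=1       % generate ToUnicode maps from glyph names
% Hypothetical extra entry: map a glyph name missing from the table
% to a code point given in hex (here U+2211, N-ARY SUMMATION).
\pdfglyphtounicode{myglyphname}{2211}
\usepackage{newtxtext,newtxmath}
\begin{document}
\[\sum_{i=1}^n (2^x+2) = \sqrt{abc}\]
\end{document}
```

This only helps for fonts whose glyphs carry meaningful names.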
> I can produce the cmap files for this. Just as I am doing now for MathTimePro-2.
MathTimePro, on the other hand, is much older. It does not name glyphs according to the actual character, so automated methods do not work. Instead it needs a CMAP constructed specially, to correctly identify each character. And it needs a method to associate the CMAP with the font dictionary within the PDF. This is the case with any driver: pdfTeX, XeTeX, or LuaTeX as well as dvips.
> Then one can copy unicode symbols (but doesn't get structure either):
> 𝑛Σ 𝑖=1 (2𝑥 + 2) = √ 𝑎𝑏𝑐
> Sure. Structure in math is much, much harder. It should be done using MathML tagging. But that is really hard, at present.
You have seen some of my earlier work producing documents in which each symbol is tagged with both /Alt and /ActualText. It was done in such a way as to build a natural language alternative description of mathematical expressions. This requires external software (e.g. Perl scripts) to combine a MathML translation capturing the structure, with the original TeX source of each math expression. This is work that I hope to be able to return to, for supporting PDF 2.0, and PDF/UA-2.
When we have this, blind users will be able to use MathPlayer as PDF browsing software.
Cheers, (from the sky above Helsinki)
Ross
When I compile the following with pdflatex
I get a pdf which doesn't pass a preflight test due to syntax errors:
Looking in the pdf it is quite clear what is wrong: the style creates a BDC operator with three arguments instead of two:
I don't quite understand why the /S key is inserted here; this is normally a key for a StructElem object.
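For comparison, a minimal sketch of the two-operand form that BDC expects (a tag name followed by a property dictionary), written here with raw `\pdfliteral` rather than with accsupp. The ActualText string is a made-up placeholder, and the sketch only illustrates the syntax; whether a particular screen reader then reads the text aloud is a separate question.

```latex
% compile with pdflatex
\documentclass{article}
\begin{document}
% Well-formed: exactly two operands before BDC, the tag /Span and a dictionary.
Some text \pdfliteral page{/Span<</ActualText(x squared)>>BDC}%
$x^2$%
\pdfliteral page{EMC} and more text.
% Malformed (shown only as a comment): an extra name such as /S in front
% of the tag gives BDC three operands, e.g.
%   /S/Span<</ActualText(x squared)>>BDC
\end{document}
```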