latex3 / hyperref

Hypertext support for LaTeX

hyperref generated PDF forms can not be imported correctly using poppler #3

Open kberry opened 8 years ago

kberry commented 8 years ago

(import from https://puszcza.gnu.org.ua/bugs/?272) or Mupdf, so there are to my knowledge no PDF viewers left on Linux that work with them. Fortunately, the word processor of LibreOffice 5.1.0 seems to produce ODFs in a format that poppler exports in a way it will later understand.

I have attached the uncompressed PDF output for comparison (LOFormWin-uncomp.pdf), so adapting hyperref's output to poppler’s suggestions would most likely fix the issue. I could not see any differences in acroread or Windows readers.

Reproducible: Always

Steps to reproduce:

  1. pdflatex .tex
  2. Open the resulting pdf in either qpdfview, okular, evince, mupdf, pdf-tools (emacs), doc-view (emacs) or inkscape (poppler-import)

Expected results: Radio buttons deselect when another one is clicked. The checkbox can be (de-)activated. Rendering should be correct for the last three (no form support).

Actual results: The second radio button is rendered as \ding{123} (all in docView). Nothing is rendered in mupdf. Clicking another radio button does not deselect the others. \ding{123} shows up when clicking the invisible radio buttons in mupdf. The checkbox cannot be activated.

I used the following minimal example to investigate the issue:

\pdfminorversion=7
\documentclass[a4paper]{article}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage[pdftex]{hyperref}

\begin{document}
\begin{Form}
\ChoiceMenu[radio,radiosymbol=\ding{108},name={myGroupOfRadiobuttons}]{Multi}{A, B}
\ChoiceMenu[radio,radiosymbol=\ding{108},name={myGroupOfRadiobuttons}]{Single}{C}
\medskip

\CheckBox[name=checkbox, checkboxsymbol=\ding{110}]{Checkbox}
\bigskip

\TextField[name=textfield,altname=textfield, width=0.6\textwidth]{TextField}
\end{Form}
\end{document}

Additional Information:

Resetting the form data in “Master PDF Editor” gets the checkbox to work everywhere and the radio buttons to work in mupdf. I attached the diff between those two as ‘bugexMPEres.diff’. Simply saving the file in “Win Acrobat Reader” has the same effect. Saving it in “Foxit Reader” replaces the \ding{123} signs, but otherwise has no effect. The identical effect is achieved by not resetting the form in MPE.

Inkscape has a nice checkbox to toggle poppler import. Enabling it leads to exactly the same appearance. Otherwise, just the text is shown, which rules out the viewers themselves as the cause of the issue (shown in lininkevi.png; left Inkscape, right evince).

The last one shows pdf-tools left, which has no form support, but at least recognizes the mid radio button as annotation, and mupdf right, which selects the buttons as \ding{123}.

Other bugs involved:

Of course poppler’s PDF import could be improved, but since its count of PDF form bugs is already quite high, filing another one would be to no avail.

LibreOffice 5.0.4 on Linux could not create an ODT that exports to working PDF forms. I created the example with LO 5.1.0 in a Windows VM. Once it is packaged on Gentoo, I’ll retest to see whether I get different results.

Mupdf might freeze when entering the text field.

pdflatex version:

pdfTeX 3.14159265-2.6-1.40.16 (TeX Live 2015)
kpathsea version 6.2.1
Copyright 2015 Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
There is NO warranty. Redistribution of this software is covered by the terms of both the pdfTeX copyright and the Lesser GNU General Public License.
For more information about these matters, see the file named COPYING and the pdfTeX source.
Primary author of pdfTeX: Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
Compiled with libpng 1.6.19+apng; using libpng 1.6.19+apng
Compiled with zlib 1.2.8; using zlib 1.2.8
Compiled with poppler version 0.32.0

texlive-2015 (meta) with texlive-latex-2015 (containing hyperref).

Best regards, Manuel Ullmann

Attachments: LOFormWin-uncomp.pdf, lininkevi, linpdtmup, bugexMPEres.diff.txt

jeffvalk commented 6 years ago

Still experiencing this exact issue with any poppler-based viewer on Linux. Library versions are as follows:

pdfTeX 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian)
kpathsea version 6.2.3
Compiled with libpng 1.6.31; using libpng 1.6.34
Compiled with zlib 1.2.11; using zlib 1.2.11
Compiled with poppler version 0.57.0

It's not limited to poppler though. Hyperref-generated radio buttons do not display correctly in the Mac OS X Preview application either. Here is a screenshot of this in Preview on macOS 10.13 (High Sierra):

(screenshot: preview)

marcvinyals commented 6 years ago

It appears that hyperref generates radio buttons that do not conform to the specification. According to the PDF 1.7 reference, section 12.7.4.2.4 "Radio Buttons" p.441, there should be one radio button field that acts as a container in addition to one radio button for each choice in the field. The choice buttons are supposed to be children of the container rather than direct elements of the form.

Currently what hyperref does (or at least the pdftex part) is to generate one radio button for each choice in the field and then, for some reason, it only adds the first button as a form field. The buttons all have the same name, and I assume that there is a repair heuristic in some PDF viewers that groups buttons by name and decides to create a radio field with them if they are not part of any field (but this is only a conjecture).

It is not too hard to fix this by hand: generate another radio button, replace the entry in the form fields array with a reference to this new button, add references to the old buttons as elements of a Kids array in the new button, and add a reference to the new button as a Parent entry in the dictionary of each of the old buttons. At least for Evince, it seems necessary to remove the Default Value and Value entries from the old buttons too. The following is an example of what I mean.

mwe.tex mwe.pdf mwe.fixed.pdf mwe.diff (abridged)
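To make the manual steps above concrete, here is a minimal sketch in Python. It does not touch a real PDF: plain dicts stand in for the PDF dictionaries and a plain list stands in for the form fields array, and the function name `group_radio_buttons` is hypothetical. It only mirrors the steps listed above, so treat it as an illustration of the intended object layout rather than a tested repair tool.

```python
RADIO_FLAG = 1 << 15  # bit position 16 of /Ff marks a radio-button field


def group_radio_buttons(form_fields, buttons, group_name):
    """Re-parent same-named radio buttons under one new container field.

    form_fields: list modelling the form fields array (modified in place).
    buttons:     dicts modelling the existing radio button dictionaries.
    group_name:  name given to the new container (assumed to be the
                 name the old buttons already share).
    """
    # Step 1: generate another radio button that acts as the container,
    # with the old buttons as elements of its Kids array.
    container = {
        "FT": "Btn",
        "Ff": RADIO_FLAG,
        "T": group_name,
        "Kids": list(buttons),
    }

    # Step 2: the form fields array should reference the container
    # instead of any of the old buttons (compared by identity).
    form_fields[:] = [
        field for field in form_fields
        if not any(field is button for button in buttons)
    ]
    form_fields.append(container)

    # Step 3: point each old button at the container and drop its
    # Value / Default Value entries (reported above as needed for Evince).
    for button in buttons:
        button["Parent"] = container
        button.pop("V", None)
        button.pop("DV", None)

    return container
```

Calling it with the list of all buttons that share a name and the fields array that lists only the first of them leaves the array holding the container, with every old button reachable through `Kids`.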

Unfortunately I do not speak TeX fluently enough to attempt a fix on my own, but I would be happy to write some pseudo-code if someone is willing to translate it.

eike-fokken commented 3 years ago

Any news on this? I'm also not fluent in tex but happy to help if anyone has some pointers.

u-fischer commented 3 years ago

@eike-fokken I will probably be able to do something about form fields in a few weeks.

eike-fokken commented 3 years ago

@u-fischer : Cool! Let me know if I can help!

nbenitez commented 3 years ago

The latest Poppler version, 21.04.0, improves support for these kinds of non-standard form elements. Nonetheless, the hyperref package should follow the standard so that it is well supported by PDF clients other than Adobe's.

u-fischer commented 3 years ago

@nbenitez I wrote a first test version to replace the checkbox code; it is part of the pdfmanagement-testphase package. I will probably release code for text fields sometime this month. Other field types will follow.

But all the new code will require the new pdfmanagement, and it will not work with older LaTeX.

hyiltiz commented 3 years ago

@u-fischer could you provide an MWE using the pdfmanagement-testphase package (and hyperref?) to create a radio button group, please? I am trying to read back a PDF created with hyperref (with just a few groups of radio buttons) using the Python library pikepdf (a wrapper around qpdf) with the snippet here [0], but for the reasons pointed out here, the values being read out seem to be just the default values and the value of the first radio button. I might be able to edit the Python snippet to work around this bug, but fixing it at the source would be a much better solution, since that fixes it for all readers, not just this snippet.
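One possible shape for such a reader-side workaround, sketched in Python under stated assumptions: the widgets are modelled as plain dicts (this is not the pikepdf API), the buttons of a group share the same `T` name as described earlier in this thread, and whichever program filled in the form has set each widget's appearance state `AS` to something other than `Off` for the selected button. The function name `selected_values` is hypothetical.

```python
def selected_values(widgets):
    """Map each field name to the appearance state of its selected widget.

    widgets: iterable of dicts modelling widget annotation dictionaries,
             each with a "T" (name) and optionally an "AS" (appearance
             state) entry. A group with no selected button maps to None.
    """
    selected = {}
    for widget in widgets:
        name = widget.get("T")
        state = widget.get("AS", "Off")
        # Make sure every group appears in the result, even if unselected.
        selected.setdefault(name, None)
        if state != "Off":
            selected[name] = state
    return selected
```

This reads the state from every same-named button rather than only from the first one listed in the form fields array, which is the part the current output makes unreliable. It remains a workaround for one reader and does not replace a fix in the generated PDF.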

u-fischer commented 3 years ago

@hyiltiz The documentation is in l3pdffield-radiobutton.pdf. The file also shows an example, and the source is in the dtx. It doesn't need hyperref. A simple example is this:

\RequirePackage{pdfmanagement-testphase}
\DeclareDocumentMetadata{uncompress,pdfversion=2.0}

\documentclass[]{article}
\usepackage{l3pdffield-testphase}
\begin{document}
 \ExplSyntaxOn
 \begin{tabular}{ccc}
  A & B & C\\
 \pdffield_radio:n{group=A,value=A}&
 \pdffield_radio:n{group=A,value=B}&
 \pdffield_radio:n{group=A,value=C}
 \end{tabular}
 \ExplSyntaxOff

\end{document}

If there is a problem, please open an issue at https://github.com/latex3/pdfresources. But be aware that I can only fix things if I get a clear indication that the output violates the specification. I will not work around deficiencies of PDF viewers or PDF libraries; such problems should be reported and resolved there.