kwisatz closed this issue 2 years ago
The issue here is that saving the document somehow corrupts the XFA data, so that even Adobe Reader refuses to open it because of XML errors. In PDF.js, when opening the re-saved document, the following warning is printed in the console:
Warning: XFA - Invalid utf-8 string.
Indeed, I initially thought that the UTF-8 warning would even appear when opening it for the first time (without any fields filled). Note that I can open the version filled and saved by pdf.js correctly in masterpdfeditor though.
The saved XML contains some tags whose names include an accented character (é and à). If I remove them from the PDF, everything is fine in pdf.js or Acrobat.
The serialized XML is a JS string (UTF-16), and we must encode it as UTF-8 before saving: https://github.com/mozilla/pdf.js/blob/891f21fba6db64cd602c1a9a51826d7b9cd06af0/src/core/xfa/data.js#L78
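To illustrate the direction of the conversion, here is a minimal sketch using the standard TextEncoder/TextDecoder APIs. The helper names encodeToUtf8ByteString and decodeUtf8ByteString are hypothetical and are not the pdf.js functions linked in this thread; the sketch only assumes that the saved stream is written as a "byte string" (one character per byte).

```javascript
// Hypothetical helpers for illustration only; not part of pdf.js.

// JS string (UTF-16 code units) -> byte string holding the UTF-8 encoding,
// one character (0x00-0xFF) per byte.
function encodeToUtf8ByteString(str) {
  const bytes = new TextEncoder().encode(str);
  let out = "";
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

// Byte string holding UTF-8 -> JS string. With `fatal: true` the decoder
// throws on invalid UTF-8 instead of silently substituting U+FFFD.
function decodeUtf8ByteString(byteStr) {
  const bytes = Uint8Array.from(byteStr, ch => ch.charCodeAt(0) & 0xff);
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

// A tag name with an accented character, as in the attached form.
const xml = "<é/>";
const encoded = encodeToUtf8ByteString(xml); // "é" becomes bytes 0xC3 0xA9
const roundTripped = decodeUtf8ByteString(encoded);
```

If the string is written out without the encoding step, "é" ends up as the single byte 0xE9, which is not valid UTF-8 on its own. That would be consistent with the "Invalid utf-8 string" warning above.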
@Snuffleupagus, I think this function:
https://github.com/mozilla/pdf.js/blob/891f21fba6db64cd602c1a9a51826d7b9cd06af0/src/shared/util.js#L1017
should be called utf8StringToString
and the following one stringToUTF8String
or am I wrong?
Attach (recommended) or Link to PDF file here: CIE-XFA-work.pdf
Configuration:
Steps to reproduce the problem:
What is the expected behavior? (add screenshot)
The form should render again, displaying the field with filled in value.
What went wrong? (add screenshot)
The form only renders the first time. Saving it with at least one field filled and re-opening the saved PDF fails with:
Warning: XFA Foreground documents are not supported
Additional info
Saving/downloading the form without filling in a field does not produce the error. I have also tested other PDFs (e.g. canadian-xfa-example.pdf) that do not exhibit this problem. From what I can see in the code, it appears that saving the form somehow causes the document to no longer be treated as pureXfa: https://github.com/mozilla/pdf.js/blob/c68dc03be685a5f2de5c2e99595f9bc747ffaa34/web/app.js#L1572