galkahana / HummusJS

Node.js module for high performance creation, modification and parsing of PDF files and streams
http://www.pdfhummus.com

Fillform corrupted PDFs #207

Open DamickDoni opened 6 years ago

DamickDoni commented 6 years ago

Hi,

"fillform" works great and the output displays correctly in browser previews (Chrome, Firefox). All fields are visible.

BUT! If I open the PDF in Acrobat DC or Reader, the fields are invisible until I click on them.

I use "fillform" with a small modification (to write UTF-8 chars and raise the text by 2 pt, so the bottom is not cut off):

var font = handles.writer.getFontForFile(__dirname + '/arial.ttf');
xobjectForm.getContentContext()
    .writeFreeCode('/Tx BMC\r\n')
    .q()
    .BT()
    .writeFreeCode(da + '\r\n')
    .Ts(2)
    .Tf(font,12)
    .writeText(text,0,0,{size: 6,colorspace: 'rgb',color: 000000,font: font})
    .ET()
    .Q()
    .writeFreeCode('EMC');
handles.writer.endFormXObject(xobjectForm);

Why do the fields appear invisible until I click on them?

DamickDoni commented 6 years ago

Ok, the issue comes from writeText(XYZ). If I use .Tj(text,...) all readers show it, but the encoding doesn't work (ü is shown as a rectangle with a circle on top :-) )

galkahana commented 6 years ago

writeText is meant to be used instead of the whole BT...ET hoopla. It's a shortcut. However, if you want the Ts part, then just replace it with Tj. As for the rectangle, make sure that the font supports this char. First try using writeText/Tj in a regular context with this text... then tie it to the form code.

DamickDoni commented 6 years ago

Can you give me an example? (I don't understand what you mean by regular context.)

DamickDoni commented 6 years ago

I'm still struggling. In every browser preview, the PDF is shown correctly. But when I open the PDF in Acrobat or another PDF reader, the fields are empty until I click on them.

galkahana commented 6 years ago

You're gonna need to start debugging this. My suggestion to you: read the PDF specs, see what's wrong and fix it. You've got the code there, and you already know the code that creates the appearance stream. I'd help you, but I'm terribly busy. What you definitely can't do is call writeText in the middle of this block. Try using Tj instead of writeText: just replace writeText with Tj, and provide the text as the parameter. If you are not seeing the characters that you expect, use a different font. Arial may not support the chars that you want.

If you still suspect that something is weird with the font, just try this font outside of the form example, like in code that's used for just writing text inside a PDF.
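As a rough sketch of what testing the font "outside of the form example" could look like: the function below writes the same text twice on a blank page, once with writeText and once with an explicit BT/Tf/Tj/ET block. The function name writeFontProbe, the page size and the coordinates are illustrative assumptions, and the hummus module is passed in as an argument (for example require('hummus')) rather than loaded by the sketch itself.

```javascript
// Sketch only: writeFontProbe is a hypothetical helper, not part of HummusJS.
// `hummus` is the module object, e.g. require('hummus').
function writeFontProbe(hummus, fontPath, outPath, text) {
  const writer = hummus.createWriter(outPath);
  const font = writer.getFontForFile(fontPath);
  const page = writer.createPage(0, 0, 595, 842); // A4 in points (assumed)

  writer.startPageContentContext(page)
    // 1) the shortcut: writeText emits its own BT...ET block
    .writeText(text, 50, 780, { font: font, size: 12, colorspace: 'rgb', color: 0x000000 })
    // 2) the explicit form, as used inside the appearance stream
    .BT()
    .k(0, 0, 0, 1)              // black fill (CMYK)
    .Tf(font, 12)
    .Tm(1, 0, 0, 1, 50, 750)    // place the text at x=50, y=750
    .Tj(text)
    .ET();

  writer.writePage(page);
  writer.end();
  return outPath;
}
```

If both lines show the expected glyphs (including characters such as ü) in Acrobat, the font itself is probably fine and the problem is more likely in how the form's appearance stream is put together.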

the end result will write a certain stream into the PDF...if it's good then this will work. if still not working, try comparing to PDFs that you fill manually. compare the output stream, see if you still got problems there.
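For the comparison step, it can help to know roughly what a text field's appearance stream looks like when a PDF editor fills it. The sketch below builds that reference text as a plain string, following the variable-text layout described in the PDF specification (a /Tx marked-content block wrapping q, BT, the DA string, a Td offset, a Tj, ET, Q). The helper names and the offsets are assumptions for illustration; it only covers text that the DA font can represent as a simple literal string, so it says nothing about how an embedded TrueType font encodes characters such as ü.

```javascript
// Sketch only: hypothetical helpers for eyeballing an appearance stream.

// Escape the characters that are special inside a PDF literal string: \ ( )
function escapePdfString(text) {
  return text.replace(/([\\()])/g, '\\$1');
}

// Build the content of a text-field appearance stream for comparison purposes.
// `da` is the field's default appearance string, e.g. '/Helv 12 Tf 0 g'.
function buildTextAppearance(da, text, x, y) {
  return [
    '/Tx BMC',                            // begin marked content for variable text
    'q',
    'BT',
    da,                                   // font, size and color from the DA entry
    x + ' ' + y + ' Td',                  // offset of the text inside the field rect
    '(' + escapePdfString(text) + ') Tj', // show the text
    'ET',
    'Q',
    'EMC',
  ].join('\n');
}
```

Printing buildTextAppearance('/Helv 12 Tf 0 g', 'Smith', 2, 4) next to the decoded stream from the generated file makes structural differences, such as a nested BT...ET block, easier to spot.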

DamickDoni commented 6 years ago

After debugging the code I found the error(s). But it's too much to do, so I decided not to use Hummus anymore and try other packages.

Hatzl commented 6 years ago

I have exactly the same problem. Maybe this is a bug in HummusJS? @DamickDoni can you tell me what the errors were? Maybe I can fix them. I assume the errors were in Hummus, not in your PDF?

tonybranfort commented 5 years ago

May be same issue as https://github.com/galkahana/HummusJSSamples/issues/21. The only difference is the values don't appear in the fields even when clicking on them in Acrobat. The field values do appear in Chrome and when I parse the PDF with HummusJS.

@Hatzl I'd be more than happy to work with you to try to figure this out. I know little about PDFs but I can do some deep grunt work if you want to do some pointing. I'd like to get this working with HummusJS.

Hatzl commented 5 years ago

@tonybranfort I think you are using the "pdf-form-fill.js" file from HummusJSSamples?

Try this: Edit the pdf-form-fill.js.

  1. on top of the file add: var fontArial;

  2. in function fillForm add this:

 function fillForm(writer,data) {

// --- ADD THIS HERE ---
fontArial = writer.getFontForFile(__dirname + '/arial.ttf');

// setup parser
var reader = writer.getModifiedFileParser();
.....
} 
  3. change function writeAppearanceXObjectForText to this:
function writeAppearanceXObjectForText(handles,formId,fieldsDictionary,text,inheritedProperties) {
    var rect = handles.reader.queryDictionaryObject(fieldsDictionary,'Rect').toPDFArray().toJSArray();
    let da = fieldsDictionary.exists('DA') ? fieldsDictionary.queryObject('DA').toString():inheritedProperties['DA'];

    var xobjectForm = handles.writer.createFormXObject(
        0,
        0,
        rect[2].value - rect[0].value,
        rect[3].value - rect[1].value,
        formId);

    xobjectForm.getContentContext()
        .writeFreeCode('/Tx BMC\r\n')
        .q()
        .BT()
        .writeFreeCode(da + '\r\n')
        .Ts(3)
        .Tf(fontArial,10)
        .Tj(text)
        .ET()
        .Q()
        .writeFreeCode('EMC');

    handles.writer.endFormXObject(xobjectForm);
}

That worked for me..

Of course you need to download and add the arial.ttf file and maybe change the path to this file in step 2. You also can take any other font or change the font size... You can ask me if you need any help.

tonybranfort commented 5 years ago

Unfortunately that didn't do it. I was hopeful, as I suspected the fonts too, but no-go. I tried your changes with the oo-pdfform-example.pdf, just to verify that I was pulling in the arial.ttf font correctly, and that worked. But it does not work with this g-28.pdf as an example. Clicking in the cells doesn't make any difference. Parsing the filled PDF shows that those objects do contain the values, but they just aren't appearing in the PDF. Thanks @Hatzl - I appreciate it. Let me know if you have any other thoughts or suggestions. I'll keep banging on it. g-28-out.pdf

Parsed output pdf fields: {... "form1[0].#subform[0].Pt1Line2a_FamilyName[0]": "Smith", "form1[0].#subform[1].Pt3Line5a_FamilyName[0]": "yokj er", ...}
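A small sketch of how the "values are stored but not shown" observation could be checked mechanically: compare the data passed to the fill step with the field map read back from the output file. The helper name diffFields is hypothetical, and it assumes both sides are flat objects keyed by full field name, like the parsed output above.

```javascript
// Sketch only: diffFields is a hypothetical helper, not part of HummusJS.
// Returns one entry per field whose parsed value differs from the expected one.
function diffFields(expected, parsed) {
  const mismatches = [];
  for (const name of Object.keys(expected)) {
    if (parsed[name] !== expected[name]) {
      mismatches.push({ name: name, expected: expected[name], actual: parsed[name] });
    }
  }
  return mismatches;
}
```

An empty result means the values themselves were written, which would point at the appearance of the fields rather than at the stored data.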

tonybranfort commented 5 years ago

Recap and what I've tried so far:

Now trying to figure out next steps. Will read more of Gal's posts and wiki - I certainly am getting an appreciation for PDFs and their complexity.

Hatzl commented 5 years ago

I think it's something with your PDF file. Try removing all form elements from the PDF and creating your own form with www.pdfescape.com, maybe using my changes. For testing you can start with just a few fields.

Do you "lock" the PDF file after filling? Or should it be changeable after filling?

tonybranfort commented 5 years ago

It's a government form. There are a lot of them so I could recreate one for testing purposes but it wouldn't be feasible for production for all forms. The forms are not locked after filling - they can continue to be edited after they've been filled.

rcoryjohnson commented 5 years ago

@Hatzl @tonybranfort do either of you know how you would "lock" the PDF with hummus after using pdf-form-fill? I need the fields to no longer be editable after they are filled. This is also referred to as "flattening" the PDF. I just haven't found out how to do this with hummus yet.

Hatzl commented 5 years ago

Here: https://github.com/galkahana/HummusJSSamples/tree/lock-form/lock-form