Closed mwaschkowski closed 7 years ago
Manual modification is the way to go, I'm afraid. Hummus can help with parsing and writing, but there's no higher-level method. Look for the AcroForm object in the catalog and iterate from there. Let me know if you need help. I haven't done this, but I may be able to help (plus it's time I had an example/module for this... I get questions about this all the time).
Gal
OK, great, thanks for the quick reply. I'll try it out and let you know how it goes, over the weekend.
Have a great day, and thank you for such a great library!
Mark
(Sent by email in reply to gal kahana's comment of Wed, Dec 16, 2015, 11:08 AM: https://github.com/galkahana/HummusJS/issues/57#issuecomment-165157584)
Hello! I work with Mark on this task of filling PDF form fields.
I'm a little stuck on figuring out how this should work in general. I reviewed the wiki docs on modification, and the example that modifies a page and adds comments.
So it looks like I can't just take an object and set the necessary value on it (fill it).
I need to get all the field objects (the fields from the AcroForm entry in the catalog) from the source PDF file, then create the same objects (with the same properties) in the modified PDF file, and set the necessary value on these new objects (fill them). Is this the right direction or not? Let me know if something is not clear and I will try to explain it better.
Thanks!
Absolutely correct. To modify an object you need to recreate a full version of it, copying what needs to remain the same and changing what is to be changed.
Hi!
I retrieved the necessary objects from the AcroForm, namely the Fields array:
var catalog = pdfReader.queryDictionaryObject(pdfReader.getTrailer(), 'Root');
var acroform = pdfReader.queryDictionaryObject(catalog, 'AcroForm');
var fieldsRoot = pdfReader.queryDictionaryObject(acroform, 'Fields');
Then I implemented recreating a full version of the objects that I need to modify (the fields). So now I have a set of object IDs.
As the next step I need to update the AcroForm with these new objects.
Could you help me with this? How do I update/recreate the AcroForm?
Thanks!
hummfff. assuming that acroform is a pointer to a remote object you can recreate it by creating a modified version of the acroform object. see example here of how to create a modified version of a page object...you can see from there what's relevant to your case: https://github.com/galkahana/HummusJS/blob/master/tests/ModifyingExistingFileContent.js#L150 [if not, i can point you further]
if it's a direct object...then this calls for modifying the catalog object so that it has a new definition of an acroform object, which is similar again to creating a new modified version.
[i will know best if you send me a sample PDF]
Gal.
Hello!
The main issue here is that I want to recreate the Fields array of the AcroForm object, the way it is done in the example (the link you provided: modifying a page object to write Annots). But I cannot get access to the AcroForm object to create a modified version of it.
E.g. to modify a page object we can get its ID: copyingContext.getSourceDocumentParser().getPageObjectID(2);
Next, we recreate the page object, copying the entries that don't need to change. Then we write the modified Annots object, etc.
Working with the AcroForm object, I faced the issue that I can't get its ID.
I'm getting it via the parser:
var catalog = pdfReader.queryDictionaryObject(pdfReader.getTrailer(), 'Root');
var acroform = pdfReader.queryDictionaryObject(catalog, 'AcroForm');
var fieldsRoot = pdfReader.queryDictionaryObject(acroform, 'Fields');
I can't create a modified version of it (by using its ID).
Do you need the sample PDF file which I'm trying to modify, or my source code?
Thanks!
ah. i see. ok. to get the acroform id use direct dictionary access instead of going through pdfReader.queryDictionaryObject. like this:
var acroformID = catalog.queryObject('AcroForm').toPDFIndirectObjectReference().getObjectID();
i'm trusting here that the AcroForm value is an indirect object reference, and so getObjectID would work. try first...if doesn't work let's look to do something else.
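A minimal sketch of that check, covering both the indirect and the direct case. It assumes the catalog is a parsed HummusJS dictionary exposing `exists`, `queryObject` and `getType`; `getAcroFormObjectId` is a hypothetical helper name, and `hummus` is passed in as a parameter only so the function can be exercised with stand-in objects.

```javascript
// Sketch: return the object ID of the AcroForm dictionary, or null when the
// AcroForm is a direct dictionary embedded in the catalog.
// Assumption: `catalog` behaves like a HummusJS PDFDictionary.
function getAcroFormObjectId(hummus, catalog) {
    if (!catalog.exists('AcroForm')) {
        throw new Error('The catalog has no AcroForm entry, so there is no form to fill');
    }
    var entry = catalog.queryObject('AcroForm');
    if (entry.getType() === hummus.ePDFObjectIndirectObjectReference) {
        // Indirect reference: its ID can be passed to startModifiedIndirectObject.
        return entry.toPDFIndirectObjectReference().getObjectID();
    }
    // Direct dictionary: there is no ID to reuse.
    return null;
}
```

When this returns null, the AcroForm has no ID of its own, and the object that needs a modified version is the catalog itself.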
Even if you don't have more questions - I would really love to get a sample PDF file with a form so i can make my own tests. more so, if you do.
Gal.
Thank you, I will try it. My sample PDF file: Rockwood - Cyber Liability Insurance App.pdf
So I used this way of getting acroformID. I can write the Fields objects this way:
objectsContext.startModifiedIndirectObject(acroformID);
var modifiedAcroformObject = pdfWriter.getObjectsContext().startDictionary();
var copyingContext = pdfWriter.createPDFCopyingContextForModifiedFile();
Object.getOwnPropertyNames(acroformObject).forEach(function(element, index, array) {
    if (element != 'Fields') {
        modifiedAcroformObject.writeKey(element);
        copyingContext.copyDirectObjectAsIs(acroformObject[element]);
    }
});
modifiedAcroformObject.writeKey('Fields');
objectsContext.startArray();
objectsContext.writeIndirectObjectReference(fieldObjectID001);
objectsContext.writeIndirectObjectReference(fieldObjectID002);
...
objectsContext.writeIndirectObjectReference(fieldObjectID_n);
objectsContext
    .endArray()
    .endLine()
    .endDictionary(modifiedAcroformObject)
    .endIndirectObject();
looks good to me. is it working?
Yes, it works. What remains is to correctly write the value for a field (fill the field), but this is related to the PDF spec.
thanks!
cool
word of advice here. i see text fields accept a "text string" for values. which is the pdf encoding method. you can use the PDFTextString class provided by hummus to do the encoding for you.
For instance, say you want to write the "V" part, you can do something like this:
modifiedFieldObject.writeKey('V');
modifiedFieldObject.writeLiteralStringValue(new hummus.PDFTextString('hello').toBytesArray());
[i mean...never tried specifically this...but it should work. at least, worth a try].
Gal.
Hello! Yes, the way above of writing a literal string value works. Thanks!
But it looks like I'm stuck with filling values into fields.
I implemented rewriting the objects from the Fields array of the AcroForm object. I can correctly write the value (V) for the necessary fields. Example:
if (fieldJSDict.FT.value === "Btn") {
    dictionaryContext.writeKey("V").writeNameValue(value);
    dictionaryContext.writeKey("AS").writeNameValue(value);
} else {
    dictionaryContext.writeKey('V').writeLiteralStringValue(new hummus.PDFTextString(value).toBytesArray());
}
But the resulting PDF does not look good. Checkboxes/radio buttons are not filled. Text fields are filled, but in a strange way:
One field is filled, but the others are empty:
When I click on a field, the filled value is shown:
Question: can I somehow use copyingContext.copyDirectObjectAsIs() when I recreate field objects?
My resulting PDF file: rockwood--cyber-liability-insurance-app.pdf_modified.pdf
Thanks!
Hi Alexey,
Good job on the text...that's one part working well.
Checkboxes
Look into the value that you provide for the checkbox in:
dictionaryContext.writeKey("V").writeNameValue(value);
dictionaryContext.writeKey("AS").writeNameValue(value);
They should match the relevant appearance streams. I think that you can assume that most use Yes and Off respectively (but I guess your input value is probably something like a boolean true/false).
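As a rough illustration of matching the appearance states, a boolean input can be mapped to a state name before it is written to V and AS. `checkboxStateName` is a hypothetical helper. `Off` is the conventional unchecked state, while `Yes` is only a common convention for the checked one: the actual on-state name is whatever key sits next to `Off` in the widget's AP/N dictionary, so it is left overridable here.

```javascript
// Sketch: turn a submitted value into the name to write for V and AS.
// Assumption: strings are already state names and are passed through untouched.
function checkboxStateName(value, onStateName) {
    if (typeof value === 'string') {
        return value;
    }
    return value ? (onStateName || 'Yes') : 'Off';
}
```

The result would then be passed to both `writeNameValue` calls shown above in place of the raw input value.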
Text fields and weird copying behavior
I would be in a better position to answer whether there's a problem with the field copying if I had a code sample of what you are trying to do.
In general, my suggestion would be not to create new objects, but rather to create modified versions. This way the referenced object ID remains the same and you don't need to worry about referencing obsolete IDs [note that this is what you do with the acroform element, right?].
Best would be to receive some sample code from you that I can help with debugging.
I'm attaching here some code that I just wrote for parsing form field values. This way (and if I don't have bugs) you should be able to tell if the values are set as you intended.
Regards, Gal. test.js.zip
Hi Gal!
Here is my code:
.......
var copyingContext = pdfWriter.createPDFCopyingContextForModifiedFile();
var objectsContext = pdfWriter.getObjectsContext();
var pdfReader = copyingContext.getSourceDocumentParser();
var catalog = pdfReader.queryDictionaryObject(pdfReader.getTrailer(), 'Root');
var acroform = pdfReader.queryDictionaryObject(catalog, 'AcroForm');
var fieldsRoot = pdfReader.queryDictionaryObject(acroform, 'Fields');
var objects = []; // updated fields object IDs
var allFieldObjects = [];
getFieldsObjects(fieldsRoot, allFieldObjects, pdfReader); // see implementation below
for (var i = 0; i < allFieldObjects.length; ++i) {
    var fieldJSDict = allFieldObjects[i];
    var newFieldObject = objectsContext.startNewIndirectObject();
    objects.push(newFieldObject);
    var dictionaryContext = objectsContext.startDictionary();
    writeObjectToContext(dictionaryContext, fieldJSDict, pdfWriter); // see implementation below
    var name = recursiveBuildFieldName(pdfReader, fieldJSDict, fieldJSDict.T.toText());
    var value = body[name];
    if (body.hasOwnProperty(name)) {
        if (fieldJSDict.FT.value === "Btn") {
            dictionaryContext.writeKey("V").writeNameValue(value);
            dictionaryContext.writeKey("AS").writeNameValue(value);
        }
        else {
            dictionaryContext.writeKey('V').writeLiteralStringValue(new hummus.PDFTextString(value).toBytesArray());
        }
    }
    objectsContext
        .endDictionary(dictionaryContext)
        .endIndirectObject();
}
var acroformID = catalog.queryObject('AcroForm').toPDFIndirectObjectReference().getObjectID();
var acroformObject = pdfReader.parseNewObject(acroformID).toJSObject();
objectsContext.startModifiedIndirectObject(acroformID);
var modifiedAcroformObject = pdfWriter.getObjectsContext().startDictionary();
Object.getOwnPropertyNames(acroformObject).forEach(function(element, index, array) {
    if (element != 'Fields') {
        modifiedAcroformObject.writeKey(element);
        copyingContext.copyDirectObjectAsIs(acroformObject[element]);
    }
});
modifiedAcroformObject.writeKey('Fields');
objectsContext.startArray();
for (var i = 0; i < objects.length; i++) {
    objectsContext.writeIndirectObjectReference(objects[i]);
}
objectsContext
    .endArray()
    .endLine()
    .endDictionary(modifiedAcroformObject)
    .endIndirectObject();
pdfWriter.end();
/**
 * Recursively walk through sourceFieldsArray and extract all Fields objects to fields.
 */
function getFieldsObjects(sourceFieldsArray, fields, pdfReader) {
    for (var i = 0; i < sourceFieldsArray.getLength(); ++i) {
        var origObj = pdfReader.queryArrayObject(sourceFieldsArray, i);
        var fieldJSDict = origObj.toJSObject();
        if (fieldJSDict.FT) {
            fields.push(fieldJSDict);
        }
        if (fieldJSDict.Kids) {
            getFieldsObjects(fieldJSDict.Kids, fields, pdfReader);
        }
    }
}
/**
 * Write field object to context. Writes dictionary object to context. Recursively.
 */
function writeObjectToContext(dictionaryContext, fieldJSDict, pdfWriter) {
    Object.getOwnPropertyNames(fieldJSDict).forEach(function(element, index, array) {
        if (element != "V" && element != "AS") {
            var value = fieldJSDict[element];
            var type = value.constructor.name;
            try {
                type = value.getType();
            }
            catch (e) {
                // skip
                console.info("error");
            }
            console.info("Write key to context:" + element + " type:" + value.constructor.name);
            var objectsContext = pdfWriter.getObjectsContext();
            switch (type) {
                case hummus.ePDFObjectDictionary:
                    dictionaryContext.writeKey(element);
                    var apDict = objectsContext.startDictionary();
                    writeObjectToContext(apDict, value.toJSObject(), pdfWriter);
                    objectsContext.endDictionary(apDict);
                    break;
                case hummus.ePDFObjectIndirectObjectReference:
                    dictionaryContext.writeKey(element).writeObjectReferenceValue(value.getObjectID());
                    break;
                case hummus.ePDFObjectLiteralString:
                    dictionaryContext.writeKey(element).writeLiteralStringValue(pdfWriter.createPDFTextString(value.value).toBytesArray());
                    break;
                case hummus.ePDFObjectInteger:
                case hummus.ePDFObjectReal:
                    dictionaryContext.writeKey(element);
                    objectsContext.writeNumber(value.value);
                    break;
                case hummus.ePDFObjectName:
                    dictionaryContext.writeKey(element).writeNameValue(value.value);
                    break;
                case hummus.ePDFObjectArray:
                    dictionaryContext.writeKey(element);
                    if (element == "Rect") {
                        dictionaryContext.writeRectangleValue(value.toJSArray()[0].value, value.toJSArray()[1].value, value.toJSArray()[2].value, value.toJSArray()[3].value);
                    }
                    else {
                        objectsContext.startArray();
                        var arrayJs = value.toJSArray();
                        for (var k = 0; k < arrayJs.length; k++) {
                            var item = arrayJs[k];
                            try {
                                switch (item.getType()) {
                                    case hummus.ePDFObjectInteger:
                                        // write the array item, not the enclosing array
                                        objectsContext.writeNumber(item.value);
                                        break;
                                    case hummus.ePDFObjectIndirectObjectReference:
                                        // write a reference to the array item, not the enclosing array
                                        objectsContext.writeIndirectObjectReference(item.getObjectID());
                                        break;
                                    default:
                                        console.error("Need to implement write array element value for type: " + item.constructor.name);
                                        break;
                                }
                            }
                            catch (e) {
                                writeObjectToContext(dictionaryContext, item, pdfWriter);
                            }
                        }
                        objectsContext.endArray().endLine();
                    }
                    break;
                case "Number":
                    objectsContext.writeNumber(value);
                    break;
                default:
                    console.error("Need to implement write to context for type:" + type);
                    break;
            }
        }
    });
}
Hi, thanks for the code. Looks wonderful. Some notes:
I would recommend splitting the conditions: for a text box write only the V, for a checkbox both V and AS. I would also recommend a tighter check on the checkbox type, not just that it is a button, unless you know that all Btn fields in your form are checkboxes. Just check the Ff value (you can look at the code I sent you to see how to verify that).
I would also recommend conditioning the modified copy on if(body[name]). This way fields that don't get filled in will retain their original value (for them just copy the whole object as is, and don't bother with checking V or AS).
Don't use startNewIndirectObject; rather use startModifiedIndirectObject with the original reference ID. This way, for indirect object fields, you will retain the same reference ID, and so have no obsolete IDs; and for fields that are direct objects... well... no need to worry about someone else referencing them, so no problem there.
Gal.
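The tighter Ff check might look roughly like the sketch below. It relies only on the field-flag bit positions from the PDF specification (bit 16 marks a radio button, bit 17 a pushbutton; a Btn field with neither is a checkbox). `classifyField` and the `FIELD_TYPES` values are hypothetical names, and the function takes plain values, so the caller is assumed to have already read FT and Ff (including any value inherited from a parent field).

```javascript
// Sketch: classify a form field from its FT name and Ff flags.
var FIELD_TYPES = {
    TEXT: 'text', CHECKBOX: 'checkbox', RADIO: 'radio', PUSHBUTTON: 'pushbutton',
    CHOICE: 'choice', SIGNATURE: 'signature', UNKNOWN: 'unknown'
};
var FLAG_RADIO = 1 << 15;      // Ff bit position 16
var FLAG_PUSHBUTTON = 1 << 16; // Ff bit position 17

function classifyField(ft, ff) {
    var flags = ff || 0; // a missing Ff means no flags are set
    switch (ft) {
        case 'Tx': return FIELD_TYPES.TEXT;
        case 'Ch': return FIELD_TYPES.CHOICE;
        case 'Sig': return FIELD_TYPES.SIGNATURE;
        case 'Btn':
            if (flags & FLAG_PUSHBUTTON) return FIELD_TYPES.PUSHBUTTON;
            if (flags & FLAG_RADIO) return FIELD_TYPES.RADIO;
            return FIELD_TYPES.CHECKBOX;
        default: return FIELD_TYPES.UNKNOWN;
    }
}
```

With the parsed dictionaries used in this thread, the inputs would presumably come from something like `fieldJSDict.FT.value` and `fieldJSDict.Ff.value`.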
And writing should probably maintain the hierarchy, no? I see reading is recursive, but writing just writes all the objects under the main fields array. You should probably only have to write the top-level ones.
Thank you for the advice! I will rework my code accordingly and let you know.
Hello! OK, I made changes according to your notes. What I have at this point:
........
var pdfWriter = hummus.createWriterToModify(pdfFileName, {
    modifiedFilePath: pdfFileName + "_modified.pdf"
});
var copyingContext = pdfWriter.createPDFCopyingContextForModifiedFile();
var objectsContext = pdfWriter.getObjectsContext();
var pdfReader = copyingContext.getSourceDocumentParser();
var catalog = pdfReader.queryDictionaryObject(pdfReader.getTrailer(), 'Root');
var acroform = pdfReader.queryDictionaryObject(catalog, 'AcroForm');
var fieldsRoot = pdfReader.queryDictionaryObject(acroform, 'Fields');
var objects = []; // updated fields object IDs
writeValuesToFields(pdfReader, pdfWriter, copyingContext, objectsContext, fieldsRoot, body, objects);
var acroformID = catalog.queryObject('AcroForm').toPDFIndirectObjectReference().getObjectID();
var acroformObject = pdfReader.parseNewObject(acroformID).toJSObject();
objectsContext.startModifiedIndirectObject(acroformID);
var modifiedAcroformObject = pdfWriter.getObjectsContext().startDictionary();
Object.getOwnPropertyNames(acroformObject).forEach(function(element, index, array) {
    if (element != 'Fields') {
        modifiedAcroformObject.writeKey(element);
        copyingContext.copyDirectObjectAsIs(acroformObject[element]);
    }
});
modifiedAcroformObject.writeKey('Fields');
objectsContext.startArray();
for (var i = 0; i < objects.length; i++) {
    objectsContext.writeIndirectObjectReference(objects[i]);
}
objectsContext
    .endArray()
    .endLine()
    .endDictionary(modifiedAcroformObject)
    .endIndirectObject();
pdfWriter.end();
I decided to write objects to the context at the same time as I iterate over them (not collecting them first). I also use startModifiedIndirectObject(objID) throughout, so object IDs do not change. And I detect the type of each field in order to apply type-specific logic for writing the value:
function writeValuesToFields(pdfReader, pdfWriter, copyingContext, objectsContext, objects, fieldsToModify, updatedObjectIDs) {
    for (var i = 0; i < objects.getLength(); ++i) {
        var obj = pdfReader.queryArrayObject(objects, i);
        var objID = objects.toJSArray()[i].getObjectID();
        var fieldJSDict = obj.toJSObject();
        updatedObjectIDs.push(objID);
        var processKids = true;
        if (fieldJSDict.T) {
            var name = recursiveBuildFieldName(pdfReader, fieldJSDict, fieldJSDict.T.toText());
            if (fieldsToModify.hasOwnProperty(name)) { // should we for this object write value?
                if ("topmostSubform[0].Page1[0].RadioButtonList[0]" === name) {
                    console.info("break here.");
                }
                var value = fieldsToModify[name];
                var type = getFieldType(fieldJSDict); // I created method for detect type of field, I used your code which you wrote in the test.js file.
                //yes
                console.info("Process field: " + name);
                objectsContext.startModifiedIndirectObject(objID);
                var dictionaryContext = objectsContext.startDictionary();
                var skip = "";
                switch (type) {
                    case FIELD_TYPES.TEXT:
                        dictionaryContext.writeKey('V').writeLiteralStringValue(new hummus.PDFTextString(value).toBytesArray());
                        break;
                    case FIELD_TYPES.RADIO: // probably for radio need custom logic for write value:
                        dictionaryContext.writeKey("V").writeNameValue(value);
                        dictionaryContext.writeKey("AS").writeNameValue(value);
                        skip = "AS";
                        break;
                    case FIELD_TYPES.CHECKBOX:
                        dictionaryContext.writeKey("V").writeNameValue(value);
                        dictionaryContext.writeKey("AS").writeNameValue(value);
                        skip = "AS";
                        break;
                }
                writeObjectToContext(dictionaryContext, fieldJSDict, pdfWriter, skip);
                objectsContext
                    .endDictionary(dictionaryContext)
                    .endIndirectObject();
            }
            else {
                //no
                console.info("Just copy field: " + name);
                copyingContext.copyDirectObjectAsIs(obj);
            }
        }
        if (fieldJSDict.Kids && processKids) {
            writeValuesToFields(pdfReader, pdfWriter, copyingContext, objectsContext, fieldJSDict.Kids, fieldsToModify, updatedObjectIDs);
        }
    }
}
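The `recursiveBuildFieldName` helper called above is not shown in the thread. For illustration only, here is a sketch of how a fully qualified field name such as `topmostSubform[0].Page1[0].RadioButtonList[0]` is put together: the partial names (T) of the field and its ancestors are joined with periods, starting from the root. It works on a hypothetical plain-object model in which `T` is already a string and `Parent` already points at the parent object; with the real parser, Parent would have to be resolved through the reader and T converted with `toText()`.

```javascript
// Sketch: build a fully qualified field name by walking up the Parent chain.
// Assumption: each node is a plain object { T: string?, Parent: node? }.
function buildFullFieldName(field) {
    var parts = [];
    for (var node = field; node; node = node.Parent) {
        // Widget-only kids carry no T of their own and share their parent's name.
        if (node.T !== undefined) {
            parts.unshift(node.T);
        }
    }
    return parts.join('.');
}
```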
This function changed a little:
/**
 * Write field object to context. Writes dictionary object to context. Recursively.
 */
function writeObjectToContext(dictionaryContext, fieldJSDict, pdfWriter, skip) {
    Object.getOwnPropertyNames(fieldJSDict).forEach(function(element, index, array) {
        if (element !== "V" && (!skip || element !== skip)) {
            var value = fieldJSDict[element];
            var type = value.constructor.name;
            try {
                type = value.getType();
            }
            catch (e) {
                // skip
                console.info("error");
            }
            console.info("Write key to context:" + element + " type:" + value.constructor.name);
            var objectsContext = pdfWriter.getObjectsContext();
            switch (type) {
                case hummus.ePDFObjectDictionary:
                    dictionaryContext.writeKey(element);
                    var apDict = objectsContext.startDictionary();
                    writeObjectToContext(apDict, value.toJSObject(), pdfWriter, skip);
                    objectsContext.endDictionary(apDict);
                    break;
                case hummus.ePDFObjectIndirectObjectReference:
                    dictionaryContext.writeKey(element).writeObjectReferenceValue(value.getObjectID());
                    break;
                case hummus.ePDFObjectLiteralString:
                    dictionaryContext.writeKey(element).writeLiteralStringValue(pdfWriter.createPDFTextString(value.value).toBytesArray());
                    break;
                case hummus.ePDFObjectInteger:
                case hummus.ePDFObjectReal:
                    dictionaryContext.writeKey(element);
                    objectsContext.writeNumber(value.value);
                    break;
                case hummus.ePDFObjectName:
                    dictionaryContext.writeKey(element).writeNameValue(value.value);
                    break;
                case hummus.ePDFObjectArray:
                    dictionaryContext.writeKey(element);
                    if (element == "Rect") {
                        dictionaryContext.writeRectangleValue(value.toJSArray()[0].value, value.toJSArray()[1].value, value.toJSArray()[2].value, value.toJSArray()[3].value);
                    }
                    else {
                        objectsContext.startArray();
                        var arrayJs = value.toJSArray();
                        for (var k = 0; k < arrayJs.length; k++) {
                            var item = arrayJs[k];
                            try {
                                switch (item.getType()) {
                                    case hummus.ePDFObjectInteger:
                                        objectsContext.writeNumber(item.value);
                                        break;
                                    case hummus.ePDFObjectIndirectObjectReference:
                                        objectsContext.writeIndirectObjectReference(item.getObjectID());
                                        break;
                                    default:
                                        console.error("Need to implement write array element value for type: " + item.constructor.name);
                                        break;
                                }
                            }
                            catch (e) {
                                writeObjectToContext(dictionaryContext, item, pdfWriter, skip);
                            }
                        }
                        objectsContext.endArray().endLine();
                    }
                    break;
                case "Number":
                    objectsContext.writeNumber(value);
                    break;
                default:
                    console.error("Need to implement write to context for type:" + type);
                    break;
            }
        }
    });
}
I receive the field values after an HTML form is submitted. This form was created earlier by parsing the source PDF file. We also use HummusJS to parse the fields and generate the HTML.
So I have a map object, "body", with field name/value pairs.
It's possible that the empty text fields don't have defs for variable text, somehow, in which case you may want to add the required appearance stream template.
Good going! Gal
Hello! Yes, right: to show text field values correctly, a correct "appearance stream" is needed.
I tried to update the existing AP this way:
objectsContext.startModifiedIndirectObject(apObjectId); // apObjectId - ID of the AP to update
// create a dictionary to write to the stream:
var apDict = objectsContext.startDictionary();
// copy / create keys/values from a correct AP (one which shows text) or create a new one:
<copy/create dict keys etc>
objectsContext.endDictionary(apDict);
var streamCxt = objectsContext.startPDFStream(apDict); // this throws a segmentation fault error.
objectsContext.endPDFStream(streamCxt);
How to write a dictionary to a stream I found in the docs: https://github.com/galkahana/HummusJS/wiki/Extensibility#pdf-streams
Could you please clarify how I can modify an existing AP (e.g. write resources: Font, etc.)? The main issue here is how to write the dictionary to the stream.
Thanks!
Hi Alexey. great job. I need to read into this and I was very busy. hopefully i can find some time today evening, so i can realize what needs to be done.
from little that i read in "8.4.4 Appearance Streams" they are actually form xobjects. In that case i would recommend (and hopefully its the right way) to simply create a form xobject with the new appearance as a representative of the old object, replacing the existing stream. Using a form xobject will take care of resources by itself. you can read about forms in hummus here - https://github.com/galkahana/HummusJS/wiki/Reusable-forms
you can start a form xobject with an existing object ID by passing 5 numbers instead of 4 (the initial 4 are the bbox of the form, which you can read from the existing AP or create new ones of your own).
When i get back home today i'll try to find time to read more, to see that we're not missing something.
Gal.
Hello! Thank you very much for the answer! I will try to use xobjects. I'm going to read more about them too.
k. so here's what i think is going on, having read 8.6.2 [form fields], 8.4.4 [appearance streams] and 8.4.5 [annotations types, specifically widget annotation].
Any form field that is terminal should either have a kids array with widget annotations describing its display, or be itself a widget annotation. by being itself, i mean that it will not have a kids array, but rather contain the widget annotation keys in itself.
widget annotation includes "AP" to describe the appearance stream. this AP points to a form xobject that describes the appearance of the field. that's one appearance stream to keep in mind. let's call it AP. i'm guessing that in case of text field, then if there is one, it defines whats displayed before you start editing in acrobat. I am guessing (and that's pure guess, but can be verified by parsing) that acrobat updates this AP if the file gets saved, to reflect the new appearance. I think that this must happen for proper later printing.
In addition a text field will have a "DA" appearance string (string! not stream) which defines variable appearance. you need variable appearance because the form will have text edited in acrobat and you want to understand how to show it AND how to construct an AP form xobject once the file is saved.
From what i parsed from your example DAs can look like this: "/Helv 9 Tf 0 g", which means helvetica size 9, in black.
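To make the reading of that DA string concrete, here is a small sketch that pulls the font name, font size and fill colour out of a default-appearance string. `parseDefaultAppearance` is a hypothetical helper, and it is only a whitespace tokenizer: it assumes simple operands like the ones above and does not handle escaped name characters or other operators in any special way.

```javascript
// Sketch: extract font and colour settings from a DA string such as
// "/Helv 9 Tf 0 g" (Helv at size 9, gray level 0, i.e. black).
var COLOR_OPERANDS = { g: 1, rg: 3, k: 4 }; // gray, RGB, CMYK fill operators

function parseDefaultAppearance(da) {
    var result = { fontName: null, fontSize: null, color: null };
    var operands = [];
    da.trim().split(/\s+/).forEach(function (token) {
        if (token === 'Tf' && operands.length >= 2) {
            var fontArgs = operands.slice(-2);
            result.fontName = fontArgs[0].replace(/^\//, '');
            result.fontSize = parseFloat(fontArgs[1]);
            operands = [];
        } else if (COLOR_OPERANDS.hasOwnProperty(token) && operands.length >= COLOR_OPERANDS[token]) {
            var count = COLOR_OPERANDS[token];
            result.color = { operator: token, components: operands.slice(-count).map(Number) };
            operands = [];
        } else {
            operands.push(token); // an operand, or an operator this sketch ignores
        }
    });
    return result;
}
```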
so long story short, i expect that you should already have a DA in text fields, you just need to recreate an AP, N (normal) stream. To do this create a form xobject using the instructions in 8.6.2 variable text. the content context of hummus should have all the required commands (q, Q etc) and will also take care of creating font definitions and embedding the correct glyphs in the same way as when you use writeText. you can obviously use writeText itself. Note that hummus requires that form xobjects are created for NEW object IDs and not modified, so you'll have to change the AP/N entry in the widget annotation to point to the new form xobjects. hope that's not too much trouble. the code can be changed to reuse object IDs of the old form, but it's slightly complex. so hopefully you can do without it.
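For reference, the variable-text layout that the specification describes for a text field's normal appearance can be sketched as a plain string builder: a `/Tx BMC ... EMC` marked-content section containing `q`, a `BT ... ET` text object that starts with the DA string, and `Q`. Note that `q`/`Q` wrap the whole text object rather than sitting inside it. `buildTextFieldAppearance` and `escapePdfLiteral` are hypothetical helpers, the x/y offsets are placeholders, and the sketch assumes text that the DA font's encoding can represent and a font resource with that name being available to the form XObject.

```javascript
// Sketch: build the content of a text field appearance stream from its DA.
function escapePdfLiteral(text) {
    // Backslash and parentheses must be escaped inside a PDF literal string.
    return text.replace(/([\\()])/g, '\\$1');
}

function buildTextFieldAppearance(da, text, x, y) {
    return [
        '/Tx BMC',                           // start of the variable-text section
        'q',
        'BT',
        da,                                  // e.g. "/Helv 9 Tf 0 g"
        x + ' ' + y + ' Td',                 // position of the text baseline
        '(' + escapePdfLiteral(text) + ') Tj',
        'ET',
        'Q',
        'EMC'
    ].join('\r\n') + '\r\n';
}
```

A string like this is the kind of content that would go into the new form xobject, whether it is produced through the content context's operator methods or written as raw code.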
It might be a good idea to parse a file that has a good appearance stream and make sure that my notes are valid.
Good luck, Gal.
Hello! The main idea looks clear. Thank you for the explanation!
What I did:
(apObjectId - existing AP)
var apObject = pdfReader.parseNewObject(apObjectId).getDictionary().toJSObject();
var xobjectForm = pdfWriter.createFormXObject(apObject.BBox.toJSArray()[0].value, apObject.BBox.toJSArray()[1].value, apObject.BBox.toJSArray()[2].value, apObject.BBox.toJSArray()[3].value);
var font = pdfWriter.getFontForFile(__dirname + '/arial.ttf');
xobjectForm.getContentContext()
    .BT()
    .q()
    .k(0, 0, 0, 1)
    .Tf(font, 9)
    .Tj(value)
    .ET()
    .Q();
pdfWriter.endFormXObject(xobjectForm);
var id = xobjectForm.id; // XObject id
dictionaryContext.writeKey('V').writeLiteralStringValue(new hummus.PDFTextString(value).toBytesArray());
dictionaryContext.writeKey('AP');
var apDict = objectsContext.startDictionary();
apDict.writeKey("N").writeObjectReferenceValue(id); // new ID write here
objectsContext.endDictionary(apDict);
But this still doesn't work for me. Could you look at my code and let me know whether this is the correct approach for recreating the AP?
Code seems fine to me. perhaps a DA string is required? at times like this i like to see what acrobat does. how about you take a pdf with a form that has a field that shows the problem, fill it in using acrobat, save and parse the result, figuring out what they put in their Field object and widget annotation object?
Gal.
Hello!
1) Field which is shown correctly (1 - Name of Organization):
DA="/Helv 9 Tf 0 g"
AP->N->3 (3 is ID of reference object - stream)
Reference object - stream contents:
Bbox=0, 0, 436.4403, 12.47998
Filter=FlateDecode
Length=82
Resources=[
Font=Helv (PDFIndirectObjectReference
ProcSet=PDF,Text
]
Subtype=Form
Type=XObject
2) Field which is not shown correctly (2 - Other (describe): ):
DA="/Helv 9 Tf 0 g"
AP->N->31 (31 is ID of reference object - stream)
Reference object - stream contents:
Bbox=0, 0, 72.00003, 15.47998
Length=0
Subtype=Form
Type=XObject
So the main difference: the field which is shown correctly has Length != 0 and Resources.
I filled in the Name and address values above using an online tool for filling PDFs.
well it's clear why "describe" has no stream content - it's empty. it makes sense. you didn't fill describe. that's ok.
what we are trying to understand is why when your code actually writes something it doesn't work. can you compare by filling fields with that online tool and with your code? then see if the application does something different to figure out what's wrong.
Looks like I made it work!
I use the same code, with only several changes:
What still remains is to figure out how to use the standard fonts from PDF (Section 5.5 Simple Fonts), and to make adjustments so that filled fields look nicer (right now the font size, position, etc. need adjusting).
Brilliant!!! Note that for convenience you can use the writeText method of content context instead of the BT...ET sequence. it just does that for you. [still might need nesting in q...Q].
Anyways, this is great news. Seems like we have a working solution in our hands!
Some notes, for future queries:
As for fonts: Hummus pretty much hides the usage of fonts due to its complexity. When you use a font, the PDF file has to include the font definition with all the glyphs that you use. Hummus does that for you.
So essentially, to use any font just call writeText or Tf with a font object, as you did.
If, for instance, you want to use Helvetica, just point to the Helvetica font file, as you did with Arial, and you're done.
If you still want to use standard Type 1 fonts without embedding them, I think that the way to go is to call Tf with the font name as a string: cxt.Tf('Helvetica',14). Give it a try if you want.
If you want to simply write your own code with a content context you can do this:
cxt.writeFreeCode('hello world\r\n')
[note that you must finish your code with an ending \r\n regardless of the OS that you are using]. This can become useful if you want to use the DA string for variable text form fields.
Good luck! Gal.
Thanks for the advice. I will work on the adjustments and final changes.
Thanks a lot for all your support here. You really helped me dive deep into the PDF format and understand how to use this library!
Thanks!
@fedyunin would you be willing to post your sample code? Trying to figure out how to modify a pdf I have.
@fedyunin I'm also looking for a tool to fill forms; could you publish your complete code? Thank you.
@fedyunin Could you please publish the complete code along with the before and after pdf files on here ?
Anyone ever figure out how to flatten a form after filling it out? I have everything working from this thread (awesome btw) but I still need to make things read only, flatten, the pdf that is sent out in my new hummus.PDFStreamForResponse(res)
wrote working sample code. it's annotated so you can figure out bugs if you can see them. will publish a post with explanations soon, but the code should be beneficial before hand: https://github.com/galkahana/HummusJSSamples/tree/master/filling-form-values
usage: https://github.com/galkahana/HummusJSSamples/blob/master/filling-form-values/main.js
implementation code: https://github.com/galkahana/HummusJSSamples/blob/master/filling-form-values/pdf-form-fill.js
@trendoid Did you find a way to flatten the pdf after filling out the form?
Update: Solved it by setting the inputs to read-only. Works for our use-case
@cjnqt "Update: Solved it by setting the inputs to read-only. Works for our use-case"
I need to do the same but cannot figure out how to set inputs as read-only :/
did it when i created the form, using adobe acrobat or pdfescape.com. might also be possible to do with hummusjs, dont know
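Nobody in the thread shows how to do this programmatically, so the following is only a sketch based on the PDF specification's field flags, where bit position 1 of Ff marks a field as read-only. `withReadOnly` is a hypothetical helper that computes the new flag value; writing it back would presumably follow the same modified-object pattern used earlier in the thread (a `writeKey` for Ff followed by writing the number). Note that this locks the field against editing but does not flatten it: the field and its widget stay in the document.

```javascript
// Sketch: compute an Ff value with the ReadOnly flag set, keeping all
// other flags (radio, multiline, ...) as they were.
var FLAG_READ_ONLY = 1; // Ff bit position 1

function withReadOnly(ff) {
    return (ff || 0) | FLAG_READ_ONLY; // a missing Ff means no flags are set
}
```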
Anyone know how to set the inputs to read-only programmatically? I need to "flatten" my pdf form fields after I write a value to them.
@galkahana I also need to flatten the PDF file; any suggestions on how to do that?
@galkahana Many thanks for sharing the JS code to update forms. I'm trying to re-use it in a C++ application, but I'm stuck here:
// otherwise, recreate the form as an indirect child (this is going to be a general policy, we're making things indirect. it's simpler), and recreate the catalog
var catalogObjectId = reader.getTrailer().queryObject('Root').toPDFIndirectObjectReference().getObjectID();
What is the C++ variant of toPDFIndirectObjectReference()? How can I get the object's ID by its pointer? Thanks!
you can use direct casting or PDFObjectCastPtr (https://github.com/galkahana/PDF-Writer/wiki/PDF-Parsing#pdfobjectcastptr) to cast a PDFObject to other types.
in this case queryObject will bring you a PDFObject and you want to cast it to PDFIndirectObjectReference.
this is not getting an object ID by its pointer. it's just that the object that is the value of the Root key in the trailer is an indirect object ref.
@galkahana Thanks! I decided to stick with the JS version, i.e. with HummusJS on Ubuntu 18.04. I use your great tutorial found here: https://github.com/galkahana/HummusJSSamples/tree/master/filling-form-values . I have a small PDF with just a single text form field. I modified main.js so it updates the text field with the text "Robert". However the resulting PDF is a little bit weird. When I open the resulting PDF, the text field is empty. If I click on it, it displays the text "Robert". When the field loses focus, it reverts to the empty state back again. Also if I print the resulting PDF, the text field is empty too. Please see name.pdf (the source PDF), and name-edit.pdf (the resulting PDF).
main.js looks as follows:
var hummus = require('hummus'),
    fillForm = require('./pdf-form-fill').fillForm;
var writer = hummus.createWriterToModify(__dirname + '/name.pdf', {
    modifiedFilePath: __dirname + '/name-edit.pdf'
});
var data = {
    "Name": "Robert"
};
fillForm(writer,data);
writer.end();
Any ideas? Thanks!
Thanks galkahana for all your work here. I used your sample pdf-form-fill successfully but it does not display the field values in Adobe Acrobat DC - but does in Chrome. I opened Issue # 21 in HummusJSSamples
Hi,
I've looked through the docs and didn't see anything mentioned, is it possible to use HummusJS to fill in a PDF form? If so, is there a call to do so, or would I have to iterate every form field and the modify it manually?
Thank you!
Mark