empira / PDFsharp

PDFsharp and MigraDoc Foundation for .NET 6 and .NET Framework
https://docs.pdfsharp.net/

XFA and XDP Package Cannot update in PDF #174

Open MarianoJP opened 1 month ago

MarianoJP commented 1 month ago

First and foremost, I want to thank you for a great library. I had previously contacted @ThomasHoevel about "PdfDictionary is set but CryptFilterDecodeParms are not initialized correctly". That issue has been resolved; however, as I mentioned at the end of that post, I am working with the XDP package within the XFA form in the PDF. I can access the XDP data section and modify the XDP data package. For reference, the PDF can be downloaded from: https://www.uscis.gov/sites/default/files/document/forms/i-130.pdf

I use the following code to get the XDP:

using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using PdfSharp.Pdf;
using PdfSharp.Pdf.AcroForms;
using PdfSharp.Pdf.Advanced;
using PdfSharp.Pdf.IO;

string USCISfilename = @"c:\Text\i-130 p31.pdf";

PdfDocument pdfDocument = PdfReader.Open(USCISfilename, PdfDocumentOpenMode.Import);
string fn = "Family Name";
PdfAcroForm fm = pdfDocument.AcroForm;
if (fm.Elements.ContainsKey("/XFA"))
{
    // In this file the /XFA array is the fourth value of the AcroForm dictionary.
    var PIs = fm.Elements.Values;
    PdfArray what = (PdfArray)PIs.ToArray()[3];
    // The array alternates packet names (even indexes) and stream references (odd indexes).
    PdfReference pi = (PdfReference)what.Elements[9];
    PdfDictionary pid1 = (PdfDictionary)((PdfReference)what.Elements[5]).Value;
    string decoded1 = Encoding.UTF8.GetString(pid1.Stream.UnfilteredValue);
    PdfDictionary pid2 = (PdfDictionary)((PdfReference)what.Elements[17]).Value;
    string decode2 = Encoding.UTF8.GetString(pid2.Stream.UnfilteredValue);
    File.WriteAllText(@"c:\Text\template.xml", decoded1);
    PdfDictionary pid = pi.Value as PdfDictionary;
    PdfDictionary.PdfStream pdfs = pid.Stream;

    string decoded = Encoding.UTF8.GetString(pid.Stream.UnfilteredValue);

    // Concatenate every packet stream (odd indexes only) into one XDP document.
    string decodeds = "";
    PdfArray w = (PdfArray)PIs.ToArray()[3];
    for (int x = 1; x < w.Elements.Count; x += 2)
    {
        PdfReference pr = (PdfReference)w.Elements[x];
        PdfDictionary pd = (PdfDictionary)pr.Value;
        decodeds += Encoding.UTF8.GetString(pd.Stream.UnfilteredValue);
    }
    XmlDocument xdoc = new XmlDocument();
    xdoc.LoadXml(decodeds);
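
The loop above decodes each packet stream to a string before concatenating. As a minimal sketch of an alternative, assuming the unfiltered payloads have already been collected from the /XFA array in order, the raw bytes can be joined first and handed to the XML parser, so that a multi-byte character split across two streams would not be corrupted. `XfaPacketSketch` and `LoadXdp` are hypothetical names for illustration, not part of PDFsharp.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XfaPacketSketch
{
    // Joins the packet payloads (hypothetical input: one byte[] per
    // odd-indexed /XFA array entry, in array order) and parses the result.
    public static XmlDocument LoadXdp(IEnumerable<byte[]> packetPayloads)
    {
        using var buffer = new MemoryStream();
        foreach (byte[] payload in packetPayloads)
        {
            buffer.Write(payload, 0, payload.Length);
        }
        buffer.Position = 0;

        var xdoc = new XmlDocument();
        // Loading from a stream lets the XML parser detect the encoding itself.
        xdoc.Load(buffer);
        return xdoc;
    }
}
```

In the code above, the payloads would be the `pd.Stream.UnfilteredValue` arrays gathered inside the loop.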

The xdoc variable now holds a well-formed XML document in which I can access the nodes and modify them as needed. First I get the field nodes, then loop through them to find the field named "Pt2Line1_AlienNumber" with the following code:

List<string> fieldNames = new List<string>();
XmlNodeList fieldNodes = xdoc.GetElementsByTagName("field");
XmlNode Mynode = null;

foreach (XmlNode fieldNode in fieldNodes)
{
    if (fieldNode.Attributes["name"] != null && fieldNode.Attributes["name"].Value == "Pt2Line1_AlienNumber")
    {
        Mynode = fieldNode;
        break; // Found the node, no need to continue the loop
    }
}
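
The same lookup can be written as a single XPath query. This is a sketch only: `XfaFieldLookupSketch` and `FindFieldByName` are hypothetical names, and `local-name()` is used because XFA template elements normally sit in a default namespace, which a plain `//field` path would not match.

```csharp
using System.Xml;

public static class XfaFieldLookupSketch
{
    // Returns the first <field> element whose name attribute equals fieldName,
    // or null if there is none.
    // Assumes fieldName contains no single quote, since it is pasted into the XPath.
    public static XmlNode FindFieldByName(XmlDocument xdoc, string fieldName)
    {
        string xpath = $"//*[local-name()='field' and @name='{fieldName}']";
        return xdoc.SelectSingleNode(xpath);
    }
}
```

With the variables from the code above, the call would be `XfaFieldLookupSketch.FindFieldByName(xdoc, "Pt2Line1_AlienNumber")`.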

I then use the following code to modify the field data and insert back into the XDP package and put back into the pdf dictionary:

if (Mynode != null)
{
    // Assuming the node has a child where the value is stored in a "value" tag
    XmlNode valueNode = Mynode.SelectSingleNode(".//value");

    if (valueNode != null)
    {
        valueNode.InnerText = "A123456789"; // Update the text for the field
    }
    else
    {
        // If the value node doesn't exist, create it and set the value
        XmlElement newValueNode = xdoc.CreateElement("value");
        newValueNode.InnerText = "A123456789";
        Mynode.AppendChild(newValueNode);
    }

    // Now we need to re-encode the updated XML back into the PDF stream
    string updatedXml = xdoc.OuterXml;
    byte[] updatedXmlBytes = Encoding.UTF8.GetBytes(updatedXml);

    // Replace the old stream with the updated XML in the PDF dictionary (pid1 or pid2).
    // I chose pid1 since pid2 seems to be the template.
    pid1.Stream.Value = updatedXmlBytes; // Assuming pid1 holds the correct dictionary for the form fields

    // Save the updated PDF
    pdfDocument.Save(@"c:\Text\");
}
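
In XFA, values entered by a user are normally bound to the datasets packet, while a field's value element in the template usually holds its default. The following is a hedged sketch, assuming the data node shares the field's name (the common name-based binding; this form may differ), that edits one packet's bytes in isolation and re-encodes them as UTF-8. `XfaDataSketch` and `SetDataValue` are hypothetical names, not PDFsharp API, and writing the result back still requires a document that PDFsharp is able to save.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class XfaDataSketch
{
    // Returns a new UTF-8 payload in which the first element whose local name
    // is nodeName has its text set to newValue. If no such element exists,
    // the input array is returned unchanged.
    // Assumes nodeName contains no single quote, since it is pasted into the XPath.
    public static byte[] SetDataValue(byte[] packet, string nodeName, string newValue)
    {
        var doc = new XmlDocument { PreserveWhitespace = true };
        using (var input = new MemoryStream(packet))
        {
            doc.Load(input);
        }

        XmlNode node = doc.SelectSingleNode($"//*[local-name()='{nodeName}']");
        if (node == null)
        {
            return packet;
        }

        node.InnerText = newValue;
        return Encoding.UTF8.GetBytes(doc.OuterXml);
    }
}
```

Only the bytes of the packet being changed are passed in and written back, so the other packets in the /XFA array stay untouched.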

The issue lies in the last line of the code, the call to Save. With PDFsharp as it currently stands, the file cannot be opened in Modify mode because it is password-protected. I cannot obtain the password, since the form is produced by an institution that only expects the data to be filled in. Because the form is an XFA form, the data is stored in the XDP package.

How else could I insert the data into the XFA fields if my method requires the ability to modify the PDF?

Thank you again,

Mariano J. Padilla