unidoc / unipdf

Golang PDF library for creating and processing PDF files (pure go)
https://unidoc.io

[BUG] PDF Creation not working as expected #536

Open hjo33 opened 7 months ago

hjo33 commented 7 months ago

Description

My team and I use a commercial unidoc/unipdf license. We use it to replace certain merge fields in a Word document with values from a JSON body. Additionally, we add a barcode or QR code to the document. This works as expected and the data is added to the Word document.

However, if we generate a PDF from that Word document, the rendering is completely thrown off. Pictures and tables are not rendered correctly, and the barcode/QR code is missing. Converting the same Word document to PDF with libreoffice --headless gives the expected outcome.
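For comparison, the LibreOffice conversion mentioned above can be driven from Go roughly as follows. This is only a minimal sketch: the helper names `libreOfficeArgs` and `convertWithLibreOffice`, the binary name, and the paths are assumptions for illustration, and the `--convert-to pdf` and `--outdir` options are added on top of the `--headless` flag named in the report.

```go
package main

import (
	"fmt"
	"os/exec"
)

// libreOfficeArgs builds the argument list for a headless DOCX-to-PDF
// conversion. The helper name is hypothetical.
func libreOfficeArgs(docxPath, outDir string) []string {
	return []string{"--headless", "--convert-to", "pdf", "--outdir", outDir, docxPath}
}

// convertWithLibreOffice runs the given LibreOffice binary and, on failure,
// returns an error that includes the combined stdout/stderr of the process.
func convertWithLibreOffice(binary, docxPath, outDir string) error {
	cmd := exec.Command(binary, libreOfficeArgs(docxPath, outDir)...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("libreoffice conversion failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	// Assumed binary name and paths; adjust to the local installation.
	if err := convertWithLibreOffice("libreoffice", "example_template.docx", "."); err != nil {
		fmt.Println(err)
	}
}
```

LibreOffice writes the PDF into the output directory under the input file's base name, so the result can be compared side by side with the unidoc output.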

Expected Behavior

The PDF created by unidoc should look similar to the one created by LibreOffice.

Actual Behavior

Steps to reproduce the behavior:

  1. Create a Word template or use the attached one
  2. Use mail merge to replace the template fields with (_egac *Document) MailMerge
  3. Save the doc with (_bddd *Document) SaveToFile
  4. Convert the doc to PDF with ConvertToPdf(d *_g.Document)

Attachments

Word template: example_template.docx
Generated PDF with unipdf: PDF_unidpdf.pdf
Generated PDF with LibreOffice: PDF_libreoffice.pdf

github-actions[bot] commented 7 months ago

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/

sampila commented 6 months ago

Hi @hjo33,

We are currently working on improving the DOCX to PDF conversion; however, we still need to work on the processing of .emf media images.

Here is the current result we get when converting example_template.docx to PDF: example_template.pdf

TheGoderGuy commented 5 months ago

Heyho @sampila,

we became aware of unioffice release 1.29.1 with the "DOCX to PDF image position overlapped with text" fix. We upgraded our service to the new version, but the PDF export looks the same as before. Could you give us an update on this bug?

Thanks and have a good day!

TheGoderGuy commented 6 days ago

Hey @sampila, we updated to v1.34 and filled example_template.docx again. There are still major differences between the unidoc and LibreOffice versions. See here: diff

The generated files were opened with Chromium Version 126.0.6478.114 (Official Build) snap (64-bit): example_libre.pdf, example_unidoc.pdf

The code we use to fill the template and generate the PDF is the following.

Template preparation:

```go
doc, err := document.Open(templatePath)
if err != nil {
    return "", err
}
defer doc.Close()

log.Debug().Str("template", templatePath).Msg("mailmerge")
// replace merge fields with values from replacement json
doc.MailMerge(mappings)

fields := doc.FormFields()
log.Debug().Msgf("found %d fields", len(fields))

for _, field := range fields {
    log.Debug().Msgf("Field Name: %s\tType: %s\tValue: %s", field.Name(), field.Type(), field.Value())
    if field.Type() == document.FormFieldTypeCheckBox {
        // name can be set in word via right click on the checkbox, and setting a value in "bookmark"
        // value is either "true" or "false" for checkboxes
        val, ok := mappings[field.Name()]
        isChecked := ok && strings.ToLower(val) == "true"
        field.SetChecked(isChecked)
    }
}

log.Debug().Str("template", templatePath).Msg("apply mappings")
err = fillMappings(ctx, doc, mappings)
if err != nil {
    return "", err
}

// doc has to be copied so the eventually added images of barcodes are also exported to the PDF
renewedDoc, err := doc.Copy()
if err != nil {
    return "", err
}
defer renewedDoc.Close()

temporaryDocxFile, err := os.CreateTemp(directory, "*.docx")
if err != nil {
    // temporaryDocxFile is nil when CreateTemp fails, so report the directory instead of its name
    return "", fmt.Errorf("can not create temporary file in %s for docx with filled mappings: %w", directory, err)
}
defer temporaryDocxFile.Close()

log.Debug().Msgf("processFile(). Dir: %s.", directory)
err = renewedDoc.SaveToFile(temporaryDocxFile.Name())
if err != nil {
    return "", fmt.Errorf("could not save docx copy with filled mappings to file: %w", err)
}
```
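The checkbox handling in the snippet above can be isolated into a small helper that is testable without the library. This is a sketch under assumptions: the name `checkboxState` is hypothetical, `mappings` is assumed to be a `map[string]string`, and trimming surrounding whitespace is an addition that the original code does not do.

```go
package main

import (
	"fmt"
	"strings"
)

// checkboxState reports whether the mapping for fieldName holds "true",
// ignoring case and surrounding whitespace. A missing key counts as unchecked.
func checkboxState(mappings map[string]string, fieldName string) bool {
	val, ok := mappings[fieldName]
	return ok && strings.EqualFold(strings.TrimSpace(val), "true")
}

func main() {
	// Hypothetical field names, for illustration only.
	mappings := map[string]string{"newsletter": "TRUE", "terms": "false"}
	fmt.Println(checkboxState(mappings, "newsletter")) // true
	fmt.Println(checkboxState(mappings, "terms"))      // false
	fmt.Println(checkboxState(mappings, "missing"))    // false
}
```

The result would then be passed to `field.SetChecked` in the loop over the form fields.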

PDF generation:

```go
completed, err := document.Open(path)
if err != nil {
    return fmt.Errorf("unidoc failed to open source file: %w", err)
}
defer completed.Close()

// convert the filled document and write the resulting PDF
pdfDoc := convert.ConvertToPdf(completed)
err = pdfDoc.WriteToFile(pdfFile)
if err != nil {
    return fmt.Errorf("unidoc failed to generate PDF: %w", err)
}
```

Versions are:
github.com/unidoc/unioffice v1.34.0
github.com/unidoc/unipdf/v3 v3.59.0

If you need more information or details, don't hesitate to ask.

sampila commented 6 days ago

Hi @TheGoderGuy,

Yes, we can confirm the issue with the PDF results in Chrome and Chromium, but we use Adobe Acrobat to visually verify the results.

We will try to find out what is causing the issue when the PDF is viewed in Chrome or Chromium.

Here are the results in Adobe Acrobat for example_libre.pdf and example_unidoc.pdf:

Screenshot 2024-06-27 at 11 00 53 Screenshot 2024-06-27 at 11 01 07

The two renderings are almost identical.

martint17r commented 5 days ago

@sampila Thank you for clarifying - our customers mostly use chrome, so the solace of having nearly identical display in Adobe is nice to know, but not relevant to achieve customer satisfaction on our side.

sampila commented 5 days ago

Hi @martint17r

Yes, our team is currently investigating this issue. We would like to achieve the same results in web-based PDF viewers as well.