Closed sagar-kalburgi-ripcord closed 3 months ago
Hi @sagar-kalburgi-ripcord,
We tried to reproduce the issue, but merging S19-1026-NLP-Tasks.pdf with our sample PDF, document-header-and-footer-simple, works fine.
It is likely that your system doesn't have the DejaVuSans font installed.
/*
 * Basic merging of PDF files.
 * Simply loads all pages for each file and writes to the output file.
 * See pdf_merge_advanced.go for a more advanced version which handles merging document forms (acro forms) also.
 *
 * Run as: go run pdf_merge.go output.pdf input1.pdf input2.pdf input3.pdf ...
 */

package main

import (
	"fmt"
	"os"

	"github.com/unidoc/unipdf/v3/common/license"
	"github.com/unidoc/unipdf/v3/model"
)

func init() {
	// Make sure to load your metered License API key prior to using the library.
	// If you need a key, you can sign up and create a free one at https://cloud.unidoc.io
	err := license.SetMeteredKey(os.Getenv(`UNIDOC_LICENSE_API_KEY`))
	if err != nil {
		panic(err)
	}
}

func main() {
	if len(os.Args) < 4 {
		fmt.Printf("Requires at least 3 arguments: output_path and 2 input paths\n")
		fmt.Printf("Usage: go run pdf_merge.go output.pdf input1.pdf input2.pdf input3.pdf ...\n")
		os.Exit(0)
	}

	outputPath := ""
	inputPaths := []string{}

	// Sanity check the input arguments.
	for i, arg := range os.Args {
		if i == 0 {
			continue
		} else if i == 1 {
			outputPath = arg
			continue
		}
		inputPaths = append(inputPaths, arg)
	}

	err := mergePdf(inputPaths, outputPath)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("Complete, see output file: %s\n", outputPath)
}

func mergePdf(inputPaths []string, outputPath string) error {
	pdfWriter := model.NewPdfWriter()

	for _, inputPath := range inputPaths {
		pdfReader, f, err := model.NewPdfReaderFromFile(inputPath, nil)
		if err != nil {
			return err
		}
		defer f.Close()

		numPages, err := pdfReader.GetNumPages()
		if err != nil {
			return err
		}

		for i := 0; i < numPages; i++ {
			pageNum := i + 1

			page, err := pdfReader.GetPage(pageNum)
			if err != nil {
				return err
			}

			err = pdfWriter.AddPage(page)
			if err != nil {
				return err
			}
		}
	}

	fWrite, err := os.Create(outputPath)
	if err != nil {
		return err
	}
	defer fWrite.Close()

	err = pdfWriter.Write(fWrite)
	if err != nil {
		return err
	}

	return nil
}
document-header-and-footer-simple.pdf
go run main.go output.pdf document-header-and-footer-simple.pdf S19-1026-NLP-Tasks.pdf
Could you try installing the DejaVuSans font and running the code again?
Hi @sampila, First of all, thank you for providing suggestions and the sample code!
I've been working with @sagar-kalburgi-ripcord on this, and I tried installing the font, which didn't solve the issue on our service.
Using the code you provided, it does work, indeed. However, we are trying to ensure PDF/A compatibility, and as such, I made a small change to your code. The same error persists after my changes, and I confirmed that the font was installed!
Given the above, I have a few questions:
If DejaVuSans in particular is the issue, can we specify or override the fallback fonts?
For reference, here's the code with the changes I mentioned:
/*
 * Basic merging of PDF files.
 * Simply loads all pages for each file and writes to the output file.
 * See pdf_merge_advanced.go for a more advanced version which handles merging document forms (acro forms) also.
 *
 * Run as: go run pdf_merge.go output.pdf input1.pdf input2.pdf input3.pdf ...
 */

package main

import (
	"fmt"
	"os"

	"github.com/unidoc/unipdf/v3/common/license"
	"github.com/unidoc/unipdf/v3/model"
	"github.com/unidoc/unipdf/v3/model/pdfa"
)

func init() {
	// Make sure to load your metered License API key prior to using the library.
	// If you need a key, you can sign up and create a free one at https://cloud.unidoc.io
	err := license.SetMeteredKey(os.Getenv(`UNIDOC_LICENSE_API_KEY`))
	if err != nil {
		panic(err)
	}
}

func main() {
	if len(os.Args) < 4 {
		fmt.Printf("Requires at least 3 arguments: output_path and 2 input paths\n")
		fmt.Printf("Usage: go run pdf_merge.go output.pdf input1.pdf input2.pdf input3.pdf ...\n")
		os.Exit(0)
	}

	outputPath := ""
	inputPaths := []string{}

	// Sanity check the input arguments.
	for i, arg := range os.Args {
		if i == 0 {
			continue
		} else if i == 1 {
			outputPath = arg
			continue
		}
		inputPaths = append(inputPaths, arg)
	}

	err := mergePdf(inputPaths, outputPath)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("Complete, see output file: %s\n", outputPath)
}

func mergePdf(inputPaths []string, outputPath string) error {
	pdfWriter := model.NewPdfWriter()

	// Apply PDF/A-1a Standard with default options
	pdfWriter.ApplyStandard(model.StandardApplier(pdfa.NewProfile1A(pdfa.DefaultProfile1Options())))

	for _, inputPath := range inputPaths {
		pdfReader, f, err := model.NewPdfReaderFromFile(inputPath, nil)
		if err != nil {
			return err
		}
		defer f.Close()

		numPages, err := pdfReader.GetNumPages()
		if err != nil {
			return err
		}

		for i := 0; i < numPages; i++ {
			pageNum := i + 1

			page, err := pdfReader.GetPage(pageNum)
			if err != nil {
				return err
			}

			err = pdfWriter.AddPage(page)
			if err != nil {
				return err
			}
		}
	}

	fWrite, err := os.Create(outputPath)
	if err != nil {
		return err
	}
	defer fWrite.Close()

	err = pdfWriter.Write(fWrite)
	if err != nil {
		return err
	}

	return nil
}
Hi @rcosta-ripcord, thanks for providing more detail on this.
We are investigating the issue.
Hi @rcosta-ripcord and @sagar-kalburgi-ripcord,
We are experimenting with the PDF/A process: when the embedded font cannot be retrieved from the PDF, we fall back to an available standard font. Here are the current results.
What do you think? Are the results acceptable, and do they leave your current use case unaffected?
Hi @sampila, would it be possible to show what the result would look like with the document @sagar-kalburgi-ripcord attached? I'd also like to know if there's any PR available we can test with
Hi, output1.pdf is the result for S19-1026-NLP-Tasks.pdf.
I will create a PR for this specific issue.
@sampila Sounds good. We can test against your PR and let you know if it works out for us. Thanks!
Hi @sagar-kalburgi-ripcord and @rcosta-ripcord, I created the PR and mentioned this issue in it. Could you check it?
Hi @sampila we were unable to find any PR linked to this issue. Could you pls post a link to it here?
The PR can be accessed through the Ripcord account that has been added to the unipdf source code repository; you can access the PR using that account.
Hi @sampila, neither of us are able to find any PR although both of us are logged into our Ripcord account on Github
Hi @sagar-kalburgi-ripcord, could you check again? Your account should already have access to the PR. You can fork it.
Hi @sampila. I got access to your Org, however is it possible to add @rcosta-ripcord to your Org as well? he is actively testing these changes right now.
Regarding that, @rcosta-ripcord can fork from your forked repo, as we currently give access to only one member of an organization.
Hi @sampila, @sagar-kalburgi-ripcord and I just tested your PR and it does fix our issue. Please let us know once you merge and release it so we can update the dependency on our services!
Thank you for your help!
Hi @rcosta-ripcord, thanks for the confirmation. We are adding this issue to our test cases and preparing a new UniPDF release. We will notify you after the release.
Hi @sagar-kalburgi-ripcord and @rcosta-ripcord,
We released a new UniPDF version that fixes this issue: https://github.com/unidoc/unipdf/releases/tag/v3.56.0
We are closing this issue for now; you can re-open it if the latest version does not resolve it.
Best regards, Alip
Description
Hi, when I use unipdf to merge the attached PDF file with another PDF file, it throws this error
one of the font objects syntax is not valid - BaseFont undefined: Dict(\"BaseFont\": DejaVuSans, \"CharProcs\": IObject:567, \"Encoding\": Dict(\"Differences\": [46, period, 48, zero, one, two, three, four, five, six, seven, eight, nine, 75, K, 77, M, 97, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, 119, w, 121, y], \"Type\": Encoding, ), \"FirstChar\": 0, \"FontBBox\": [-1021, -463, 1794, 1233], \"FontDescriptor\": IObject:604, \"FontMatrix\": [0.001000, 0, 0, 0.001000, 0, 0], \"LastChar\": 255, \"Name\": DejaVuSans, \"Subtype\": Type3, \"Type\": Font, \"Widths\": IObject:605, )
But Adobe reader and Chrome PDF reader are able to render the PDF document without reporting any font related issues at all. So not sure why only unipdf is running into this. It may be that the document itself has the font configured incorrectly, but Adobe reader and Chrome have no problem rendering it correctly at all.
Expected Behavior
Unipdf needs to handle the merge seamlessly.
Actual Behavior
Use unipdf merge functionality using the attached PDF file and another PDF file of your choice to reproduce the error.
Attachments
S19-1026-NLP-Tasks.pdf