ArgusMagnus / PDFiumSharp

.NET wrapper around Google's PDFium library

Missing text when rendering to a bitmap #30

Open seanf711 opened 1 year ago

seanf711 commented 1 year ago

I have PDFs that I need to convert to images. They convert successfully, but some of them are missing text. It seems that in fillable PDFs, the filled-in text does not appear when rendering to a bitmap. I tried adding the Annotations enum value, thinking fillable text might be considered an annotation, but that did not fix the problem. I am sure I must be missing something, but I could not find it. This is an example of a PDF converted to an image. All of the spots for name etc. should have text, and did in the original PDF. [image: imagegrab]

bradleypeet commented 8 months ago

Did you specify the PDFiumSharp.RenderingFlags.Annotations flag when rendering?

seanf711 commented 8 months ago

This is the line of code I am using to render to an image: page.Render(bitmap, PDFiumSharp.Enums.PageOrientations.Normal, PDFiumSharp.Enums.RenderingFlags.Annotations | PDFiumSharp.Enums.RenderingFlags.Printing);

bradleypeet commented 8 months ago

Apparently "annotations" are not the same thing as fillable form fields (which I believe are either AcroForm or XFA). After digging around last night, I discovered that the same issue came up many years ago in the PdfiumViewer project, and the author had to initialize the forms engine immediately after document load and make some extra calls to render the filled-in form fields. I'll be working on this today. First I will check forks of this project to see if anyone else has already fixed the issue. If not, I'll likely implement forms support in a fashion similar to the PdfiumViewer project.

seanf711 commented 8 months ago

That would be great if you could get that added in.

bradleypeet commented 8 months ago

I created my own fork of PDFiumSharp (based on another fork that has had recent work done on it) and pushed changes up today to get annotations working, including filled-in forms. Keep in mind that this fork expects build 5921 of the pdfium native binaries (from July of this year). https://github.com/bradleypeet/PDFiumSharp

seanf711 commented 8 months ago

This is great. Thanks for putting the work into it. Hopefully I am not the only one needing this particular feature.