UglyToad / PdfPig

Read and extract text and other content from PDFs in C# (port of PDFBox)
https://github.com/UglyToad/PdfPig/wiki
Apache License 2.0

Support opacity for colors? #368

Closed theolivenbaum closed 1 year ago

theolivenbaum commented 3 years ago

Hi and great work with the library! I'm migrating some code to PdfPig and was wondering if it is possible to add a transparent or semi-transparent color? We used it to render invisible text over an image, and to render semi-transparent bounding boxes.

I saw that SetNonStrokeColorDeviceRgb only supports RGB, but it would be great to have ARGB support too!

theolivenbaum commented 2 years ago

hi @EliotJones, any idea on this one? happy to send a PR if you could give some guidance on where to start!

EliotJones commented 2 years ago

Hi, sorry, I forgot to reply to this. I had a look through the spec and couldn't see an obvious, simple way to support transparency. It looks remarkably complicated: it's a whole chapter of the spec, starting from page 320 here: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf

theolivenbaum commented 2 years ago

Hi @EliotJones - no worries! I know how it is :)

Thanks for checking it. Perhaps the solution for what I'm looking for is something other than transparency: what I wanted was a way to add "invisible text" on top of an image (think of a scanned PDF to which we add the results of an OCR model).

It seems like this is supported by the spec with some kind of rendering mode option:

[image: screenshot from the spec, not preserved]

Can one set this while adding text to a page?

Cheers and thanks again for the great work!

JOT85 commented 2 years ago

I've implemented this privately, and I'd love to contribute the feature if you're open to it, @EliotJones.

If so, which public API would you prefer? A direct SetTextRenderingMode method, which adds a SetTextRenderingMode operation to the content stream (it applies to all future text until called again); an AddInvisibleText method, which adds invisible text and then resets the rendering mode; or both?
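
For illustration, the difference between the two API shapes can be sketched against a toy content-stream writer. Everything here (ToyPageBuilder and its methods) is hypothetical and is not PdfPig's actual API. The only parts taken from the PDF spec are the Tr operator and the meaning of mode 0 (fill) and mode 3 (invisible). String escaping, fonts, and BT/ET text blocks are left out on purpose.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a page builder; NOT PdfPig's real API.
public class ToyPageBuilder
{
    private readonly List<string> operations = new List<string>();
    private int currentMode = 0; // PDF default: fill text

    public IReadOnlyList<string> Operations => operations;

    // Option 1: direct control. The mode stays in effect for all later
    // text until this method is called again.
    public void SetTextRenderingMode(int mode)
    {
        currentMode = mode;
        operations.Add(mode + " Tr");
    }

    // Simplified text output; real code would escape the string.
    public void AddText(string text)
    {
        operations.Add("(" + text + ") Tj");
    }

    // Option 2: convenience wrapper that switches to mode 3 (invisible),
    // writes the text, then restores whichever mode was active before.
    public void AddInvisibleText(string text)
    {
        int previous = currentMode;
        SetTextRenderingMode(3);
        AddText(text);
        SetTextRenderingMode(previous);
    }
}
```

With option 1 alone, a caller who forgets to reset the mode makes all later text invisible too. Option 2 avoids that, at the cost of two extra operations per call.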

EliotJones commented 2 years ago

Hi @JOT85, thanks. I think SetTextRenderingMode is probably preferable, since it gives direct control to consumers. I'd maybe just include a doc-comment to the effect of "use mode 3 to add invisible text" or similar. Maybe the method parameter should be an enum:

enum RenderingMode {
   FillText = 0,
   // etc
}

there might already be one internally that could be publicly exposed.
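
As a rough sketch of what the filled-in enum could look like: the numeric values below follow the text rendering modes 0 to 7 in the PDF spec, but the member names and the ToOperation helper are assumptions made for illustration and may not match whatever PdfPig already defines internally.

```csharp
// Hypothetical enum; names are illustrative, numeric values follow the PDF spec.
public enum RenderingMode
{
    FillText = 0,
    StrokeText = 1,
    FillThenStrokeText = 2,
    Invisible = 3,                 // neither fill nor stroke
    FillTextAndClip = 4,
    StrokeTextAndClip = 5,
    FillThenStrokeTextAndClip = 6,
    ClipOnly = 7                   // add text to the clipping path only
}

public static class RenderingModeExtensions
{
    // Formats the content stream operation for a mode, e.g. "3 Tr".
    public static string ToOperation(this RenderingMode mode)
    {
        return (int)mode + " Tr";
    }
}
```

A call such as RenderingMode.Invisible.ToOperation() would then produce the operation for invisible text, which is the OCR-overlay case discussed above.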