ststeiger / PdfSharpCore

Port of the PdfSharp library to .NET Core - largely removed GDI+ (only missing GetFontData - which can be replaced with freetype2)

Draw a PNG image with transparency #41

Closed jjavierdguezas closed 5 years ago

jjavierdguezas commented 5 years ago

Hi, first, thanks for porting this to .NET Standard. I'm having problems reading a PDF file and drawing a PNG image, loaded from a file, onto it. This is my code:

using System.IO;
using PdfSharpCore.Drawing;
using PdfSharpCore.Pdf;
using PdfSharpCore.Pdf.IO;

namespace PdfTest
{
    class Program
    {
        static void Main(string[] args)
        {
            if (File.Exists("edited.pdf"))
                File.Delete("edited.pdf");

            PdfDocument document = PdfReader.Open("original.pdf", PdfDocumentOpenMode.Modify);
            var page = document.Pages[0];
            var gfx = XGraphics.FromPdfPage(page);
            // var image = XImage.FromFile("Firma2.jpg");
            var image = XImage.FromFile("Firma.png");
            // Scale each axis by its own resolution.
            double width = image.PixelWidth * 4.5 / image.HorizontalResolution;
            double height = image.PixelHeight * 4.5 / image.VerticalResolution;
            gfx.DrawImage(image, 51, page.Height - 140, width, height);
            document.Save("edited.pdf");
        }
    }
}

When I use the NuGet package PdfSharpCore v1.1.8, a black rectangle is drawn instead of the image. I downgraded through earlier versions and it occurs in all of them.

When I use a JPG file, it is drawn OK, but JPG has no transparency and I need that.

I reproduced the same piece of code with the same data using the original NuGet package PDFsharp v1.50.5147 (on .NET Framework v4.7) and the image is drawn correctly.

Here is the data I am using for tests: edited.pdf Firma.png Firma2.jpg original.pdf

jjavierdguezas commented 5 years ago

Reading other issues, I saw that the NuGet package I mentioned is not yours, so I downloaded your source code, built it, and used it with the same sample. The output was the same: a black rectangle. Maybe I'm using the library incorrectly... please, any help is welcome.

jjavierdguezas commented 5 years ago

@ststeiger help please?

jjavierdguezas commented 5 years ago

@Sappharad can you please help me?

jjavierdguezas commented 5 years ago

Please confirm whether PNG is supported. I think that maybe only JPEG is supported and the black is just the default background color. Or maybe I just need to do something with the image library.

Sappharad commented 5 years ago

The PDF format does not support PNG natively. PDF only supports JPEG-related formats internally: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf (see chapter 7.4.9).

In the original PDFSharp, all images are converted into JPEG before being embedded into the document. It is a common practice in the printing industry to "flatten" transparent elements before putting them into a PDF because you can't tell a printer to print part of a color, the look of the document at each point needs to be calculated ahead of time so it can be printed. Since PDF is designed to represent printable documents, this is why transparency is not a native feature for images. They do have masking features that can be used for non-rectangular images, but I don't think PDFSharp has that.

As to why PNG does not work for you, I haven't looked at your images and can't right now. I would convert them to JPEG since that's what will end up inside the PDF anyway.

jjavierdguezas commented 5 years ago

Thanks @Sappharad for your answer (I had started to believe that nobody was going to answer me). However, with the original PdfSharp package, PNG images with transparency work OK. In fact PdfSharpNetStandard, which is also from @ststeiger, works too, but I can't use it because it depends on System.Drawing. I think that it is maybe a bug in this particular lib, or I'm using it incorrectly.

jjavierdguezas commented 5 years ago

When I say that it works well, I mean that the image is drawn ok and with transparency (that is, if I put it on top of text, it does not hide the text). I need it to be png, because I need to embed a signature in a document, that's why I need transparency

jjavierdguezas commented 5 years ago

I am debugging and tracing the flow of my example, and everything I see relates to JPEG. Even though the loaded image is a PNG, and the library that reads it recognizes it as such, it feels like JPEG is being forced. In this line I see something related to images that are not JPEG, but for some reason execution never gets there (although the library that loads the image recognizes that it is PNG).

I changed some things to try to reach that code, but I only get empty images or a rectangle filled with a color such as violet.

I think either this library does not support PNG or I'm doing something wrong. I do not know what the other libraries, such as PdfSharp and PdfSharpNetStandard, do differently, but with them it works.

Sappharad commented 5 years ago

Based on the code that you linked, I agree that it looks like PNG hasn't been implemented yet. I haven't attempted to reproduce your problem though, so this is just by observation.

ImageSharp is capable of loading and creating PNG, I have used it in another project for that purpose. It also is capable of converting your PNG into a RAW RGBA image which that code appears to need. You might need to contribute that logic yourself though if nobody else has time to look at it.

If I were actually trying to use it for PNG, I might be willing to add that for you but I'm not at the moment.

jjavierdguezas commented 5 years ago

I truly hope that someday @ststeiger reads this issue and says something... I give up. I moved my logic to an app where I can use PdfSharpNetStandard, so thank you for that lib! 😅

ststeiger commented 5 years ago

@jjavierdguezas: I have read it one day after you created it. Sorry, no time at the moment.

By the way, I didn't know PDF only supports JPEG internally. That explains a few of the issues I've been having with images in PDF-exports in ReportServer. Thanks to @Sappharad for that piece of information.

@jjavierdguezas: As Sappharad said, you could always just manually convert your image to jpg with ImageSharp. Obviously, you need to determine which color the transparent data will have. Since you're printing it on white, that's easy to say. So I don't much see where your problem is (apart from the possible lack of ImageSharp sample code). You now know all that needs to be done. I'm not writing your program for you.

And maybe if I have time, I will add this "feature" - which seems rather dangerous IMHO- easy to get wrong. A more informative error message would be more appropriate, though, and an image format conversion example couldn't hurt either.

Worst case you could always do something along the lines of loop for each x, for each y in oldImage => NewImage.SetPixel(x,y, ConvertRgbaToRgb(oldImage.GetPixel(x,y)) ), or something like that.
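For anyone who does want the flatten-to-JPEG route, the per-pixel loop described above can be sketched as follows. This is a minimal sketch, not PdfSharpCore or ImageSharp code: the class and method names (`AlphaFlattening`, `FlattenRgbaToRgb`) are hypothetical, and it assumes the pixels are already available as a raw byte array in R, G, B, A order with straight (non-premultiplied) alpha.

```csharp
using System;

public static class AlphaFlattening
{
    // Composites one 8-bit channel over a background channel using straight alpha.
    private static byte BlendChannel(byte foreground, byte background, byte alpha)
    {
        // Integer form of fg * a + bg * (1 - a) with a in [0, 255], rounded to nearest.
        return (byte)((foreground * alpha + background * (255 - alpha) + 127) / 255);
    }

    // Converts RGBA (4 bytes per pixel) to RGB (3 bytes per pixel) over a solid
    // background color. The background defaults to white.
    public static byte[] FlattenRgbaToRgb(byte[] rgba, byte bgR = 255, byte bgG = 255, byte bgB = 255)
    {
        if (rgba == null)
            throw new ArgumentNullException(nameof(rgba));
        if (rgba.Length % 4 != 0)
            throw new ArgumentException("Expected 4 bytes per pixel.", nameof(rgba));

        var rgb = new byte[rgba.Length / 4 * 3];
        for (int src = 0, dst = 0; src < rgba.Length; src += 4, dst += 3)
        {
            byte alpha = rgba[src + 3];
            rgb[dst] = BlendChannel(rgba[src], bgR, alpha);
            rgb[dst + 1] = BlendChannel(rgba[src + 1], bgG, alpha);
            rgb[dst + 2] = BlendChannel(rgba[src + 2], bgB, alpha);
        }
        return rgb;
    }
}
```

Flattening only gives the expected result when the background color is known in advance. It does not help when the image has to sit on top of existing page content such as text.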

jjavierdguezas commented 5 years ago

Hello @ststeiger, I NEVER intended for you to write the code for me. In fact, my code is exactly what I posted above, no more and no less. Having said that, my English is probably not very good and you may not have understood me. I need transparency because I need to embed a signature in a document that can contain text, so neither a white background nor any other color is useful. The signature can sit on top of text, as in real life, but it should not hide more of it than necessary. Do you understand? If PDF does not accept PNG, OK, I accept that, but for some reason the other library, yours, works (luckily for me). Thanks for everything.

download_20190613_093732

ststeiger commented 5 years ago

Hm, this is strange - this shouldn't work if it only uses jpeg-internally. Unless the PDF-Renderer treats white as transparent or something like that. I'll look into what they do differently when I have the time.

Sappharad commented 5 years ago

I forgot to mention this and it's probably important...

While PDF doesn't support PNG natively, per the link I provided to the PDF spec it does support JPEG2000 which supports transparency. I don't know if that's how it is done when transparency works, but converting images to JPEG2000 might be the solution when you want to preserve transparency. It's still not recommended to have transparent elements in a PDF if you want to actually print the PDF on a printer, but there's at least some way to do it.

GDI+ doesn't do JPEG2000 so I don't think this is how it works in the GDI version. I wonder how that does work...

jjavierdguezas commented 5 years ago

@Sappharad So, are you saying that my modified PDF, which looks so good on my computer, may not look good when I print it? 😬

ststeiger commented 5 years ago

@jjavierdguezas: Yea - you'll just have to print it to know for sure. I'm sure your company has a printer somewhere - Alternatively, if you can upload the pdf somewhere (without virus), my company has one, and I can try at lunchbreak tomorrow ;)

jjavierdguezas commented 5 years ago

@ststeiger hahaha, I truly don't know if you are joking or being serious 😅... I'm at home now and I don't have a printer nearby. Tomorrow at work I'm going to print some files... wish me luck, because if they don't print OK I'll be back at square one 😬

Sappharad commented 5 years ago

The transparency concern won't really matter to you; I just brought it up because it's a reason why transparency is discouraged for PDF. With the type of transparency you're using (0% / 100%) it doesn't usually matter. The transparency problem becomes an actual concern to people who are picky about color when you need blending, such as partial transparency for an image of a glass container. PNG images are RGB, but JPEG is usually CMYK or YCbCr with an embedded profile such as sRGB, so it knows what conversion to use. CMYK (Cyan, Magenta, Yellow, Black) happens to be the colors of ink / toner in most color printers, which is probably why PDF prefers JPEG too.

The problem is that if you have transparency, you're blending images converted from a different color space and the results might be different than if your computer was blending RGB programmatically. For your example, it's 100% or 0% transparent so color blending doesn't actually matter.

Knowing this doesn't really help you, but I thought it might be interesting to explain.
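To check which case a given image falls into, the alpha channel can be scanned for values other than fully transparent and fully opaque. The following is a minimal sketch under stated assumptions: the pixels are available as a raw byte array in R, G, B, A order, and the names `AlphaInspection`, `AlphaKind`, and `Classify` are hypothetical and not part of PdfSharpCore or ImageSharp.

```csharp
using System;

public static class AlphaInspection
{
    public enum AlphaKind
    {
        Opaque,  // every pixel has alpha 255
        Binary,  // only alpha 0 and 255 occur, so no color blending is needed
        Partial  // at least one pixel is semi-transparent and must be blended
    }

    // Classifies the alpha channel of an RGBA buffer (4 bytes per pixel).
    public static AlphaKind Classify(byte[] rgba)
    {
        if (rgba == null)
            throw new ArgumentNullException(nameof(rgba));
        if (rgba.Length % 4 != 0)
            throw new ArgumentException("Expected 4 bytes per pixel.", nameof(rgba));

        bool sawTransparent = false;
        for (int i = 3; i < rgba.Length; i += 4)
        {
            byte alpha = rgba[i];
            if (alpha == 0)
                sawTransparent = true;
            else if (alpha != 255)
                return AlphaKind.Partial; // one semi-transparent pixel is enough
        }
        return sawTransparent ? AlphaKind.Binary : AlphaKind.Opaque;
    }
}
```

Scanned signatures with anti-aliased edges may well report `Partial`, because edge pixels are often semi-transparent even when the image looks like a simple cut-out.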

jjavierdguezas commented 5 years ago

Well, I printed some modified PDF files and they look good, thanks to God, @Sappharad and @ststeiger (not necessarily in that order 😉)

Sappharad commented 5 years ago

Check out PR 45 - I wouldn't call this a great implementation because it's just enough to get PNG transparency to work. I also have not re-tested JPEG yet to make sure I didn't break anything https://github.com/ststeiger/PdfSharpCore/pull/45

Will follow up once I've confirmed everything looks good on my end.

Someone identified a scenario where we use PNGs with transparency, and since it was only about half a day of work to fix, I was able to go and do it.

Sappharad commented 5 years ago

My original PR 'broke' things and was detecting everything as PNG and embedding it as a PDF Bitmap. This was because ImageSharp was internally loading everything as RGBA32 and I was using color space to detect transparency.
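The pitfall described above is that the decoded pixel format says nothing about the source file once the decoder normalizes everything to RGBA32. One way to detect the container format independently of the decoder is to look at the file signature. The sketch below is an illustration of that idea, not the code from the PR (which presumably relies on the format reported by ImageSharp); the names `ImageFormatSniffer`, `Kind`, and `Detect` are hypothetical.

```csharp
using System;

public static class ImageFormatSniffer
{
    public enum Kind { Unknown, Png, Jpeg }

    // The fixed 8-byte signature that starts every PNG file.
    private static readonly byte[] PngSignature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // JPEG files start with the SOI marker (FF D8) followed by another marker (FF ..).
    private static readonly byte[] JpegPrefix = { 0xFF, 0xD8, 0xFF };

    // Detects the format from the first bytes of a file. It never inspects pixel data.
    public static Kind Detect(byte[] header)
    {
        if (header == null)
            throw new ArgumentNullException(nameof(header));
        if (StartsWith(header, PngSignature))
            return Kind.Png;
        if (StartsWith(header, JpegPrefix))
            return Kind.Jpeg;
        return Kind.Unknown;
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length)
            return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i])
                return false;
        }
        return true;
    }
}
```

Only the first eight bytes of the file are needed, so the check can run before the image is decoded at all.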

I fixed that, and now everything goes JPEG except actual PNG which is correctly detected by format and converted to PDF Bitmap with transparency. For PNG, this is equivalent to how GDI PdfSharp was doing it. Despite being called Bitmap, PDF Bitmap is still compressed and in some cases will be smaller than the original PNG so you don't need to worry about massive file sizes from this change.

It seems to work correctly in mixed format scenarios now and I'm satisfied with it. Let me know if you notice any problems.
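As background on the "PDF Bitmap with transparency" mentioned above: since PDF 1.4, an image can carry its transparency as a separate grayscale soft mask (the `/SMask` entry of the image dictionary), and both the color samples and the mask can be stored zlib-compressed with the `FlateDecode` filter. The sketch below shows only that data preparation step. It is not the code from the PR, the names `SoftMaskSketch`, `SplitRgba`, and `Deflate` are hypothetical, and it assumes .NET 6 or later for `ZLibStream`.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class SoftMaskSketch
{
    // Splits RGBA samples (4 bytes per pixel) into RGB color samples
    // (3 bytes per pixel) and an 8-bit alpha plane (1 byte per pixel).
    public static (byte[] Rgb, byte[] Alpha) SplitRgba(byte[] rgba)
    {
        if (rgba == null)
            throw new ArgumentNullException(nameof(rgba));
        if (rgba.Length % 4 != 0)
            throw new ArgumentException("Expected 4 bytes per pixel.", nameof(rgba));

        int pixels = rgba.Length / 4;
        var rgb = new byte[pixels * 3];
        var alpha = new byte[pixels];
        for (int p = 0; p < pixels; p++)
        {
            rgb[p * 3] = rgba[p * 4];
            rgb[p * 3 + 1] = rgba[p * 4 + 1];
            rgb[p * 3 + 2] = rgba[p * 4 + 2];
            alpha[p] = rgba[p * 4 + 3];
        }
        return (rgb, alpha);
    }

    // Compresses a buffer in zlib format, which is what FlateDecode expects.
    public static byte[] Deflate(byte[] data)
    {
        if (data == null)
            throw new ArgumentNullException(nameof(data));

        using var output = new MemoryStream();
        using (var zlib = new ZLibStream(output, CompressionLevel.Optimal, leaveOpen: true))
        {
            zlib.Write(data, 0, data.Length);
        }
        // The zlib stream is disposed at this point, so all data has been flushed.
        return output.ToArray();
    }
}
```

A signature image is mostly uniform (large transparent areas), which is the kind of data that compresses well with zlib. That is consistent with the remark above that the embedded bitmap can end up smaller than the original PNG.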

ststeiger commented 5 years ago

@Sappharad: Merged. Thanks.

jjavierdguezas commented 5 years ago

I tested it and it works well, the code that I put in this issue works fine too so I'm closing this issue. Thanks @Sappharad

alirazafalconit commented 3 years ago

Nice Work @ststeiger