sungaila / PDFtoImage

A .NET library to render PDF files into images.
https://www.sungaila.de/PDFtoImage/
MIT License
176 stars, 19 forks

Big file sizes generate blank pictures #18

Closed SimonG5 closed 1 year ago

SimonG5 commented 1 year ago

When you upload PDF files over 3 KB, the returned image turns blank. Could this be a buffer issue?

    IAsyncEnumerable<SKBitmap> images = Conversion.ToImagesAsync(data, null, 100);
    await foreach (SKBitmap image in images)
    {
        var encoded = image.Encode(SKEncodedImageFormat.Jpeg, 100);
        imageList.Add(Convert.ToBase64String(encoded.ToArray()));
    }
    return new OkObjectResult(imageList);

The variable `data` is a Base64 string. The operating system is Windows 11.

sungaila commented 1 year ago

Hi @SimonG5,

could you please provide a sample PDF file for this issue?

I tested your code above with this test file Wikimedia_Commons_web.pdf (7.88 MB) and the generated JPG image was fine.

SimonG5 commented 1 year ago

Hi @sungaila

Thanks for the quick response. I used this file here and it generates a blank image, but when I send this PDF I get working JPGs. I am sending the PDFs as a Base64 string from the frontend to a .NET 6 Azure Function and returning a list of Base64 strings.

sungaila commented 1 year ago

Both PDF files render just fine on my machine. Can you please confirm if the JPG files are correct by saving and opening them in an image viewer?

    IAsyncEnumerable<SKBitmap> images = Conversion.ToImagesAsync(data, null, 100);
    await foreach (SKBitmap image in images)
    {
        var encoded = image.Encode(SKEncodedImageFormat.Jpeg, 100);
        imageList.Add(Convert.ToBase64String(encoded.ToArray()));

        // FileMode.Create truncates an existing output.jpg, so no stale bytes from an earlier, larger file remain.
        using var output = new System.IO.FileStream(System.IO.Path.Combine(System.Environment.CurrentDirectory, "output.jpg"), System.IO.FileMode.Create, System.IO.FileAccess.Write);
        encoded.SaveTo(output);
        output.Close();
    }
    return new OkObjectResult(imageList);

SimonG5 commented 1 year ago

Saving the image generated this blank picture: output

sungaila commented 1 year ago

Which runtime are you using? .NET 6? Are you running Windows 11 on x64 or ARM64? Are you using PDFtoImage 2.1.2? Are the PDFium binaries (from the NuGet package bblanchon.PDFium.Win32) on version 105.0.5187?

Have you tried other encoders like SKEncodedImageFormat.Png or SKEncodedImageFormat.Bmp?

Just asking a bunch of things for troubleshooting. :-)

SimonG5 commented 1 year ago

I ended up figuring out that my base64 converter on the front end was at fault. This can be marked as resolved, the library worked as intended. Thank you for your fast help!
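
The thread does not say what exactly was wrong with the front-end converter, so the following is only a minimal sketch of a server-side sanity check that could catch some common payload problems before the string is handed to the library. It assumes the payload should be plain Base64 of a file that begins with the `%PDF-` header. `PdfPayloadCheck` and `LooksLikeBase64Pdf` are hypothetical names and are not part of PDFtoImage.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of PDFtoImage.
public static class PdfPayloadCheck
{
    // Returns true if the payload decodes as Base64 and starts with the PDF header.
    // On failure, 'problem' holds a short description of what looked wrong.
    public static bool LooksLikeBase64Pdf(string payload, out string problem)
    {
        problem = string.Empty;

        if (string.IsNullOrWhiteSpace(payload))
        {
            problem = "Payload is empty.";
            return false;
        }

        // Assumption: a leftover "data:...;base64," prefix is one possible front-end mistake.
        if (payload.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
        {
            problem = "Payload still carries a data: URL prefix.";
            return false;
        }

        byte[] bytes;
        try
        {
            bytes = Convert.FromBase64String(payload);
        }
        catch (FormatException)
        {
            problem = "Payload is not valid Base64.";
            return false;
        }

        // PDF files normally begin with the ASCII marker "%PDF-".
        if (bytes.Length < 5 || Encoding.ASCII.GetString(bytes, 0, 5) != "%PDF-")
        {
            problem = "Decoded bytes do not start with the %PDF- header.";
            return false;
        }

        return true;
    }
}
```

A check like this only confirms that the payload decodes and looks like a PDF at the start. It cannot detect bytes that were corrupted further into the file.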

sungaila commented 1 year ago

Glad to hear that!

angelru commented 1 year ago

@sungaila

    var base64Images = new List<string>();

    var sKBitmaps = PDFtoImage.Conversion.ToImagesAsync(signeFileRequest.File, null, dpi: 100);

    await foreach (var sKBitmap in sKBitmaps)
    {
        using var sKData = sKBitmap.Encode(SkiaSharp.SKEncodedImageFormat.Jpeg, 70);
        base64Images.Add(Convert.ToBase64String(sKData.ToArray()));
    }

It always generates the same Base64 string for all the pages, but if I save the images, everything is correct. Any ideas?

sungaila commented 1 year ago

@angelru Are you sure the Base64 strings are identical? In my test case, the first 100 characters are equal, but they differ after that.
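
A shared prefix is plausible because images encoded with the same format and settings tend to start with similar header bytes. To check whether two strings really are identical, comparing them in full is more reliable than eyeballing the beginning. A minimal sketch (`Base64Compare` and `FirstDifference` are hypothetical names used only for illustration):

```csharp
using System;

// Hypothetical helper for illustration only; not part of PDFtoImage.
public static class Base64Compare
{
    // Returns the index of the first differing character,
    // or -1 if both strings are identical.
    public static int FirstDifference(string a, string b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i])
            {
                return i;
            }
        }

        // Same prefix: the strings differ only if one is longer than the other.
        return a.Length == b.Length ? -1 : shared;
    }
}
```

Calling `Base64Compare.FirstDifference(base64Images[0], base64Images[1])` on the list from the snippet above would return -1 only if the two pages really produced the same string.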

angelru commented 1 year ago

@sungaila You are right, I looked too quickly. Thanks for this wonderful library!

angelru commented 1 year ago

@sungaila Do the height and width of the images correspond to those of the PDF pages?