sungaila / PDFtoImage

A .NET library to render PDF files into images.
https://www.sungaila.de/PDFtoImage/
MIT License
152 stars, 15 forks

AOT issue #14

Closed: Bxaa closed this issue 1 year ago

Bxaa commented 2 years ago

Thanks for the project! Unfortunately, there is an issue: PDFtoImage.Conversion.GetPageCount and PDFtoImage.Conversion.ToImage cause a deadlock (an endless loop) after AOT compilation (tested with all Native AOT versions).

sungaila commented 2 years ago

Hi @Bxaa,

what kind of AOT do you mean? Do you mean ReadyToRun? Do you mean the new AOT from the .NET 7 previews?

Can you give me a sample project to test this myself?

Bxaa commented 2 years ago

I mean Native AOT (https://github.com/dotnet/runtimelab/tree/feature/NativeAOT). I used 6.0.0-rc.1.21373.1 (other versions have an issue with the parallel invoke implementation\type).

Test project (VS2022): https://drive.google.com/file/d/16pLGLsXNfGq0zZYd-wLDqELkn48neVB4/view?usp=sharing

You need to check all your P/Invokes. I think the issue can be fixed with <MarshalAs(UnmanagedType.LPWStr)> for strings together with CharSet:=CharSet.Unicode, and for some P/Invokes with <MarshalAs(UnmanagedType.LPArray)>, <MarshalAs(UnmanagedType.Bool)>, etc.

Also, some P/Invoke return values need checking. I ran into an issue with null-terminated strings for some P/Invokes under AOT, and we may have the same issue here: the returned value is the string followed by chr(0) + chr(0) + chr(0) + ..., so the trailing chr(0) characters need to be trimmed from the end of the string.

// Just an example. This declaration crashes the app after AOT:
 <DllImport("Kernel32.dll", EntryPoint:="GetFinalPathNameByHandle", SetLastError:=True, CharSet:=CharSet.Ansi)>
        Public Shared Function GetFinalPathNameByHandle(ByVal hFile As IntPtr,
        ByVal lpszFilePath As StringBuilder, ByVal cchFilePath As UInteger, ByVal dwFlags As UInteger) As UInteger
        End Function
---
// The correct declaration (W entry point, CharSet.Unicode, MarshalAs(UnmanagedType.LPWStr)) is:
 <DllImport("Kernel32.dll", EntryPoint:="GetFinalPathNameByHandleW", SetLastError:=True, CharSet:=CharSet.Unicode)>
        Public Shared Function GetFinalPathNameByHandle(ByVal hFile As IntPtr,
        <MarshalAs(UnmanagedType.LPWStr)> ByVal lpszFilePath As StringBuilder, ByVal cchFilePath As UInteger, ByVal dwFlags As UInteger) As UInteger
        End Function
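
The trimming step mentioned above could look roughly like the following C# sketch. The class and method names (NativeStrings, TrimTrailingNulls, FromBuffer) are made up for illustration and are not part of PDFtoImage; the sketch only assumes that the native call filled a StringBuilder and returned the number of characters written.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of PDFtoImage.
public static class NativeStrings
{
    // Removes the chr(0) padding that some P/Invokes leave at the end of a string.
    public static string TrimTrailingNulls(string value)
    {
        return value.TrimEnd('\0');
    }

    // Reads a buffer filled by a native call. If the reported character count
    // looks valid, it is used to cut the string before trimming.
    public static string FromBuffer(StringBuilder buffer, uint charsWritten)
    {
        string raw = buffer.ToString();
        if (charsWritten > 0 && charsWritten <= raw.Length)
        {
            raw = raw.Substring(0, (int)charsWritten);
        }
        return TrimTrailingNulls(raw);
    }
}
```

A call such as NativeStrings.FromBuffer(buffer, length) would then replace a plain buffer.ToString() after the P/Invoke returns.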

Thx

Bxaa commented 2 years ago

Don't forget to copy the native DLLs next to the native exe. And yes, all SkiaSharp.dll functions work fine after AOT, except PDFtoImage.Conversion.GetPageCount and PDFtoImage.Conversion.ToImage (same issue with the PDFtoPNG NuGet package).

sungaila commented 2 years ago

Thanks for providing additional information. I'll be looking into it!

sungaila commented 2 years ago

I checked the crash dump and found that pdfium.dll is crashing with the exception code 0xC0000005 (access violation reading 0xFFFFFFFFFFFFFFFF). This happens when the PDF document is loaded by calling the native method FORM_DoDocumentOpenAction.

Since I don't have much experience with unmanaged C development (and don't have the debug symbols for PDFium 105.0.5131 at hand), I will stop investigating here. My guess is that something is wrong in PDFium or even in NativeAOT.

I'll gladly accept your pull request, @Bxaa, if you succeed in fixing this issue!

Bxaa commented 2 years ago

Thanks for testing. I'll try to investigate and figure it out (it will be pretty hard to do...). I'll let you know if I succeed.

Bxaa commented 2 years ago

Investigating (unmanaged PDFium + AOT causes an access violation on read).

Bxaa commented 2 years ago

Good news: I was able to fix it. It's pretty simple and will always work. I'll post the code here later (after I check it again).

Bxaa commented 2 years ago

You need pdfium.dll 4648.0.0 (native PDFium.x64 without V8 or XFA support) and the x64 libSkiaSharp.dll (both are available on NuGet). AOT only supports x64. Both DLLs should be in the same folder as the natively compiled exe, or you can embed the DLLs in the native exe (with some tricks).

That's all :)

Bxaa commented 2 years ago

@sungaila

static NativeMethods()
        {
            // Load the platform dependent Pdfium.dll if it exist.
            var workingDirectory =
                Assembly.GetExecutingAssembly().GetName(false).CodeBase
                ?? Process.GetCurrentProcess().MainModule!.FileName!;

            LoadNativeLibrary(Path.GetDirectoryName(new Uri(workingDirectory).LocalPath)!);
        }

In this case all DLL search redirects are ignored, which can cause issues with .NET deployment, custom assembly settings, and embedded deployment of native DLLs.

On Windows, for NETFRAMEWORK and NETCOREAPP3_1_OR_GREATER, you should follow the Dynamic-Link Library Search Order for native DLLs (with GetDllDirectory taking top priority):

---\\---\\---
            // Default: the directory that contains the executing assembly (or the main module).
            string assemblyPath = Assembly.GetExecutingAssembly().GetName(false).CodeBase
                ?? Process.GetCurrentProcess().MainModule!.FileName!;
            string nativeDirectory = Path.GetDirectoryName(new Uri(assemblyPath).LocalPath)!;

            StringBuilder tmp = new StringBuilder(32767); // 32767 = maximum long path length
            // GetDllDirectoryW returns the number of characters written (0 if not set or on failure).
            if (NativeMethods.GetDllDirectoryW(32767, tmp) > 0 && tmp.Length > 0)
            {
                // Prefer the directory set via SetDllDirectoryW if it contains pdfium.dll.
                string dllDirectory = Path.GetFullPath(tmp.ToString());
                if (File.Exists(Path.Combine(dllDirectory, "pdfium.dll")))
                {
                    nativeDirectory = dllDirectory;
                }
            }
            LoadNativeLibrary(nativeDirectory);
---\\---\\---
public class NativeMethods
{
    // Native signature: DWORD GetDllDirectoryW(DWORD nBufferLength, LPWSTR lpBuffer)
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    public static extern uint GetDllDirectoryW(uint nBufferLength, [MarshalAs(UnmanagedType.LPWStr)] StringBuilder lpBuffer);
}

You can also check the path from the DOTNET_BUNDLE_EXTRACT_BASE_DIR environment variable as the second location in the search order.
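
The lookup order described above could be sketched like this. NativeLibraryLocator, FindDirectory and FindPdfium are hypothetical names, not part of PDFtoImage, and the third fallback (AppContext.BaseDirectory) is an assumption added for the example. Note that single-file extraction, as far as I know, places files in a subdirectory below DOTNET_BUNDLE_EXTRACT_BASE_DIR, so real code may need to look deeper than this sketch does.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of PDFtoImage.
public static class NativeLibraryLocator
{
    // Returns the first candidate directory that contains fileName, or null if none does.
    public static string? FindDirectory(IEnumerable<string?> candidates, string fileName)
    {
        foreach (string? candidate in candidates)
        {
            if (string.IsNullOrWhiteSpace(candidate))
            {
                continue; // location not set, try the next one
            }

            string directory = Path.GetFullPath(candidate);
            if (File.Exists(Path.Combine(directory, fileName)))
            {
                return directory;
            }
        }
        return null;
    }

    // Assumed order: 1) the GetDllDirectoryW result passed in by the caller,
    // 2) DOTNET_BUNDLE_EXTRACT_BASE_DIR, 3) the application base directory.
    public static string? FindPdfium(string? dllDirectory)
    {
        return FindDirectory(new string?[]
        {
            dllDirectory,
            Environment.GetEnvironmentVariable("DOTNET_BUNDLE_EXTRACT_BASE_DIR"),
            AppContext.BaseDirectory
        }, "pdfium.dll");
    }
}
```

The result of FindPdfium would be handed to LoadNativeLibrary; a null result means none of the candidate locations contained pdfium.dll.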

sungaila commented 2 years ago

> You need pdfium.dll 4648.0.0 (native PDFium.x64 without V8 or XFA support) and x64 libSkiaSharp.dll (look in nuget both) (AOT only support x64) Both should be in same folder with native compiled exe -or- You can include dll's to native exe (with some tricks)

This did not work for me. I used those native libs and the program still crashes:

Bxaa commented 2 years ago

Hi,

> This did not work for me. I used those native libs and the program will still crash:

@sungaila, do you mean my test sample, or are you compiling something else? Try this: a compiled Test_AOT.exe plus the DLLs needed to launch it in a sandbox. You can run it in the regular Windows 10 sandbox for safety :) https://drive.google.com/file/d/1CtCgE1eVkGgr4Dn2YxNPToORH4j5xFwZ/view?usp=sharing Does it work? (You should get "Success: page count - 4".)

@sungaila, please add a fix for the Dynamic-Link Library Search Order (which I described above, using the GetDllDirectory P/Invoke to get the DLL path). Your current version ignores all P/Invokes such as SetDefaultDllDirectories, SetDllDirectoryW, and AddDllDirectory called from the executable. Example: the IncludeNativeLibrariesForSelfExtract flag (it uses $HOME/.net or %TEMP%/.net, not the executable path).

Just replace your code:

 {
            var workingDirectory =
                Assembly.GetExecutingAssembly().GetName(false).CodeBase
                ?? Process.GetCurrentProcess().MainModule!.FileName!;
            LoadNativeLibrary(Path.GetDirectoryName(new Uri(workingDirectory).LocalPath)!);
        }

With code from my post

Use the SetDllDirectoryW("Some path...") P/Invoke in the main app on launch:

[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
static public extern Boolean SetDllDirectoryW([MarshalAs(UnmanagedType.LPWStr)]string lpPathName);

For example, you can use the Windows temp folder to deploy the native DLLs.
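
A minimal sketch of how the SetDllDirectoryW call could be wired into application startup, assuming the native DLLs are deployed to a folder under the temp directory. NativeDllSetup, EnsureDeployDirectory and TryUseDirectory are hypothetical names, and the "native" subfolder is an arbitrary choice. As discussed in this thread, this only helps once the library honors the DLL search order.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical startup helper, not part of PDFtoImage.
public static class NativeDllSetup
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool SetDllDirectoryW([MarshalAs(UnmanagedType.LPWStr)] string lpPathName);

    // Creates (if needed) a per-application folder under the temp directory
    // and returns its path. The native DLLs would be copied into it.
    public static string EnsureDeployDirectory(string appName)
    {
        string directory = Path.Combine(Path.GetTempPath(), appName, "native");
        Directory.CreateDirectory(directory);
        return directory;
    }

    // Adds the directory to the DLL search path.
    // Returns false on non-Windows systems or when the native call fails.
    public static bool TryUseDirectory(string directory)
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            return false; // kernel32.dll is only available on Windows
        }
        return SetDllDirectoryW(directory);
    }
}
```

At the very start of Main, before the first PDFtoImage call, one would call NativeDllSetup.TryUseDirectory(NativeDllSetup.EnsureDeployDirectory("MyApp")) and copy pdfium.dll and libSkiaSharp.dll into the returned folder ("MyApp" is a placeholder).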

Also, you may need to call ComWrappers.RegisterForMarshalling(WinFormsComInterop.WinFormsComWrappers.Instance) (add this to the first method that gets called): https://github.com/kant2002/winformscominterop (see the WinFormsComInterop NuGet package)

 public Form1()
        {
            InitializeComponent();
            ComWrappers.RegisterForMarshalling(WinFormsComInterop.WinFormsComWrappers.Instance);

UPD: Com wrapper version: https://drive.google.com/file/d/1i3s2ZuuAONe3eWj2F5orYrs2GA0A40ib/view?usp=sharing

Bxaa commented 2 years ago

@sungaila Hi, https://drive.google.com/file/d/1txkmoSjA7ttLGkFTwNUnDbzR1XGn2exS/view?usp=sharing This is a fix for your NativeMethods.cs that supports AOT and a custom DLL path (just replace the file in your repository).

I also included v_Comparator.exe in the archive (a fully native, single exe that uses your PDFtoImage assembly, for testing).

Bxaa commented 2 years ago

@sungaila (offtop)

public sealed class PdfException : Exception
    {
        public PdfError Error { get; private set; }

---\---\--- and

public enum PdfError
    {
        Success = (int)NativeMethods.FPDF_ERR.SUCCESS,
        Unknown = (int)NativeMethods.FPDF_ERR.UNKNOWN,
---\\---\\---

Making these types public would make it possible to handle errors properly from the main app (PDFtoImage.PdfiumViewer.PdfException).

Example: PdfError.PasswordProtected
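
To illustrate what a caller could do once these types are public, here is a small sketch. The PdfError and PdfException types below are local stand-ins that only mirror the shapes quoted above (the real ones live in PDFtoImage.PdfiumViewer and have more members); PdfErrorHandling and DescribePageCount are hypothetical names.

```csharp
using System;

// Stand-in for PDFtoImage.PdfiumViewer.PdfError (reduced to three members).
public enum PdfError
{
    Success,
    Unknown,
    PasswordProtected
}

// Stand-in for PDFtoImage.PdfiumViewer.PdfException.
public sealed class PdfException : Exception
{
    public PdfError Error { get; private set; }

    public PdfException(PdfError error) : base(error.ToString())
    {
        Error = error;
    }
}

public static class PdfErrorHandling
{
    // Runs a page-count call and turns known PDF errors into readable messages.
    public static string DescribePageCount(Func<int> getPageCount)
    {
        try
        {
            return $"Page count: {getPageCount()}";
        }
        catch (PdfException ex) when (ex.Error == PdfError.PasswordProtected)
        {
            return "Password required or incorrect password";
        }
        catch (PdfException ex)
        {
            return $"PDF error: {ex.Error}";
        }
    }
}
```

In a real app the delegate would wrap a call such as PDFtoImage.Conversion.GetPageCount, so the main app can prompt for a password instead of failing with a generic exception.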

Bxaa commented 2 years ago

@sungaila Possible bug with password-protected PDFs: 'Password required or incorrect password'

To reproduce: PDFtoImage.Conversion.GetPageCount(Base_64_String_Pdf_String, "test")

Sample PDF (password: test) https://drive.google.com/file/d/1QF2U0o98qZ5U1G0rhCfrq0Va3hOvhZBg/view?usp=sharing

(Maybe it's just me having a glitch with the AOT or pdfium.dll version)