Open Noodle56 opened 11 years ago
I'm also seeing similar behavior.
Same here
Have the same problem... but only randomly. On the exact same html. It always works the first time but if you leave it for a few minutes and come back, the same call results in 100% CPU for that thread.
Here is the logfile: Working: http://textdump.net/raw/879/ Not Working: http://textdump.net/raw/878/ (again: this is the exact same request, with the exact same html & config) using SynchronizedPechkin
And me here. Using SimplePechkin 0.5.8.1 downloaded via NuGet. Get same behaviour in both ASP.NET MVC app and LinqPad. Only just come across the problem so not investigated any further myself yet.
Update: Thanks to Luke Baughan for pointing out that SynchronizedPechkin is the version to use where requests may come from multiple threads. Downloading the Pechkin.Synchronized NuGet package and updating my references fixed my problem (which was not the same as the OP's, but I'll leave this comment in case it helps anyone).
I think the excessive CPU usage is a sign of broken synchronization: when I use SynchronizedPechkin it works fine, but when I substitute SimplePechkin it hangs the second time the code runs.
So, I guess it's my buggy SynchronizedPechkin (or you're using SimplePechkin somewhere else in the code, just double check references to be sure).
If you can reproduce this thing on your workstation just do the following:
Also, do you use any event handlers?
I'm not using event handlers. My original problem was indeed because I wasn't using SynchronizedPechkin. However, I've noticed that if I make a change to code in VS2010, recompile and run (using built-in web server) it then hangs unless I bounce the web server so there's something not quite right.
It hangs in the call to .Convert(...) (see here). The only other relevant thread stack is here.
Yeah, I think I need to add checks to SimplePechkin. That way it'll be guaranteed that it's not used from other threads, and every user will get an understandable explanation of what to do. I'll make the changes shortly.
Same issue. This isn't a multiple-thread issue. The "synchronized" wrapper is supposed to ensure that no more than one thread accesses the native libraries at the same time. That's fine. But the problem described happens when the libraries are called at separate times, i.e. PDF rendered -> converter disposed -> pause -> another PDF rendering initiated -> hangs. From my understanding of your design, the following should work provided no more than one thread executes that block of code simultaneously:
Dim pechkin As Pechkin.SimplePechkin = Nothing
Try
    Dim confirmationHtml As String = GetConfirmationhtml(confirmationId)
    Dim globalConfig As New Pechkin.GlobalConfig()
    pechkin = New Pechkin.SimplePechkin(globalConfig)
    Dim docConfig As New Pechkin.ObjectConfig()
    docConfig.SetCreateExternalLinks(True)
    docConfig.SetLoadImages(True)
    docConfig.SetCreateInternalLinks(True)
    docConfig.SetPrintBackground(True)
    Dim bytes As Byte() = pechkin.Convert(docConfig, confirmationHtml)
    Response.ContentType = "application/pdf"
    Response.AppendHeader("Content-Disposition", "attachment; filename=Confirmation.pdf")
    Response.OutputStream.Write(bytes, 0, bytes.Length)
    Response.Flush()
    mBypassPageRender = True
Catch ex As Exception
    ' Note: any exception is silently swallowed here.
Finally
    If pechkin IsNot Nothing Then
        pechkin.Dispose()
    End If
End Try
I also should add, that the assembly was compiled under .NET 2.0. Yes, I know it's horrible, but I'm constrained to use that version, unfortunately. Sometimes it's easier to climb Everest than convince the management to upgrade (no value to the shareholders).
So this could be the problem, of course, but hard to say. I had to throw out anything to do with the logging component. And changed one line in SimplePechkin, because it wasn't compiling in .NET 2.0:
Original: Marshal.WriteByte(buffer + strbuf.Length, 0);
Changed: Marshal.WriteByte(new IntPtr(buffer.ToInt64() + (long)strbuf.Length), 0);
All right, I think I understand now that the problem isn't simultaneous access but thread affinity.
Correct me if I'm wrong: once a thread accesses the native libraries, the loaded native code may only be executed on that same thread for the rest of the process lifetime; otherwise the process will hang.
By the way, I'm new to Github, so I didn't find where I could leave a message of appreciation. This is the best PDF converter and you took time to write a wrapper for it. Many many thanks for that. You're saving people so much time. This is a very very useful component. Awesome work! Spending your free time to share this with someone. Legend.
Ok, got the thing working under .NET 2.0 by writing a simple wrapper (dedicated background thread and wait handles). Damn legacy VB.NET projects that need to be maintained. Thank god most of my time is spent developing in .NET 4 and C#.
Like I said before, it's a thread-affinity issue, rather than concurrency only.
Once again, many thanks to the author for figuring this whole thing out and making it available to the rest of us. Huge time saving. Very much appreciated.
@Kons Thank you for the words of appreciation :)
You did have the affinity issue, all right, but if you look at the implementation of SynchronizedPechkin, you'll find that it's designed to solve this particular problem: it creates a singleton thread and forwards all calls to that thread.
Perhaps it's not only synchronized, but I couldn't find the right word for it :)
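For readers who want to see the "singleton thread that receives all calls" idea in isolation, here is a minimal C# sketch. It is not the SynchronizedPechkin source: `SingleThreadDispatcher` and `Invoke` are hypothetical names made up for illustration, and the sketch assumes .NET 4 or later (for `BlockingCollection` and `TaskCompletionSource`). It only shows how work can be funnelled onto one long-lived thread so that thread-affine native code always sees the same thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of Pechkin): every delegate passed to Invoke
// runs on one long-lived thread, so thread-affine code always sees that thread.
public sealed class SingleThreadDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _worker;

    public SingleThreadDispatcher()
    {
        _worker = new Thread(ProcessQueue);
        _worker.IsBackground = true;
        _worker.Name = "Dispatcher_Thread";
        _worker.Start();
    }

    private void ProcessQueue()
    {
        // Blocks until work arrives; ends once CompleteAdding() is called
        // and the queue has drained.
        foreach (Action work in _queue.GetConsumingEnumerable())
        {
            work();
        }
    }

    // Runs func on the worker thread and blocks the caller until it completes.
    public T Invoke<T>(Func<T> func)
    {
        var completion = new TaskCompletionSource<T>();
        _queue.Add(() =>
        {
            try { completion.SetResult(func()); }
            catch (Exception ex) { completion.SetException(ex); }
        });
        try
        {
            return completion.Task.Result;
        }
        catch (AggregateException ex)
        {
            throw ex.InnerException; // surface the original failure to the caller
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

With a helper like this, a caller on any thread could wrap the conversion in `dispatcher.Invoke(() => ...)`, and two successive calls that return `Thread.CurrentThread.ManagedThreadId` should report the same worker thread id. Unlike the wait-handle approach, a queue also carries exceptions back to the caller instead of swallowing them.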
Synchronized is the right word. Your implementation is very good. Great to see it written in C#. Structured and easy to follow. Works perfectly in .NET 4 environment. And it turned out to be easy to adapt for .NET 2 as well. I hate having to write code for .NET 2 these days - always try to convince people to upgrade instead, but what can you do.
Anyhow, this is the first time that I've seen a situation where a component would only work on the thread that first accessed it. So maybe this information will help someone else.
If anyone needs the VB.NET code that I used to handle synchronization in VB.NET, please let me know.
Ah - you guys are quick! Ok, wishing I'd been able to be more helpful on this.
@Kons: if you can share your code in VB.NET, I'm happy to repost it in C# for others. Or @gmanny, will you be incorporating this into SynchronizedPechkin?
Also wanted to express my appreciation for this project. It's billion% better than the paid-for PDF generators I've used.
"It's billion% better than the paid-for PDF generators I've used." I second it. We've looked at several commercial ones so far (being a big company it's not a problem) to cover all our PDF conversion needs, but none worked out so well. Amazing.
There's no need to incorporate my solution into SynchronizedPechkin, because the latter handles everything very well already. My code is just for those poor souls, whose management is making them port stuff back to old versions of the framework.
Please see the code below (it was written in under an hour, but with reasonable care in mind). Once again, this is an alternate solution for those who have to use the old .NET:
Protected NotInheritable Class PechkinSync
    Private Shared mThread As System.Threading.Thread
    Private Shared mSync As Object = New Object()
    Private Shared mBgThreadWaitHandle As New System.Threading.ManualResetEvent(False)
    Private Shared mCallingThreadWaitHandle As New System.Threading.ManualResetEvent(False)
    Private Shared mSource As String
    Private Shared mResult As Byte()

    ''' <summary>
    ''' Not supposed to be instantiated.
    ''' </summary>
    ''' <remarks></remarks>
    Private Sub New()
    End Sub

    ' Starts the single background thread that performs every conversion.
    Shared Sub New()
        mThread = New System.Threading.Thread(AddressOf Run)
        mThread.IsBackground = True
        mThread.Name = "Pechkin_Thread"
        mThread.Start()
    End Sub

    Private Shared Sub Run()
        While True
            mBgThreadWaitHandle.WaitOne()
            ' Reset right away: resetting after signalling the caller could
            ' erase the Set() of the next request and leave it waiting.
            mBgThreadWaitHandle.Reset()
            mResult = Nothing
            Dim pechkin As Pechkin.SimplePechkin = Nothing
            Try
                Dim globalConfig As New Pechkin.GlobalConfig()
                pechkin = New Pechkin.SimplePechkin(globalConfig)
                Dim docConfig As New Pechkin.ObjectConfig()
                docConfig.SetCreateExternalLinks(True)
                docConfig.SetLoadImages(True)
                docConfig.SetCreateInternalLinks(True)
                docConfig.SetPrintBackground(True)
                mResult = pechkin.Convert(docConfig, mSource)
            Catch ex As Exception
                ' Swallowed: the caller sees Nothing as the result.
            Finally
                Try
                    If pechkin IsNot Nothing Then
                        pechkin.Dispose()
                    End If
                Catch ex As Exception
                End Try
                mCallingThreadWaitHandle.Set()
            End Try
        End While
    End Sub

    Public Shared Function Convert(ByVal htmlSource As String) As Byte()
        If String.IsNullOrEmpty(htmlSource) Then
            Throw New ArgumentException("htmlSource - value cannot be null or empty")
        End If
        ' Only one caller at a time may hand work to the background thread.
        SyncLock mSync
            mSource = htmlSource
            mCallingThreadWaitHandle.Reset()
            mBgThreadWaitHandle.Set()
            ' Wait up to 20 seconds for the background thread to finish.
            mCallingThreadWaitHandle.WaitOne(20000)
            mSource = Nothing
            Dim result As Byte() = mResult
            mResult = Nothing
            Return result
        End SyncLock
    End Function
End Class
Also, I'm thinking about changing my screen name to Freemanny. Then maybe me and Gmanny can go have a party in the Black Mesa. :)))
@Kons - thanks bud :¬)
Here's a C# version if anyone needs it:
using System;

sealed class PechkinSync
{
    private static System.Threading.Thread _pdfThread;
    private static object _syncRoot = new object();
    private static System.Threading.ManualResetEvent _pdfThreadWaitHandle = new System.Threading.ManualResetEvent(false);
    private static System.Threading.ManualResetEvent _callingThreadWaitHandle = new System.Threading.ManualResetEvent(false);
    private static Pechkin.GlobalConfig _config;
    private static string _source;
    private static byte[] _result;

    /// <summary>
    /// Not supposed to be instantiated.
    /// </summary>
    /// <remarks></remarks>
    private PechkinSync()
    {
    }

    // Starts the single background thread that performs every conversion.
    static PechkinSync()
    {
        PechkinSync._pdfThread = new System.Threading.Thread(PechkinSync.Run);
        PechkinSync._pdfThread.IsBackground = true;
        PechkinSync._pdfThread.Name = "Pechkin_Thread";
        PechkinSync._pdfThread.Start();
    }

    private static void Run()
    {
        while (true)
        {
            PechkinSync._pdfThreadWaitHandle.WaitOne();
            // Reset right away: resetting after signalling the caller could
            // erase the Set() of the next request and leave it waiting.
            PechkinSync._pdfThreadWaitHandle.Reset();
            PechkinSync._result = null;
            Pechkin.SimplePechkin pechkin = null;
            try
            {
                // Use the caller's config; the original port stored it but never read it.
                Pechkin.GlobalConfig globalConfig = PechkinSync._config ?? new Pechkin.GlobalConfig();
                pechkin = new Pechkin.SimplePechkin(globalConfig);
                Pechkin.ObjectConfig docConfig = new Pechkin.ObjectConfig();
                docConfig.SetCreateExternalLinks(true);
                docConfig.SetLoadImages(true);
                docConfig.SetCreateInternalLinks(true);
                docConfig.SetPrintBackground(true);
                PechkinSync._result = pechkin.Convert(docConfig, PechkinSync._source);
            }
            catch (Exception)
            {
                // Swallowed: the caller sees null as the result.
            }
            finally
            {
                try
                {
                    if (pechkin != null)
                    {
                        pechkin.Dispose();
                    }
                }
                catch (Exception)
                {
                }
                PechkinSync._callingThreadWaitHandle.Set();
            }
        }
    }

    public static byte[] Convert(Pechkin.GlobalConfig config, string htmlSource)
    {
        if (string.IsNullOrEmpty(htmlSource))
        {
            throw new ArgumentException("htmlSource - value cannot be null or empty", "htmlSource");
        }
        // Only one caller at a time may hand work to the background thread.
        lock (PechkinSync._syncRoot)
        {
            try
            {
                PechkinSync._config = config;
                PechkinSync._source = htmlSource;
                PechkinSync._callingThreadWaitHandle.Reset();
                PechkinSync._pdfThreadWaitHandle.Set();
                // Wait up to 20 seconds for the background thread to finish.
                PechkinSync._callingThreadWaitHandle.WaitOne(20000);
                byte[] result = PechkinSync._result;
                return result;
            }
            finally
            {
                PechkinSync._config = null;
                PechkinSync._source = null;
                PechkinSync._result = null;
            }
        }
    }
}
I have an ASP.NET 2010 web application, and I had the same problem as timcroydon:
"However, I've noticed that if I make a change to code in VS2010, recompile and run (using built-in web server) it then hangs unless I bounce the web server so there's something not quite right."
I have tried using both SynchronizedPechkin and the PechkinSync class that Kons provided. It seems to work just fine until I recompile, then a request after a re-compile will hang the development server and max out a CPU core until I kill the server. PechkinSync does have the advantage of timing out and providing an error message, but the runaway thread remains.
I think the whole problem stems from keeping the library initialized. I fixed my issue by simply calling PechkinStatic.DeinitLib() at the end of SimplePechkin's Dispose() method. Then I just use SimplePechkin, using a simple lock to ensure that only one thread is using the library at one time. It might be doing extra work every time, but it never hangs, which is vastly more important.
In case someone wants my simple Sync class, which I inserted into my copy of the library:
namespace Pechkin
{
    public static class PechkinSync
    {
        private static object _syncRoot = new object();

        public static byte[] Convert(GlobalConfig config, string htmlSource)
        {
            lock (_syncRoot)
            {
                using (SimplePechkin pechkin = new SimplePechkin(config))
                {
                    return pechkin.Convert(htmlSource);
                }
            }
        }
    }
}
@Kons & @Noodle56 Thanks for putting your code here. For some reason I was having the 100% CPU issue even with SynchronizedPechkin, but I used Noodle's C# port of Kons's VB code and it seems to have settled down on the server now.
@bUKaneer I'm glad to hear you seem to be having success with PechkinSync, but I was still able to break it. I'm willing to bet it's only a matter of time before you encounter the problem again. See my post above.
@mattstermiller yeah i've seen it lock again today, I'll give your simplified version a go tomorrow ;o) Thanks for the update! We've also implemented a more severe solution by writing a windows exe to call so it executes outside the webserver altogether but tbh I'd rather keep everything in one place - the fewer "oddities" the better imho !
@mattstermiller - you nailed it. The issue is, that when you recompile, the web server process doesn't shut down and the native libraries aren't unloaded, but the original managed static thread reference is discarded. At this point since the library has been initialized by the thread that was discarded in the process of recompilation, it will hang after being accessed again by the new thread assigned to the static variable after recompilation.
I didn't investigate deeply, but this seems like a plausible chain of events.
Good find with PechkinStatic.DeinitLib(). I'll give that a try.
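One way to check this chain of events, sketched here as an assumption rather than something anyone in the thread ran, is to look at whether the native module is already loaded in the process when the managed code starts up again after a recompile. `NativeModuleCheck` and `IsModuleLoaded` are hypothetical names; the only real APIs used are `Process.GetCurrentProcess().Modules` and `ProcessModule.ModuleName`.

```csharp
using System;
using System.Diagnostics;

// Hypothetical diagnostic helper (not part of Pechkin).
public static class NativeModuleCheck
{
    // Returns true if a module with the given file name is loaded
    // in the current process, regardless of which AppDomain loaded it.
    public static bool IsModuleLoaded(string moduleName)
    {
        foreach (ProcessModule module in Process.GetCurrentProcess().Modules)
        {
            if (string.Equals(module.ModuleName, moduleName, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

If `NativeModuleCheck.IsModuleLoaded("wkhtmltox0.dll")` already returns true before the first conversion in a freshly started application domain, that would suggest the library was loaded, and possibly initialized, by code from before the recompile.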
Update - although it doesn't hang, my method no longer renders HTML after the first conversion. All conversions after the first will output a PDF with all of the text content of the HTML, no styles or rendered HTML at all. It doesn't have to do with calling Init and Deinit (I tried calling these several times before the first conversion and it still worked), but there's something different after the first cycle of init, convert, deinit. Unfortunately, I don't have time to investigate this any more right now (higher priorities elsewhere).
I think @Kons pointed out the root of both this and #5:
"the web server process doesn't shut down and the native libraries aren't unloaded, but the original managed static thread reference is discarded. At this point since the library has been initialized by the thread that was discarded in the process of recompilation, it will hang after being accessed again by the new thread assigned to the static variable after recompilation."
I'm not familiar with web development using .net, and I don't have time to investigate it, unfortunately :(
First of all thanks Gmanny for the great library.
I tried following in mattstermiller's footsteps to use SimplePechkin in my VS 2012 C# .NET web app project and had the same problem as him where it only renders plain text to the PDF with no formatting the second time around after a single successful conversion. Had to restart VS to get things working again.
I've since found a workaround (a bit nasty) which involves two steps in conjunction:
I ran this line inside its own AppDomain:
byte[] pdfBuf = simplePechkin.Convert(gc, new Uri(filePath + ".htm"));
I used this code to unload the native DLL after AppDomain disposal:
foreach (ProcessModule mod in Process.GetCurrentProcess().Modules)
{
    if (mod.ModuleName == "wkhtmltox0.dll")
    {
        while (FreeLibrary(mod.BaseAddress))
        {
        }
    }
}
Along with a definition:
using System.Runtime.InteropServices;
[DllImport("kernel32", SetLastError = true)]
static extern bool FreeLibrary(IntPtr hModule);
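For anyone unsure what "inside its own AppDomain" looks like in code, here is a rough sketch of the general pattern. It is not the code used above: `IsolatedWorker`, `IsolatedRunner` and the domain name are hypothetical, the body of `Run` is a placeholder where the real conversion call would go, and it assumes the full .NET Framework (creating AppDomains is not supported on .NET Core and later).

```csharp
using System;
using System.Text;

// Hypothetical worker. Deriving from MarshalByRefObject lets the caller hold
// a proxy to an instance that lives in another AppDomain.
public sealed class IsolatedWorker : MarshalByRefObject
{
    public byte[] Run(string input)
    {
        // Placeholder for the real work (for example, the conversion call).
        return Encoding.UTF8.GetBytes(input);
    }
}

public static class IsolatedRunner
{
    public static byte[] RunIsolated(string input)
    {
        // Reuse the current domain's setup so the new domain can resolve
        // the same assemblies (relevant under ASP.NET).
        AppDomain domain = AppDomain.CreateDomain(
            "PdfWorkerDomain", null, AppDomain.CurrentDomain.SetupInformation);
        try
        {
            var worker = (IsolatedWorker)domain.CreateInstanceAndUnwrap(
                typeof(IsolatedWorker).Assembly.FullName,
                typeof(IsolatedWorker).FullName);
            // The byte[] result is copied back across the domain boundary.
            return worker.Run(input);
        }
        finally
        {
            // Unloading the domain discards its managed state. It does not
            // unload native DLLs, hence the separate FreeLibrary step.
            AppDomain.Unload(domain);
        }
    }
}
```

The unload in the `finally` block is the point where the FreeLibrary loop shown above would follow.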
@mattki Thanks for sharing your find! I was afraid it would take drastic measures like that.
Is it really necessary to run the conversion in its own AppDomain? Could it be possible to unload (and maybe re-load) the DLL? I'm just trying to think of ways to streamline it so that there could be a wrapper class around Convert() that could hide this complexity.
Without the separate AppDomain I get an AccessViolationException thrown on calling the convert method with detail "Attempted to read or write protected memory" - weird since I create a new instance of SimplePechkin and call FreeLibrary on the dll each time around. Maybe there are some handles to unmanaged code sticking around somewhere which don't get removed when unloading the dll or re-pointed when creating a new SimplePechkin.
Hey guys just wanted to add my experience with this issue.
I've been running a WCF service on a 64 bit 2008 R2 server for a while which uses SimplePechkin and noticed some 100% CPU Usage issues. It started to get pretty bad, and we're running quad core instances. So I tried switching over yesterday to the Synchronized version and so far so good, CPU usage is right down.
I'll be keeping an eye on it, but if you don't hear from me again on here, take it that it has solved the issue for me!
Howdy y'all... I ran across this yesterday and read the whole discussion. What I've done is take the PechkinStatic class and rig it up so that it has a private static AppDomain member. Any calls to PechkinBindings webkit methods are wrapped up in that AppDomain. DeinitLib on PechkinStatic now handles unloading the AppDomain and freeing the assembly, and Dispose on SimplePechkin now calls DeinitLib. I have run the unit tests many, many times and took the same steps to reproduce the problem to verify my results (that the problem is no more). I added the FreeAssembly method to the PechkinBindings class, and also gave that class five static members to hold the callbacks, since it is in another AppDomain; otherwise garbage collection would sometimes nail the callbacks and cause problems.
I am new to Github so I will try my best to upload the updated solution in whatever way I can. Anyone has questions, just let me know.
Thanks a million bazillion to everyone in this thread for the wonderful contribution, thanks especially to mattki for his discovery and to gmanny for starting this project.
Hey everyone, I found out in a difficult way that there is still an issue with my fork that will cause hanging. To prevent the hanging in this instance all I had to do was comment out the part of InitLib where the callbacks are registered to the wkhtmltox0 assembly. I am commenting from the road now but I anticipate working on this issue over the next couple of weeks as I have time.
Alright, I have it worked out. I will be uploading the new commit to my own fork shortly, along with updates.
Here you go everyone, details and code:
Here is the thread stack when it hangs with SynchronizedPechkin:
wkhtmltox0.dll!_ZN11wkhtmltopdf14ImageConverter11qt_metacastEPKc+0x110b8de
wkhtmltox0.dll!_ZN11wkhtmltopdf14ImageConverter11qt_metacastEPKc+0x10b2efb
wkhtmltox0.dll!_ZN11wkhtmltopdf9Converter16emitCheckboxSvgsERKNS_8settings8LoadPageE+0x7f
wkhtmltox0.dll!wkhtmltopdf_convert+0x14
clr.dll+0x2cf7
clr.dll+0x2952
clr.dll!DllUnregisterServerInternal+0x18d93
clr.dll!GetMetaDataInternalInterface+0xe5a8
clr.dll!GetMetaDataInternalInterface+0xe4a7
mscorlib.ni.dll+0x2d371d
mscorlib.ni.dll+0x2cf8fa
mscorlib.ni.dll+0x30cacf
mscorlib.ni.dll+0x3023d7
mscorlib.ni.dll+0x302316
mscorlib.ni.dll+0x3022d1
mscorlib.ni.dll+0x30cb4c
clr.dll+0x2952
clr.dll!DllUnregisterServerInternal+0x18d93
clr.dll!DllUnregisterServerInternal+0x195d9
clr.dll!DllGetClassObjectInternal+0x10e29
clr.dll!DllGetClassObjectInternal+0x135f0
clr.dll!DllGetClassObjectInternal+0x1365e
clr.dll!DllGetClassObjectInternal+0x1372b
clr.dll!GetMetaDataInternalInterfaceFromPublic+0x21db2
clr.dll!GetMetaDataInternalInterfaceFromPublic+0x21e1b
clr.dll!GetMetaDataInternalInterfaceFromPublic+0x21d98
clr.dll!DllGetClassObjectInternal+0x1365e
clr.dll!DllGetClassObjectInternal+0x1372b
clr.dll!DllUnregisterServerInternal+0x22e3
clr.dll!DllGetClassObjectInternal+0x10ce5
clr.dll!DllGetClassObjectInternal+0x13baf
ntdll.dll!RtlInitializeExceptionChain+0x63
ntdll.dll!RtlInitializeExceptionChain+0x36
Thank you for that, sbmuzammil. Please look at my fork of Pechkin, or the pull request at https://github.com/gmanny/Pechkin/pull/42 for a solution.
I'm running Pechkin v0.5.8.1, and the first time the following code executes it runs fine, but the second time it jams at 100% CPU usage. This is happening on my local 32bit Windows 7 .net 4.5 machine, as well as our 64bit (IIS in 32bit mode) Win 2008r2 .net 4.5 server. Can you offer any thoughts? Looking at Pechkin's code I think it's in deadlock, or the logging component is going crazy, but I don't know how to check.
Code: