tresf closed this issue 4 years ago
Running the offending code through a debugger, the hang seems to occur in Win32PrintService.java:getCapabilities, which I believe calls some native function inside WPrinterJob.cpp.
Note, I've also reached out directly to NewSoft Technology Corporation, the current owners of the Presto! line so that they're aware of the issue.
@tresf Before I spend time on this one: Do you have any news from anyone involved?
No reply from NewSoft, no updates on this issue since original filing.
@tresf Thanks for the great analysis (as usual). I tested with an upstream build of OpenJDK 11.0.7 and 14.0.1 and could verify the behavior. Looking at the C++ code and considering that it works well with other printers, I have doubts that the problem is actually in OpenJDK. Blacklisting printers in OpenJDK does not seem to be a sensible idea, either. Most likely, this problem is in PageManager. Do you agree on closing this issue? Otherwise, what's your suggestion?
Our solution downstream was to blacklist it by printer name, but that's only a partial workaround since the default printer name isn't guaranteed to stay that way.
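A minimal sketch of what a name-based skip list could look like, for illustration only. The class and method names are hypothetical and this is not the downstream project's actual code. It assumes the printer names have already been collected as strings (for example from PrintService.getName()) and that the driver is still registered under its default name, which is exactly the limitation described above.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: drops printers whose names are on a skip list
// before any capability query is attempted against them.
final class PrinterNameFilter {
    // Assumption: the driver keeps its default name. A renamed printer
    // would slip through, so this is only a partial workaround.
    static final Set<String> SKIPPED = Set.of("PageManager PDF Writer");

    static List<String> usable(List<String> printerNames) {
        return printerNames.stream()
                .filter(name -> !SKIPPED.contains(name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("PageManager PDF Writer", "Microsoft Print to PDF");
        System.out.println(usable(names));
    }
}
```

Matching on the exact name keeps the filter from hiding unrelated printers, at the cost of missing a renamed instance.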
> Blacklisting printers in OpenJDK does not seem to be a sensible idea

Agreed.

> Most likely, this problem is in PageManager.

Agreed.

> Do you agree on closing this issue? Otherwise, what's your suggestion?
If it's decided to close as wont-fix, I'm OK with that, but I don't have enough information to know if that's what I would recommend. For example, if it's a bug in how Java loads all print drivers that's being exposed by Presto!, the issue will eventually come back and thus guarding against this deadlock will help the JDK moving forward.
On the other hand, if the issue is unrelated to Java code, then the bug should go elsewhere (Presto! or Microsoft perhaps?). At time of writing this, Presto! never got back to me (not even to acknowledge receiving it).
Did you dig deep enough to see if the C++ code was stuck somewhere that could be safely escaped? I'm curious where it dies. Even if it's not fixed, referencing the CPP source might help it down the road if it's resurrected, or if Presto! finally decides to correct it.
The C++ implementation is at https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/a12f60a83fc87bbba2e2d5ade17f6241a8942aac/jdk/src/windows/native/sun/windows/WPrinterJob.cpp#L726. The corresponding Windows API looks to be DeviceCapabilities (https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-devicecapabilitiesa), which handles the details and seems to be synchronous. I do not see what could be done to work around this problem, but this isn't my area of expertise.
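Because the native call appears to be synchronous, one caller-side option is to isolate the query on a daemon thread and give up after a timeout. The following is a rough sketch under that assumption, not a fix for the JDK itself. The class and method names are hypothetical, and a sleeping task stands in for the hanging driver call. A thread that is truly stuck in native code cannot be interrupted, so it would simply be abandoned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a possibly hanging query with a deadline.
final class TimedQuery {
    // Daemon threads, so an abandoned worker cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "printer-query");
        t.setDaemon(true);
        return t;
    });

    // Returns the query result, or the fallback if the query fails
    // or does not finish within timeoutMillis.
    static <T> T callWithTimeout(Callable<T> query, long timeoutMillis, T fallback) {
        Future<T> future = POOL.submit(query);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // best effort; blocked native code ignores interrupts
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a hanging driver call: sleeps far longer than the timeout.
        String result = callWithTimeout(() -> {
            Thread.sleep(60_000);
            return "caps";
        }, 200, "skipped");
        System.out.println(result);
    }
}
```

The trade-off is a leaked thread per hung printer, which may be acceptable for a one-time enumeration but not for repeated polling.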
@aahlenst thanks kindly for linking the code. I haven't run it through a C++ debugger yet (I'm not a C++ developer, so if I need to do this, it will take some time). However, glancing at it, I noticed that the code guards against -1 in some areas, and I'm curious whether it falls through in others.
Quoting Microsoft:

> Return value
>
> If the function succeeds, the return value depends on the setting of the fwCapability parameter. A return value of zero (0) generally indicates that, while the function completed successfully, there was some type of failure, such as a capability that is not supported. For more details, see the descriptions for the fwCapability values.
>
> If the function returns -1, this may mean either that the capability is not supported or there was a general function failure.
So an example like this is guarded:
    int cReturned = ::DeviceCapabilities(printerName, printerPort,
                                         dc_id, NULL, NULL);
    RESTORE_CONTROLWORD
    if (cReturned <= 0) { // ######## GUARDED FOR 0 or -1
        JNU_ReleaseStringPlatformChars(env, printer, printerName);
        JNU_ReleaseStringPlatformChars(env, port, printerPort);
        return NULL;
    }
... however, some calls don't use a <= 0 (or > 0) check, such as copies as well as duplex.
Duplex worries me the most because it doesn't fall back on JNU_ReleaseStringPlatformChars like the other sanitized calls, but instead starts applying DWORD bitwise operators to what could be -1 or 0 according to the API.
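To illustrate the concern in isolation: if a failure value of -1 is treated as a bit field, every flag test succeeds, because -1 has all bits set in two's complement. The sketch below uses hypothetical flag constants, not the ones in WPrinterJob.cpp, and it does not show that the JDK code actually behaves this way. It only demonstrates why a range check before the bitwise step matters.

```java
// Illustration only: the flag values are hypothetical stand-ins.
final class CapabilityGuard {
    static final int CAP_COLOR = 0x1;
    static final int CAP_DUPLEX = 0x2;
    static final int CAP_COLLATE = 0x4;

    // Tests a flag on the raw return value without checking for failure.
    static boolean hasFlagUnchecked(int raw, int flag) {
        return (raw & flag) != 0;
    }

    // Rejects 0, -1 and any other non-positive value before testing bits.
    static boolean hasFlagGuarded(int raw, int flag) {
        return raw > 0 && (raw & flag) != 0;
    }

    public static void main(String[] args) {
        int failed = -1; // documented failure value of DeviceCapabilities
        System.out.println(hasFlagUnchecked(failed, CAP_DUPLEX)); // true: spurious capability
        System.out.println(hasFlagGuarded(failed, CAP_DUPLEX));   // false: failure rejected
    }
}
```

An unchecked -1 would therefore report spurious capabilities, which fits the "funky results" outcome better than a hang.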
I'm sorry for speculative debugging but I don't want to blame Presto! if the issue is an unchecked Win32 API call.
I rather would expect that code to crash or to give funky results than to hang. But I have zero experience with Windows drivers and the JNI code around that, so 🤷‍♂️.
I do not have the expertise to help and it does not seem like the others have, either. Considering you have an executable test case and additional information, it might be worth a try to ask on an OpenJDK mailing list like jdk-dev. Maybe someone with expertise is inclined to respond.
> I rather would expect that code to crash or to give funky results than to hang. But I have zero experience with Windows drivers and the JNI code around that, so 🤷‍♂️.
My experience with JNI and Windows has been limited as well. In my experience, the crashes occur when symbols or registers are incorrect. Adding insult to injury, I'm not even sure how to run java.exe through a debugger. On Mac and Linux I've used the CLI tools, which allow me to see the backtrace, but that still requires access to debug symbols. Assuming no one here is going to do that, I believe the next best way to debug this is to make a standalone executable (such as in Visual Studio) and run the Java 11 code as-is against the driver to catch the hang in the IDE. This is probably a good candidate since it relies mostly on Win32 APIs and a few JDK headers.
In regards to reaching out to other channels (like jdk-dev) or staying here, I think it depends on a few factors:
I'd like to clarify that symptomatically, it's 100% a support problem. The end-user can't use the JDK if this driver is installed, and that's going to punish the end-users and IT administrators trying to use these two products together. Fortunately, I've only run into this combination of Presto! and Java this one time, so perhaps someone with a vested interest in both products can push this through the correct channels at a future date. 🍻
Steps to reproduce
Platform and architecture:
Workaround
continue in the loop when "PageManager PDF Writer" is found, avoiding the use of this printer.
Downstream Bug Report