Closed brcolow closed 5 years ago
Is it possible that you were somehow falling back to T2K when running with JDK 9? The native font implementation, which calls DirectWrite on Windows platforms, has been the default since JDK 8, but we would fall back to T2K on Windows platforms without the necessary support (or via java -Dprism.text=t2k
). The T2K code was removed in JDK 10, so there is no longer a fallback path. The fallback should have only ever happened on Windows Vista, though.
Answering my own question, I see this in your log:
-Dprism.text=t2k
which means you were running T2K with JDK 9 and earlier. That flag is now ignored, so you are running the native font code with JDK 10. So that at least explains why the upgrade to JDK 10 has triggered this bug.
I filed JDK-8201539 in JBS.
That makes a lot of sense, thanks so much for tracking that down. I always wondered why we used that argument... 😆
If we trust the native back-trace, it says the access violation occurs at javafx_font!cacheDWRITE_GLYPH_METRICSFields+0x280:
which corresponds to void cacheDWRITE_GLYPH_METRICSFields(JNIEnv *env)
in directwrite.cpp
.
@ararunprasad You were very helpful with the MathML bug (https://github.com/javafxports/openjdk-jfx/issues/71), do you think you could provide a little guidance on this one?
@brcolow , let me take a look.
@brcolow @ararunprasad We roughly know what the problem is: Some initialization that is done by either the D3D pipeline or Glass (either is sufficient) is needed to allow the DirectWrite calls used by the native font code to work.
To get started on this issue, I took a look at the source code to figure out where the crash is happening, and that is here:
/* IWICImagingFactory*/
JNIEXPORT jlong JNICALL OS_NATIVE(CreateBitmap)
(JNIEnv *env, jclass that, jlong arg0, jint arg1, jint arg2, jint arg3, jint arg4)
{
IWICBitmap* result = NULL;
GUID pixelFormat;
switch (arg3) {
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat8bppGray:pixelFormat = GUID_WICPixelFormat8bppGray; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat8bppAlpha:pixelFormat = GUID_WICPixelFormat8bppAlpha; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat16bppGray: pixelFormat = GUID_WICPixelFormat16bppGray; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat24bppRGB: pixelFormat = GUID_WICPixelFormat24bppRGB; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat24bppBGR: pixelFormat = GUID_WICPixelFormat24bppBGR; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat32bppBGR: pixelFormat = GUID_WICPixelFormat32bppBGR; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat32bppBGRA: pixelFormat = GUID_WICPixelFormat32bppBGRA; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat32bppPBGRA: pixelFormat = GUID_WICPixelFormat32bppPBGRA; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat32bppGrayFloat: pixelFormat = GUID_WICPixelFormat32bppGrayFloat; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat32bppRGBA: pixelFormat = GUID_WICPixelFormat32bppRGBA; break;
case com_sun_javafx_font_directwrite_OS_GUID_WICPixelFormat32bppPRGBA: pixelFormat = GUID_WICPixelFormat32bppPRGBA; break;
}
HRESULT hr = ((IWICImagingFactory *)arg0)->CreateBitmap(arg1, arg2, (REFWICPixelFormatGUID)pixelFormat, (WICBitmapCreateCacheOption)arg4, &result);
return SUCCEEDED(hr) ? (jlong)result : NULL;
}
It seems that arg0
is the likely cantidate for being null
, and that is a pointer to the IWICImagingFactory instance, which should be set by the following method:
private static final native long _WICCreateImagingFactory();
which is implemented as native code in directwrite.cpp
thusly:
JNIEXPORT jlong JNICALL OS_NATIVE(_1WICCreateImagingFactory)
(JNIEnv *env, jclass that)
{
/* This routine initialize COM in order to create an WICImagingFactory.
* It runs on the prism thread and expects no other codes in this thread
* to interface with COM.
* Note: This method is called by DWFactory a single time.
*/
HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE);
/* This means COM has been initialize with a different concurrency model.
* This should never happen. */
if (hr == RPC_E_CHANGED_MODE) return NULL;
IWICImagingFactory* result = NULL;
hr = CoCreateInstance(
CLSID_WICImagingFactory,
NULL,
CLSCTX_INPROC_SERVER,
IID_PPV_ARGS(&result)
);
/* Unload COM as no other COM objects will be create directly */
CoUninitialize();
return SUCCEEDED(hr) ? (jlong)result : NULL;
I will try and investigate further.
Edit: I found something kind of interesting with regards to CLSID_WICImagingFactory
https://github.com/google/skia/blob/master/src/ports/SkImageGeneratorWIC.cpp#L12 - but I have no idea if that would apply in our case or not.
Also, is it reliable to trust the java backtrace in the crash report, or is the native backtrace more reliable?
Piggy-backing off your comment @kevinrushforth - is there any chance you could be more specific? Is it that Monocle needs to initialize something with respect to DirectWrite that is only being initialized by Glass?
Edit 2: I don't think CLSID_WICImagingFactory
has anything to do with it, because we explicitly set WIN32_WINNT=0x0601
which targets Windows 7 APIs, not 8+.
I'm working on compiling OpenJFX locally on my Windows development machine, but for some reason the first build is taking hours upon hours (seems to be stalling at :graphics:buildModuleWin
). I am at least creating a simple build script that can be used when working locally on Windows to build without setting any environment variables, etc. Are there any such scripts that you guys at Oracle use internally when working on OpenJFX that are not in this repository, or is the configuration of the machines that you run builds on somehow standardized?
Edit 3: Got a first successful local build of OpenJFX. I don't have cygwin installed, so I think the docs are incorrect in requiring Cygwin. It may only be working because I have WSL installed, though. Anyways, to compile locally I had to make some very small tweaks to the gradle build file, and I also created a powershell script that may be useful for others, I will open a PR for that soon-ish.
Alright, I was able to debug using a locally built JavaFX and Visual Studio 2017 Community:
javafx_font.dll!Java_com_sun_javafx_font_directwrite_OSCreateBitmap(JNIEnv env=0x000001f2e1ec3a70, _jclass that=0x0000003f492fd0e8, __int64 arg0=2142694496368, long arg1=256, long arg2=256, long arg3=8, long arg4=1) Line 2359 at c:\users\brcolow\dev\openjdk-jfx\modules\javafx.graphics\src\main\native-font\directwrite.cpp(2359)
The crash happens on this line:
HRESULT hr = ((IWICImagingFactory *)arg0)->CreateBitmap(arg1, arg2, (REFWICPixelFormatGUID)pixelFormat, (WICBitmapCreateCacheOption)arg4, &result);
Dissasembly:
call qword ptr [rax+88] <--- I believe this is calling the function CreateBitmap on arg0
result
is highlighted red in the local variables table, though (meaning it's null - hence the illegal access exception I believe):
Name | Value | Type | |
---|---|---|---|
▶ | result | 0x0000000000000000 |
IWICBitmap * |
The docs for CreateBitmap
are https://docs.microsoft.com/en-us/windows/desktop/api/wincodec/nf-wincodec-iwicimagingfactory-createbitmap
It seems it is normal to call CreateBitmap
in this way, so perhaps it is that the CreateBitmap
call itself is returning null? Perhaps that is the missing initialization that you referred to, Kevin. I'm not sure how to tell the difference between result
being null and the method returning a null pointer. I will dig in deeper with the debugger and try and get to the bottom of what exactly is the problem. I have never debugged native code before, so it's a learning process for me.
Just wanted to post my progress, will update as I get more information.
Debugger screenshot:
The changes in this commit seem to stop the crash for me (unless for some bewitching reason logging statements are having some effect or maybe after a couple tries without restarting it works?):
https://github.com/brcolow/openjdk-jfx/commit/29c01b9b1852c5a4beaaae13ebbcec990ac7d78d
but I can't easily tell because compilation takes so long. If someone could test it with the changes (@kevinrushforth @ararunprasad ) and LMK if I'm seeing an apparition or not, that would be much appreciated. Then we could narrow down which change it is.
I was able to prevent the crash with only the following change to directwrite.cpp
:
JNIEXPORT jlong JNICALL OS_NATIVE(_1WICCreateImagingFactory)
(JNIEnv *env, jclass that)
{
/* This routine initialize COM in order to create an WICImagingFactory.
* It runs on the prism thread and expects no other codes in this thread
* to interface with COM.
* Note: This method is called by DWFactory a single time.
*/
HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE);
/* This means COM has been initialize with a different concurrency model.
* This should never happen. */
if (hr == RPC_E_CHANGED_MODE) return NULL;
IWICImagingFactory* result = NULL;
hr = CoCreateInstance(
- CLSID_WICImagingFactory
+ CLSID_WICImagingFactory1,
NULL,
CLSCTX_INPROC_SERVER,
IID_PPV_ARGS(&result)
);
+ if (hr == REGDB_E_CLASSNOTREG) {
+ hr = CoCreateInstance(
+ CLSID_WICImagingFactory,
+ NULL,
+ CLSCTX_INPROC_SERVER,
+ IID_PPV_ARGS(&result)
+ );
+ }
/* Unload COM as no other COM objects will be create directly */
CoUninitialize();
return SUCCEEDED(hr) ? (jlong)result : NULL;
}
As soon as someone can re-produce the fix I'm seeing, I'll open a PR (I have a hard time believing it was this easy...).
@brcolow I can take a look early next week. As this is a P2 bug, we can still get a fix into openjfx 11 if a safe fix can be found.
I'm not sure if https://ci.appveyor.com/project/brcolow/openjdk-jfx/build/master%20267?fullLog=true proves that it fixes the crash or not, because I'm not sure if text is ever rendered to the screen without -PUSE_ROBOT=true
and -PFULL_TEST=true
(both of which make the time be exceeded on Appveyor). So, I can only reproduce the fix locally.
@kevinrushforth Do you think you will have time to take a look by tomorrow (Wednesday)? I know you are busy.
Or anyone else that can build OpenJFX locally on Windows - let me know (and save Kevin the work!).
I hope so... I'll put it on my list of reviews for tomorrow.
I ran a quick test, and it still crashes for me.
Huh...interesting. Okay, thanks. I will look more deeply into this.
@kevinrushforth
Check out this build which crashes (which only has the small changeset I mentioned above):
Now compare that to this build, with the same test (no crash there, and indeed no where on any other tests):
So it seems that the full changes (those in this PR):
https://github.com/brcolow/openjdk-jfx/pull/7
Are enough to stop the crash, probably at the expense of a memory leak as I mentioned earlier. Can you confirm? Thanks.
Edit: I messed up the PR I linked to, there are too many changes, let me fix that. Edit 2: Fixed.
I can confirm that with the patched referenced above, it does not crash. The only difference between that and the patch I tried last night was the following:
/* Unload COM as no other COM objects will be create directly */
- CoUninitialize();
+ //CoUninitialize();
return SUCCEEDED(hr) ? (jlong)result : NULL;
This is an interesting observation that may lead to a solution, but is not something we would want to use directly.
Yes, indeed. I was hoping it would provide the insight needed for someone who knows the Windows native code to arrive at a solution. If no such person exists, I will have to try and forge ahead with this but I am very much in the dark about DirectWrite/Windows Imaging Component/COM initialization/un-initialization and how that has anything to do with the SW renderer and Monocle.
How would we move the CoUninitialize
method call into the JavaFX shutdown/cleanup code?
Something that bugs me about this issue, that I am trying to find the answer to but the documentation on COM + Windows Imaging Component is pretty scarce is:
If COM is un-initialized (via CoUninitialize
) is it expected that the IWICImagingFactory COM object that we created is going to go stale? In other words, is it the Monocle + SW renderer case that's working as intended (i.e. it should crash trying to access the IWICImagingFactory after COM un-initialization) and the D3D renderer + WinApplication is doing something different to somehow keep COM initialized and therefore access to IWICImagingFactory succeeds?
That's a good question, and seems worth exploring.
Found something interesting by looking at the file history of directwrite.cpp
(tracking it through various renames):
http://hg.openjdk.java.net/openjfx/jfx-dev/rt/rev/932cecc14e49#l9.1
CoUninitialize
was once commented thusly:
CoUninitialize(); /* NOT COOL BUT SEEMS TO WORK */
This was changed and expounded upon by Felipe Heidrich
, is he still at Oracle? Does anyone know how to get in touch?
I was trying to find the RT-32255 issue mentioned in the commit, but am having trouble finding it since the old bug system was merged with the JDK Jira.
I think I have figured this out, or at least made progress towards that goal. When Monocle is not used the native WinApplication
is created. The native _runLoop
implementation in GlassApplication.cpp
creates an OLEHolder
:
/*
* Class: com_sun_glass_ui_win_WinApplication
* Method: _runLoop
* Signature: (Ljava/lang/Runnable;)V
*/
JNIEXPORT void JNICALL Java_com_sun_glass_ui_win_WinApplication__1runLoop
(JNIEnv * env, jobject self, jobject jLaunchable)
{
OLEHolder _ole_
// ...
MSG msg;
while (GlassApplication::GetInstance() && ::GetMessage(&msg, NULL, 0, 0) > 0) {
::TranslateMessage(&msg);
::DispatchMessage(&msg);
}
// ...
}
This OLE holder will be visible for the entire time the JavaFX application is running. The OLEHolder
is defined as a struct in OleUtils.h
thusly:
struct OLEHolder
{
OLEHolder()
: m_hr(::OleInitialize(NULL))
{
if (SUCCEEDED(m_hr)) {
STRACE(_T("{OLE"));
}
}
~OLEHolder(){
if (SUCCEEDED(m_hr)) {
::OleUninitialize();
STRACE(_T("}OLE"));
}
}
operator bool() const { return TRUE==SUCCEEDED(m_hr); }
HRESULT m_hr;
};
What does this have to do with anything? Well, according to the docs OleInitialize
calls CoInitializeEx
, which is the COM initialization. Therefore, when using the native glass Windows application, COM remains initialized the whole time the application is running. Which explains why :
CoUninitialize(); /* NOT COOL BUT SEEMS TO WORK */
was indeed true, given that the person who wrote that was most likely testing using the native glass Windows "toolkit".
When WinApplication
is not created, and instead MonocleApplication
is used, there is no extra OLE/COM initialization and so when directwrite.cpp
calls CoUninitialize();
the IWICImagingFactory
pointer starts pointing to garbage. COM should not be uninitialized until we are sure that no objects instantiated by COM are going to be called. The safest thing would perhaps be to initialize COM on demand (as we are doing), and just call into native Windows code (perhaps by creating a windows
directory in native-glass/monocle/
) at the end of MonocleApplication.runLoop
which un-initializes COM. What are your thoughts on that approach?
Here is a very rough proof of concept of the idea I outlined briefly in my last comment. Basically, introduce a native windows library for monocle and use it to shut-down COM when the application is exiting. I can't think of any way to be able to check on the Monocle side of things if the DirectWrite factory has initialized the WIC factory (given that getWICFactory
has default visibility, and I doubt we want to change that) so, on Windows at least, an unnecessary call (if in fact the WIC factory was never initialized, which could happen depending on an application's usage of fonts) on shutdown to un-initialize COM should be harmless enough.
The current approach is to add a check to HeadlessPlatform
(the only possible monocle platform on Windows) to see if we are on Windows and if so, call a native _shutdown
method. It may make more sense to make a new class, HeadlessWindowsPlatform
, that extends from HeadlessPlatform
and implements the COM un-initialize on shutdown. Then we could have HeadlessPlatformFactory
return an instance of it if PlatformUtils.isWindows
is true. Not sure how else to do it, really.
Currently the code will not compile, I just want to get the idea of my approach more concrete, and you can let me know what you think.
The most direct way to fix this, in my opinion, is to have monocle shut down COM when the application is shutting down. Currently we don't have any native libraries on Windows for monocle, so that complicates the fix.
The changes to appveyor.yml
are only temporary to ensure that the approach actually works, and will be reverted. The change to include monocle by default on Windows (WIN.includeMonocle = true;
) is something that would be really nice to have for TestFX purposes at least (and including it by default on Windows should make no difference to JavaFX users because actually having JavaFX use Monocle requires glass.platform=Monocle
to be set). Additionally, Monocle to be included on Linux and macOS as well, so that eventually we can stop packaging a separate library for this purpose. But, we can discuss the default inclusion of Monocle separately from this issue, if you prefer.
I realized that it isn't feasible to use the variants
gradle feature because only one can be used at a time (either glass
or glass_monocle
), so I think I would have to either:
1.) Add a new nativeTarget
for Windows builds for a new native library (maybe called monocle_win.dll
or something)
2.) Add the CoUninitialize function to an already existing native library, but then how to access it from Monocle?
I think I have the problem debugged, just unclear about the approach to take.
If it turns out that you really do need a native utility method that is called by Monocle (I'm not yet sure about that), then adding a native method to call CoUninitialize should be pretty straight-forward.
I do not like the first approach. It seems overkill to create a native library for Monocle just for this. We faced a similar issue for the javafx.swing
module and ended up using an existing library in javafx.graphics. Presuming that you put your native method in glass.dll
, which is the obvious choice if using an existing library, the only thing you would need to enable accessing it from Monocle is this one statement in the initialization code, from within a doPrivileged:
NativeLibLoader.loadLibrary("glass");
You can look here for the pattern used by javafx.swing
. Note that in the case of javafx.swing, the load method was added to a utility class in javafx.graphics. Since Monocle is already in the javafx.graphics module, your utility method can be co-located in the class that has the native method.
My bigger concern in all of this isn't what happens in the Monocle case. We don't ship the Monocle classes, so only developers who explicitly build and enable it will ever run in this mode. I'm more worried about any changes that could effect the normal Glass / D3D pipeline mode of operation. In that normal case, we need to ensure that removing the call to CoUninitialize will not have any negative effects.
The theory I am operating under is that CoUninitialize
should never have been called that early. The IWICImagingFactory
was still used by the time it was called. The only reason that it ever worked was that COM stayed initialized on that thread because of the COM life-cycle in WinApplication
runLoop.
In order to make this movie from a theory to an assertion, we really need someone who knows about COM to chime in. Do you know of anyone that could give us a sanity check on this assumption?
I agree with the approach you mention. I will work on that.
One thing that would be useful, in case we can't find someone who has COM expertise, would be to look at the private changelog (before JavaFX was open-sourced) and see when the CoUninitialize()
method call was added. Was it at the same time as CoInitializeEx
? Was it always commented with the NOT COOL
comment? I would love to dig deeper on my end, but the history only goes back so far.
I came up with a new approach that doesn't make any changes to Monocle. Instead, we use a disposer mechanism with IWICImagingFactory
and we have it's dispose
method call the native method that un-initializes COM. Here is that approach.
I believe my assumption about COM staying initialized because of WinApplication's OLEInitialize
was wrong because the initializations happen on different threads and the threading mode is set to single, apartment based threading. Unless Windows is doing something tricky like keeping COM objects alive because the parent (which is our case, I believe, as OLEInitialize
is called from the main application thread provided by Windows) thread still has COM initialized, this was probably a false assumption.
Quick sanity check: does it even make sense that with prism.order=sw
and glass.platform=Monocle
that SWGraphics
creates DWGlyphs
(that is, DirectWrite glyphs)? I mean, why should DirectWrite be used at all when using the SW renderer in headless mode? Shouldn't it instead use some sort of "software-only" (that is, not GL or D2D,D3D) font renderer?
The Windows native font rendering relies on DirectWrite, so yes, DWGlyphs are needed even in SW headless mode.
Hi @brcolow , I've been taking a look at this issue as well.
The theory I am operating under is that
CoUninitialize
should never have been called that early. TheIWICImagingFactory
was still used by the time it was called.
I'm no expert in COM either, but I do suspect that this is correct reason.
The crash occurs at:
000007FED842448F call qword ptr [rax+88h]
with an access violation from reading the memory location rax + 88h
. rax
contains the vtable location (The value of ((IWICImagingFactory *)arg0)->__vfptr
), so my hypothesis is that the location where the vtable resided became unmapped.
To confirm my hypothesis, I stepped through _1WICCreateImagingFactory
in the debugger:
IWICImagingFactory* result = NULL;
hr = CoCreateInstance(
CLSID_WICImagingFactory,
NULL,
CLSCTX_INPROC_SERVER,
IID_PPV_ARGS(&result)
);
At this point, result
pointer to a valid instance of IWICImagingFactory
, and its vtable was located at WindowsCodecs.dll!0x000007fefa037020
, which leads me to guess that WindowsCodecs.dll
was somehow unloaded, perhaps by CoUninitialize()
.
Comparing the list of loaded DLLs before and after the CoUninitialize()
call confirms my suspicion:
$ diff before.txt after.txt
89d88
< WindowsCodecs.dll WindowsCodecs.dll C:\Windows\System32\WindowsCodecs.dll N/A No Cannot find or open the PDB file. 95 6.2.9200.21830 (win8_ldr.160407-0600) 8/4/2016 1:57 AM 000007FEFA010000-000007FEFA171000 [8416] java.exe
Which means that the DLL WindowsCodecs.dll
was unloaded by the CoUninitialize()
call.
I believe my assumption about COM staying initialized because of WinApplication's
OLEInitialize
was wrong because the initializations happen on different threads and the threading mode is set to single, apartment based threading.
I don't think this disproves the hypothesis. The COM initialization on the other thread may cause it to hold a reference to WindowsCodecs.dll
, and since DLL reference counts are process-wide, WindowsCodecs.dll
is thus not unloaded when CoUninitialize()
is called.
@pyokagan Really glad to get someone else involved in this. So is it the case that with Monocle WinApplication
's OLEInitialize
is never called and therefore, when COM is uninitialized in directwrite.cpp
, the DLL reference count for WindowsCodecs.dll goes to 0 and thus IWICImagingFactory becomes garbage and the only reason that the early un-initialization of COM worked before was because WinApplication
's OLEInitialize
kept the reference count non-zero and thus IWICImagingFactory was never invalidated?
Do you have any ideas about the best fix for this issue? That is, what would be the best way to un-initialize COM at the right time? I tried using a disposer
mechanism but it didn't seem to work. We could un-initialize COM at application shutdown, but that seems a bit heavy-handed.
@brcolow
Really glad to get someone else involved in this. So is it the case that with Monocle
WinApplication
'sOLEInitialize
is never called and therefore, when COM is uninitialized indirectwrite.cpp
, the DLL reference count for WindowsCodecs.dll goes to 0 and thus IWICImagingFactory becomes garbage and the only reason that the early un-initialization of COM worked before was becauseWinApplication
'sOLEInitialize
kept the reference count non-zero and thus IWICImagingFactory was never invalidated?
Yes, that's the gist of it. In fact, we can actually confirm this by attaching windbg to the process to look at the load counts of loaded DLLs.
At the point just before CoUninitialize()
is called, when we are not using monocle, the load count of WindowsCodecs.dll
is 2:
Whereas when we are using monocle, the load count of WindowsCodecs.dll
is 1:
So this is very likely what is happening. Indeed, CoUninitialize()
should not have been called that early, and the only reason the non-monocle case doesn't crash is just a happy accident.
Do you have any ideas about the best fix for this issue? That is, what would be the best way to un-initialize COM at the right time? I tried using a
disposer
mechanism but it didn't seem to work.
Are you referring to https://github.com/brcolow/openjdk-jfx/pull/10/files ? I don't quite understand what's going on in the diff unfortunately :confused: I can't really see where the disposer mechanism is being used.
We could un-initialize COM at application shutdown, but that seems a bit heavy-handed.
I'm not familiar enough with the codebase to be able to make a confident suggestion, but from what I can tell the lifetime of IWICImagingFactory
is tied to the lifetime of the QuantumRenderer thread? In that case I suppose the correct thing to do would be to call CoUninitialize()
when the QuantumRenderer thread is shutting down, if IWICImagingFactory
was previously initialized. Something like this?
diff --git a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/FontFactory.java b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/FontFactory.java
index 46100cd..05271e6 100644
--- a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/FontFactory.java
+++ b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/FontFactory.java
@@ -132,4 +132,6 @@ public interface FontFactory {
float size, boolean register, boolean all);
public boolean isPlatformFont(String name);
+
+ public void dispose();
}
diff --git a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/PrismFontFactory.java b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/PrismFontFactory.java
index 3717d74..e68f0b9 100644
--- a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/PrismFontFactory.java
+++ b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/PrismFontFactory.java
@@ -2002,4 +2002,7 @@ public abstract class PrismFontFactory implements FontFactory {
/* Called from PrismFontFile which caches the return value */
static native short getSystemLCID();
+
+ @Override
+ public void dispose() {}
}
diff --git a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/DWFactory.java b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/DWFactory.java
index 431eb58..2e3f7a0 100644
--- a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/DWFactory.java
+++ b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/DWFactory.java
@@ -133,4 +133,14 @@ public class DWFactory extends PrismFontFactory {
}
return D2D_FACTORY;
}
+
+ @Override
+ public synchronized void dispose() {
+ checkThread();
+ if (WIC_FACTORY != null) {
+ OS.WICDestroyImagingFactory();
+ WIC_FACTORY = null;
+ }
+ super.dispose();
+ }
}
diff --git a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/OS.java b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/OS.java
index d49149e..d84119d 100644
--- a/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/OS.java
+++ b/modules/javafx.graphics/src/main/java/com/sun/javafx/font/directwrite/OS.java
@@ -175,6 +175,11 @@ class OS {
return ptr != 0 ? new IWICImagingFactory(ptr) : null;
}
+ private static final native void _WICDestroyImagingFactory();
+ static final void WICDestroyImagingFactory() {
+ _WICDestroyImagingFactory();
+ }
+
private static final native long _NewJFXTextAnalysisSink(char[] text,
int start,
int length,
diff --git a/modules/javafx.graphics/src/main/java/com/sun/prism/GraphicsPipeline.java b/modules/javafx.graphics/src/main/java/com/sun/prism/GraphicsPipeline.java
index f10d819..31991a8 100644
--- a/modules/javafx.graphics/src/main/java/com/sun/prism/GraphicsPipeline.java
+++ b/modules/javafx.graphics/src/main/java/com/sun/prism/GraphicsPipeline.java
@@ -59,6 +59,10 @@ public abstract class GraphicsPipeline {
public abstract boolean init();
public void dispose() {
+ if (fontFactory != null) {
+ fontFactory.dispose();
+ fontFactory = null;
+ }
installedPipeline = null;
}
diff --git a/modules/javafx.graphics/src/main/java/com/sun/prism/j2d/J2DFontFactory.java b/modules/javafx.graphics/src/main/java/com/sun/prism/j2d/J2DFontFactory.java
index 21a1647..6e1c9fc 100644
--- a/modules/javafx.graphics/src/main/java/com/sun/prism/j2d/J2DFontFactory.java
+++ b/modules/javafx.graphics/src/main/java/com/sun/prism/j2d/J2DFontFactory.java
@@ -237,4 +237,7 @@ final class J2DFontFactory implements FontFactory {
return srcFont;
}
+
+ @Override
+ public void dispose() {}
}
diff --git a/modules/javafx.graphics/src/main/native-font/directwrite.cpp b/modules/javafx.graphics/src/main/native-font/directwrite.cpp
index 96e751c..8507109 100644
--- a/modules/javafx.graphics/src/main/native-font/directwrite.cpp
+++ b/modules/javafx.graphics/src/main/native-font/directwrite.cpp
@@ -867,11 +867,14 @@ JNIEXPORT jlong JNICALL OS_NATIVE(_1WICCreateImagingFactory)
IID_PPV_ARGS(&result)
);
- /* Unload COM as no other COM objects will be create directly */
- CoUninitialize();
return SUCCEEDED(hr) ? (jlong)result : NULL;
}
+JNIEXPORT void JNICALL OS_NATIVE(_1WICDestroyImagingFactory) (JNIEnv *env, jclass that)
+{
+ CoUninitialize();
+}
+
JNIEXPORT jlong JNICALL OS_NATIVE(_1D2D1CreateFactory)
(JNIEnv *env, jclass that, jint arg0)
{
@pyokagan LGTM. You want to open a PR for that?
@brcolow On second thought, I'm not so confident in that solution, since it is fairly intrusive by adding a new dispose()
method to the FontFactory
interface, which all implementing classes would then need to implement. It's also hard to do right, there are some instances in the codebase where the FontFactory
is retrieved and then saved in an instance field:
I do get the feeling that FontFactory
is meant to outlive the QuantumRenderer thread, and so it's perhaps best not to violate this design.
I have an alternate solution, which is much less intrusive. Fibre local storages allow us to call callbacks on thread exit, so perhaps we could use it to call CoUninitialize()
when the QuantumRender thread exits?
diff --git a/modules/javafx.graphics/src/main/native-font/directwrite.cpp b/modules/javafx.graphics/src/main/native-font/directwrite.cpp
index 96e751c75d..cfbfca558a 100644
--- a/modules/javafx.graphics/src/main/native-font/directwrite.cpp
+++ b/modules/javafx.graphics/src/main/native-font/directwrite.cpp
@@ -845,6 +845,11 @@ jobject newD2D1_MATRIX_3X2_F(JNIEnv *env, D2D1_MATRIX_3X2_F *lpStruct)
/* */
/**************************************************************************/
+static void coUninitializeCallback(void *data)
+{
+ CoUninitialize();
+}
+
JNIEXPORT jlong JNICALL OS_NATIVE(_1WICCreateImagingFactory)
(JNIEnv *env, jclass that)
{
@@ -859,6 +864,12 @@ JNIEXPORT jlong JNICALL OS_NATIVE(_1WICCreateImagingFactory)
* This should never happen. */
if (hr == RPC_E_CHANGED_MODE) return NULL;
+ /* Setup a fibre local storage slot to call coUninitializeCallback() on thread exit. */
+ DWORD flsindex = FlsAlloc(coUninitializeCallback);
+ if (flsindex == FLS_OUT_OF_INDEXES) return NULL;
+ /* A non-NULL value must be set so that the callback will be called on thread exit. */
+ FlsSetValue(flsindex, (void*)1);
+
IWICImagingFactory* result = NULL;
hr = CoCreateInstance(
CLSID_WICImagingFactory,
@@ -867,8 +878,6 @@ JNIEXPORT jlong JNICALL OS_NATIVE(_1WICCreateImagingFactory)
IID_PPV_ARGS(&result)
);
- /* Unload COM as no other COM objects will be create directly */
- CoUninitialize();
return SUCCEEDED(hr) ? (jlong)result : NULL;
}
The advantage of this approach is that the changes are very small and entirely localized within directwrite.cpp
. FlsAlloc
is a pretty exotic API though (and only available on Windows Vista and up), I'm not sure if this would cause any trouble.
Alternatively, perhaps we could just remove the CoUninitialize()
call? Since IWICImagingFactory
is only created once in the entire lifetime of the process, any leaks should not be that bad...
Any thoughts?
I really like the all-native solution - I think that's really clever. I was looking for something like that in the Windows API but didn't find it. IIRC JavaFX only supports Windows Vista and up so that shouldn't be a problem. Removing the call to CoUninitialize
would indeed only introduce a really small leak (if the OS didn't automatically take care of it when the process is wholly terminated) but I think Kevin would be hesitant to approve a leaky solution.
Would you have time to open a PR for the solution? It will require filling out the Oracle Contributor Agreement (OCA). It is your solution so you should get credit 👍. After doing a little more research the Fiber-Local Storage API is a really good choice. I think it is only being used in it's thread-local storage capacity because no fibers are created. Because the thread-local storage API is older and harder to work with (not so easy to set a callback on thread exit), and because on newer versions TLS functions are in-lined to FLS functions, I think it's a wise choice and not too exotic.
@brcolow
Would you have time to open a PR for the solution?
OK I'll work on it.
Just to complete the picture, this is the stacktrace of the other callsite that loads WindowsCodecs.dll
the second time in the non-monocle case:
ModLoad: 00007ffa`ca180000 00007ffa`ca32e000 C:\Windows\SYSTEM32\WindowsCodecs.dll
# Child-SP RetAddr Call Site
00 00000003`955f9148 00007ffa`d1a2bea5 ntdll!NtMapViewOfSection+0x14
01 00000003`955f9150 00007ffa`d1a2bbf0 ntdll!LdrpMinimalMapModule+0xed
02 00000003`955f91d0 00007ffa`d1a432ce ntdll!LdrpMapDllWithSectionHandle+0x14
03 00000003`955f9230 00007ffa`d1a4231a ntdll!LdrpMapDllNtFileName+0x18a
04 00000003`955f9300 00007ffa`d1a4276b ntdll!LdrpMapDllSearchPath+0x1de
05 00000003`955f9560 00007ffa`d1a383da ntdll!LdrpProcessWork+0x83
06 00000003`955f95d0 00007ffa`d1a41a3d ntdll!LdrpLoadDllInternal+0x13e
07 00000003`955f9650 00007ffa`d1a418b6 ntdll!LdrpLoadForwardedDll+0x129
08 00000003`955f9960 00007ffa`d1a0f387 ntdll!LdrpGetDelayloadExportDll+0xa2
09 00000003`955f9a70 00007ffa`d1a22d96 ntdll!LdrpHandleProtectedDelayload+0x87
0a 00000003`955fa030 00007ffa`c7845871 ntdll!LdrResolveDelayLoadedAPI+0xc6
0b 00000003`955fa0c0 00007ffa`c784d24f COMCTL32!_delayLoadHelper2+0x31
0c 00000003`955fa100 00007ffa`c781e853 COMCTL32!_tailMerge_windowscodecs_dll+0x3f
0d 00000003`955fa170 00007ffa`c7820315 COMCTL32!CImageList::Initialize+0x253
0e 00000003`955fa220 00007ffa`d02b0525 COMCTL32!CSparseImageList::Initialize+0x55
0f 00000003`955fa270 00007ffa`d02b155a SHELL32!_InitSysImageLists+0xd5
10 00000003`955fa330 00007ffa`d04079e9 SHELL32!FileIconInitInternal+0x2ba
11 00000003`955fa3e0 00007ffa`d0408a85 SHELL32!_InitDragHelper+0x19
12 00000003`955fa410 00007ffa`cfc4a6d1 SHELL32!CDragDropHelper_CreateInstance+0x15
13 00000003`955fa440 00007ffa`cfc5844f combase!CServerContextActivator::CreateInstance+0x201 [onecore\com\combase\objact\actvator.cxx @ 878]
14 00000003`955fa5c0 00007ffa`cfc481c1 combase!ActivationPropertiesIn::DelegateCreateInstance+0xef [onecore\com\combase\actprops\actprops.cxx @ 1906]
15 00000003`955fa650 00007ffa`cfc4af22 combase!CApartmentActivator::CreateInstance+0xc1 [onecore\com\combase\objact\actvator.cxx @ 2169]
16 00000003`955fa700 00007ffa`cfc4ac80 combase!CProcessActivator::CCICallback+0x72 [onecore\com\combase\objact\actvator.cxx @ 1637]
17 00000003`955fa740 00007ffa`cfc4ad3e combase!CProcessActivator::AttemptActivation+0x40 [onecore\com\combase\objact\actvator.cxx @ 1524]
18 00000003`955fa790 00007ffa`cfc4b26b combase!CProcessActivator::ActivateByContext+0xae [onecore\com\combase\objact\actvator.cxx @ 1368]
19 00000003`955fa820 00007ffa`cfc5840d combase!CProcessActivator::CreateInstance+0x8b [onecore\com\combase\objact\actvator.cxx @ 1268]
1a 00000003`955fa870 00007ffa`cfc55dda combase!ActivationPropertiesIn::DelegateCreateInstance+0xad [onecore\com\combase\actprops\actprops.cxx @ 1906]
1b 00000003`955fa900 00007ffa`cfc58417 combase!CClientContextActivator::CreateInstance+0x14a [onecore\com\combase\objact\actvator.cxx @ 567]
1c 00000003`955fabb0 00007ffa`cfc5fa28 combase!ActivationPropertiesIn::DelegateCreateInstance+0xb7 [onecore\com\combase\actprops\actprops.cxx @ 1958]
1d 00000003`955fac40 00007ffa`cfc5ee79 combase!ICoCreateInstanceEx+0xa38 [onecore\com\combase\objact\objact.cxx @ 1959]
1e 00000003`955fbac0 00007ffa`cfc5ec83 combase!CComActivator::DoCreateInstance+0x169 [onecore\com\combase\objact\immact.hxx @ 385]
*** WARNING: Unable to verify checksum for \openjdk-jfx\build\sdk\bin\glass.dll
1f (Inline Function) --------`-------- combase!CoCreateInstanceEx+0x88 [onecore\com\combase\objact\actapi.cxx @ 177]
20 00000003`955fbbe0 00007ffa`ae2d655e combase!CoCreateInstance+0xc3 [onecore\com\combase\objact\actapi.cxx @ 121]
21 00000003`955fbc80 00007ffa`ae2eb2cf glass!GlassDropTarget::GlassDropTarget+0x9e [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\glassdnd.cpp @ 44]
22 00000003`955fbce0 00007ffa`ae2dee16 glass!ViewContainer::InitDropTarget+0x4f [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\viewcontainer.cpp @ 109]
23 00000003`955fbd50 00007ffa`ae2e4e7a glass!GlassWindow::Create+0x126 [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\glasswindow.cpp @ 154]
24 00000003`955fbde0 00007ffa`ae2e3cb3 glass!`Java_com_sun_glass_ui_win_WinWindow__1createWindow'::`2'::_MyAction::_UserDo+0x26a [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\glasswindow.cpp @ 1270]
25 00000003`955fbea0 00007ffa`ae2bf17b glass!`Java_com_sun_glass_ui_win_WinWindow__1createWindow'::`2'::_MyAction::Do+0x13 [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\glasswindow.cpp @ 1220]
26 00000003`955fbed0 00007ffa`ae2b1814 glass!GlassApplication::WindowProc+0xcb [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\glassapplication.cpp @ 150]
27 00000003`955fbf40 00007ffa`d1726d41 glass!BaseWnd::StaticWindowProc+0xa4 [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\basewnd.cpp @ 162]
28 00000003`955fbfa0 00007ffa`d172634e USER32!UserCallWinProcCheckWow+0x2c1
29 00000003`955fc130 00007ffa`d17260d8 USER32!SendMessageWorker+0x21e
2a 00000003`955fc1c0 00007ffa`ae2bea87 USER32!SendMessageW+0xf8
2b 00000003`955fc220 00007ffa`ae2e2280 glass!GlassApplication::ExecAction+0x37 [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\glassapplication.cpp @ 288]
2c 00000003`955fc250 000001ee`e13ef0b7 glass!Java_com_sun_glass_ui_win_WinWindow__1createWindow+0x60 [\openjdk-jfx\modules\javafx.graphics\src\main\native-glass\win\glasswindow.cpp @ 1297]
2d 00000003`955fc2c0 000001ee`fdadeb40 0x000001ee`e13ef0b7
2e 00000003`955fc2c8 00000003`955fc3a0 0x000001ee`fdadeb40
2f 00000003`955fc2d0 00000000`00000000 0x00000003`955fc3a0
Hi! I have the same problem with headless TestFX Unit Tests and open jdk 11 on a Windows 10 System
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ffa1cb74879, pid=3804, tid=12072
#
# JRE version: OpenJDK Runtime Environment (11.0.1+13) (build 11.0.1+13)
# Java VM: OpenJDK 64-Bit Server VM (11.0.1+13, mixed mode, tiered, compressed oops, g1 gc, windows-amd64)
# Problematic frame:
# C [javafx_font.dll+0x4879]
Just for my understanding:
Will this be fixed in OpenJFX? Or is this an open jdk issue?
This is the same bug - @pyokagan is currently working on a PR. He has already posted a fix above so hopefully the PR will be opened, reviewed, and merged soon. It won't get in till OpenJFX 13, but that's the way the cookie crumbles. Also there should be no major obstacles to upgrading JavaFX versions like there was from 9 -> 10 so it should be quite painless.
It won't get in till OpenJFX 13
There is still a little over 3 weeks to get bug fixes into OpenJFX 12, although it will likely take some time to review this.
Speaking of which, my first thought is that an explicit dispose method that hooks into the existing FontDisposer mechanism might be a better way to go, but I haven't looked in detail.
I took a closer look and the Disposer mechanism won't work, since the cleanup would be on a singleton factory object that is kept in a static field.
Presuming that the change to avoid calling CoUninitialize too early is the right fix (which seems likely), then I see three approaches:
Add an explicit dispose method to FontFactory as @brcolow suggested above, and call it "at the right time", possibly by adding a Toolkit shutdown hook when the font factory is created. This will be called when the JavaFX runtime exits.
Use a native-only solution as @pyokagan suggested (which is effectively a native shutdown hook).
Remove the call to CoUninitalize
altogether (this doesn't seem like a good idea).
I think I prefer option 1, even though it is a little more intrusive.
@kevinrushforth
Add an explicit dispose method to FontFactory as @brcolow suggested above, and call it "at the right time", possibly by adding a Toolkit shutdown hook when the font factory is created. This will be called when the JavaFX runtime exits.
If I'm reading the source code correctly, shutdown hooks (notifyShutdownHooks()
) would only be called in the application thread.
CoUninitialize()
needs to be called on the thread which called CoInitializeEx()
, which in this case is the QuantumRenderer
thread. This makes things significantly trickier. I guess the calling of the dispose method would need to be called by QuantumRenderer
or GraphicsPipeline
.
I have a patch in https://github.com/javafxports/openjdk-jfx/issues/66#issuecomment-450498291 which makes GraphicsPipeline
call FontFactory#dispose()
. I wan't so confident about the patch though, for the following reasons:
It could be possible that PrismFontFactory#getFontFactory()
could be called (as it's called by other classes) without GraphicsPipeline#getFontFactory()
being called, which would still lead to a resource leak. The users of PrismFontFactory#getFontFactory()
could all be switched to GraphicsPipeline#getFontFactory()
, but I'm unsure if that's a good thing because...
PrismFontFactory
calls GraphicsPipeline#getFontFactory()
. I'm wondering why it's using reflection in the first place, and if the performance penalty to always doing a reflective lookup would be acceptable. It seems like code in the com.sun.javafx.font
package tries to avoid having a direct dependency on com.sun.prism
(presumably to avoid a circular dependency?). Git history doesn't say anything about it.
FontFactory
to have a dispose method, and GraphicsPipeline to call it, and it only having an effect for DTFactory
and not for any of the other FontFactory
implementations (CTFactory
and FTFactory
). (Should it dispose resources used by FontFiles
as well, for example? :thinking: )It also didn't seem right to me for
FontFactory
to have a dispose method, and GraphicsPipeline to call it, and it only having an effect forDTFactory
and not for any of the otherFontFactory
implementations (CTFactory
andFTFactory
). (Should it dispose resources used byFontFiles
as well, for example? thinking )
Scratch that, it doesn't need to be a full disposal method. It could be a simple notifyRelease()
method which just notifies the FontFactory
that the GraphicsPipeline
is shutting down, and it is up to the FontFactory
to perform whatever internal cleanup it deems necessary.
I think I prefer option 1, even though it is a little more intrusive.
Generally speaking I do think that option 1 would be more ideal as well, since it makes the cleanup more explicit (and not using an exotic Win32 API is a nice bonus).
After adding a Java 10 build to our (TestFX) appveyor configuration, I am seeing a crash (every time at the same place):
https://ci.appveyor.com/project/testfx/testfx/build/master%201046/job/3pl22fioi0ut9juc#L492
It is triggered by this test.
I believe it is crashing in modules/javafx.graphics/src/main/native-font/directwrite.cpp in the
JNIEXPORT jlong JNICALL OS_NATIVE(CreateBitmap)
method. This is somewhat strange because that file hasn't been modified since the javafx-font and javafx-font-native modules were open sourced (as RT-31139) so figuring out why this would precipitate a crash now in Java 10 is probably non-trivial.Here is the crash dump: