caprica / vlcj

Java framework for the vlc media player
http://www.capricasoftware.co.uk/projects/vlcj
1.15k stars 260 forks source link

Issues when using VLC 4.4.0 with OpenJDK 11+ and OpenJFX 12+ on Linux with GTK3 and Wayland #929

Closed xinyingho closed 4 years ago

xinyingho commented 4 years ago

I embed VLCj into my OpenJFX application using the EmbeddedMediaPlayer class. And I've come into issues because of VLCj and OpenJFX.

So first, some context is needed to understand the issue at hand: OpenJFX 11 migrated from GTK2 to GTK3 for its internal OS native operations but this raised a lot of issues on Linux distros (Fedora 31, Kubuntu 18.04/19.10) imposing Wayland as a graphical backend instead of X11. Indeed, OpenJFX doesn't have any Wayland compatible layers yet, and it's not within short-term Oracle plans to provide one. So, they did a quickwin: they added a hack to force GTK3 to work in X11-compatible mode by using XWayland.

Everything should be good but there are some issues creeping out when using VLCj on those Linux distros: using FileChooser to select a file immediately crashes the whole JVM because of a thread concurrency issue coming from GTK3... The source of this bug is because VLCj always initialises an AWT thread on Linux while this is not needed at all in my case. I suppose that this AWT thread forces all the JVM to work with Wayland and thus OpenJFX gets screwed over when it tries to run its hacky ways.

The fix is really easy: entirely remove the static init sequence from the MediaPlayerFactory class. LinuxNativeInit.init() creates a useless AWT thread and forces some X11 initialisation processes that are also not needed anymore (probably something that got fixed along the way OpenJFX-side).

caprica commented 4 years ago

You're assuming that the LinuxNativeInit.init() is not needed in use-cases other than your own.

Not every use-case of vlcj is an OpenJFX use-case.

As an example of a different use-case, without doing that initialisation in a Swing application, LibVLC will complain and can start behaving erratically.

It's not needed in your case, that's why as per the documentation, there's a VLCJ_INITX System property that can be used to suppress this behaviour.

So it needs a more robust solution than just saying remove the static init sequence.

xinyingho commented 4 years ago

Off course, my fix is only good for my particular use case :) That's why I didn't propose a PR for this particular issue as I don't know why the Linux global init sequence was implemented at first.

caprica commented 4 years ago

Using that system property should help in your use-case.

This issue actually shows itself in other places, I think this is related: #923, #472 and maybe some others.

caprica commented 4 years ago

I would definitely like to find a holistic solution to this issue, but it's like whack-a-mole - fix one use-case, another breaks.

xinyingho commented 4 years ago

lol, I like the whack-a-mole metaphor.

VLCJ_INITX only disabled the X11 pre-emptive init sequence but not the AWT part. Maybe another system property would be good enough?

caprica commented 4 years ago

OK, I'll take a look at it.

caprica commented 4 years ago

An example of why this static initialisation is needed, without it you get neverending log spam:

../include/vlc_xlib.h:46:vlc_xlib_init: Xlib not initialized for threads.
This process is probably using LibVLC incorrectly.
Pass "--no-xlib" to libvlc_new() to fix this.
caprica commented 4 years ago

664 seems to be implicated here.

caprica commented 4 years ago

@xinyingho are you sure it's the AWT initialisation that's the problem? is the XInitThread call ok?

xinyingho commented 4 years ago

I don't think that having libX11 gets initialised or not is any different in my case.

But yes, I'm sure that the issue is caused by the AWT initialisation: when I remove every dependencies on VLCj, FileChooser works fine on Wayland enabled Linux distros. This is also the case when I keep VLCj but disable the AWT init.

This issue is back when I enable the AWT init. Anyway, this AWT init is useless when using the EmbeddedMediaPlayer with OpenJFX/JavaFX and, since OpenJFX 11, is actually harmful. So, it's really proper to have a way to enable and disable it.

I don't have a simple use case ready for you to test other than testing my F/E for MAME called Negatron. Maybe you can test this with the JavaFX sample on Fedora or Kubuntu, and try to add a button calling FileChooser.show().

I really explained all the technical implications in the first post. VLC is working fine actually but activating this AWT thread has bad side-effects on stuff normally unrelated.

caprica commented 4 years ago

I have a test-case, I just wanted more information ;-)

From what I've seen, it should be possible to remove the AWT initialisation.

If I remove the XInitThreads I see strange errors in my native VLC logs, so I'm reluctant to remove that, hence my question.

caprica commented 4 years ago

More data...

Removing XInitThreads() results in this:

libva info: VA-API version 1.1.0
libva info: va_getDriverName() returns -1
libva error: va_getDriverName() failed with unknown libva error,driver_name=(null)
libva info: VA-API version 1.1.0
libva info: va_getDriverName() returns -1
libva error: va_getDriverName() failed with unknown libva error,driver_name=(null)
../include/vlc_xlib.h:51:vlc_xlib_init: Xlib not initialized for threads.
This process is probably using LibVLC incorrectly.
Pass "--no-xlib" to libvlc_new() to fix this.
../include/vlc_xlib.h:51:vlc_xlib_init: Xlib not initialized for threads.

If I do add "--no-xlib", it appears to work but the libva errors are still emitted.

caprica commented 4 years ago

More data...

If I execute this code:

        libvlc_instance_t instance = libvlc_new(0, new StringArray(new String[0]));

        libvlc_media_player_t mediaPlayer = libvlc_media_player_new(instance);
        libvlc_media_t media = libvlc_media_new_path(instance, args[0]);
        libvlc_media_player_set_media(mediaPlayer, media);

        libvlc_media_player_play(mediaPlayer);

No error.

But, if I cause AWT to be initialised, the error does get emitted, e.g.:

        Window window = new Frame();

        libvlc_instance_t instance = libvlc_new(0, new StringArray(new String[0]));

        libvlc_media_player_t mediaPlayer = libvlc_media_player_new(instance);
        libvlc_media_t media = libvlc_media_new_path(instance, args[0]);
        libvlc_media_player_set_media(mediaPlayer, media);

        libvlc_media_player_play(mediaPlayer);

It has to be a Window/Frame/JFrame to trigger the error, just creating e.g. a Canvas will not.

xinyingho commented 4 years ago

Ok, I did a real testing with "VLCJ_INITX" set to "no". And indeed, it fixes the bug...

It looks like I was so set with Wayland vs X11 bugs in OpenJFX that I didn't properly test this system property... So, yes, you were right, the AWT thread is actually not harmful here depsite being useless in my use case. Sorry for not listening more to your advice.

caprica commented 4 years ago

Well, that's good, but there is a real problem here that I would like to get to the bottom of - this issue crops up once in a while, and hard crashes are really not nice.

xinyingho commented 4 years ago

Ok :) Let me know if you need more help on this.

caprica commented 4 years ago

More random data collection...

It it the GLX video output module that reports this "Xlib not initialized for threads" error.

VLC does this on a sucess:

interopvlc
vlc_xlib_init()
_Xglobal_lock 0
_XErrorFunction 0
it calls XInitThreads() because global lock is null and error function is null
returning true from vlc_xlib_init

VDPAU
vlc_xlib_init()
_Xglobal_lock 7f2da8d6b640
_XErrorFunction 7f2da8a747a0
it calls XInitThreads() because global lock is not null
returning true from vlc_xlib_init

glx
vlc_xlib_init()
_Xglobal_lock 7f2da8d6b640
_XErrorFunction 7f2da8a747a0
it calls XInitThreads() because global lock is not null
returning true from vlc_xlib_init

On a failure:

interopvlc_xlib_init()
_Xglobal_lock 0
_XErrorFunction 7fa7a89e6530
it fails because lock is null and error function is not null
returning false from vlc_xlib_init

vaaapi
vlc_xlib_init()
_Xglobal_lock 0
_XErrorFunction 7fa7a89e6530
it fails because lock is null and error function is not null
returning false from vlc_xlib_init

interopvlc_xlib_init()
_Xglobal_lock 0
_XErrorFunction 7fa7a89e6530
it fails because lock is null and error function is not null
returning false from vlc_xlib_init

vaaapi
vlc_xlib_init()
_Xglobal_lock 0
_XErrorFunction 7fa7a89e6530
it fails because lock is null and error function is not null
returning false from vlc_xlib_init

EGL
vlc_xlib_init()
_Xglobal_lock 0
_XErrorFunction 7fa7a89e6530
it fails because lock is null and error function is not null
returning false from vlc_xlib_init

glx
vlc_xlib_init()
_Xglobal_lock 0
_XErrorFunction 7fa7a89e6530
it fails because lock is null and error function is not null
returning false from vlc_xlib_init

So in the case where AWT starts, and by the time the VLC code executes, the _Xglobal_lock is null, but there is an error function (presuambly AWT installs that error function, but that's just a guess).

caprica commented 4 years ago

When adding XInitThreads() to the above example code:

interopvlc_xlib_init()
_Xglobal_lock 7f6a448cf640
_XErrorFunction 7f6a076dd530
returning true from vlc_xlib_init

VDPAU
vlc_xlib_init()
_Xglobal_lock 7f6a448cf640
_XErrorFunction 7f6a076dd530
returning true from vlc_xlib_init

glx
vlc_xlib_init()
_Xglobal_lock 7f6a448cf640
_XErrorFunction 7f6a076dd530
returning true from vlc_xlib_init
caprica commented 4 years ago

It is not possible to null out the XErrorHandler, it translates null to be "default" instead.

caprica commented 4 years ago

Summary so far, there's some issue with the JVM/JDK setting an XErrorHandler that VLC does not expect to be there during its own initialisation.

I don't think anything can be done about that.

So currently we're stuck the VLCJ_INITX flag for whatever use case there is.

I think I can remove the AWT forced initialisation, but it's not really necessary. My recent tests have shown no problems if I remove the AWT initialisation.

The only other idea I had was to play a video clip with 1 frame in it, in a 1x1 VLC video output window, before creating an AWT Frame/JFrame - this is enough to initialise VLC properly (so it seems). This could be done once during media player factory creation. After that, it seems to be fine. But this is a pretty horrible thing to do IMO. It also enforces an ordering on library users to initialise the library before making any AWT initialisations.

There's no good overall solution here, which is why I'm punting it into the long grass for now.

caprica commented 4 years ago

What VLC expects...

    if (_Xglobal_lock == NULL && unlikely(_XErrorFunction != NULL))
        /* (_Xglobal_lock == NULL) => Xlib threads not initialized */
        /* (_XErrorFunction != NULL) => Xlib already in use */
        fprintf (stderr, "%s:%u:%s: Xlib not initialized for threads.\n"
                 "This process is probably using LibVLC incorrectly.\n"
                 "Pass \"--no-xlib\" to libvlc_new() to fix this.\n",
                 __FILE__, __LINE__, __func__);
    else if (XInitThreads ())
        ok = true;

If the _Xglobal_lock is null, and the error function is not null, LibVLC thinks there's a problem.

In the LibX sources, _Xglobal_lock is only initialised in "libx11/src/locking.c" in the XInitThreads method:

Status XInitThreads(void)
{
    if (_Xglobal_lock)
    return 1;

    // ...

    _Xglobal_lock = &global_lock;
    xmutex_init(_Xglobal_lock->lock);
    xmutex_set_name(_Xglobal_lock->lock, "Xlib global");

This is why calling XInitThreads in vlcj's initialisation works.

The problem then is what is the JVM doing when you start up AWT that causes it to hard fail? And why?

All I know so far is the JVM installs an Xlib error handler, without the _Xglobal_lock being initialised, and VLC says "nuh, uh".

caprica commented 4 years ago

With XInitThreads in a JavaFX application, this is the JVM crash log, part of it at least:

Current thread (0x00007fde10645800):  JavaThread "JavaFX Application Thread" [_thread_in_native, id=24434, stack(0x00007fdd8635a000,0x00007fdd8645b000)]

Stack: [0x00007fdd8635a000,0x00007fdd8645b000],  sp=0x00007fdd86453988,  free space=998k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libpthread.so.0+0x9fa0]  pthread_mutex_lock+0x0

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.sun.glass.ui.gtk.GtkCommonDialogs._showFileChooser(JLjava/lang/String;Ljava/lang/String;Ljava/lang/String;IZ[Lcom/sun/glass/ui/CommonDialogs$ExtensionFilter;I)Lcom/sun/glass/ui/CommonDialogs$FileChooserResult;+0 javafx.graphics
j  com.sun.glass.ui.gtk.GtkCommonDialogs.showFileChooser(Lcom/sun/glass/ui/Window;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;IZ[Lcom/sun/glass/ui/CommonDialogs$ExtensionFilter;I)Lcom/sun/glass/ui/CommonDialogs$FileChooserResult;+32 javafx.graphics
j  com.sun.glass.ui.gtk.GtkApplication.staticCommonDialogs_showFileChooser(Lcom/sun/glass/ui/Window;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;IZ[Lcom/sun/glass/ui/CommonDialogs$ExtensionFilter;I)Lcom/sun/glass/ui/CommonDialogs$FileChooserResult;+13 javafx.graphics
j  com.sun.glass.ui.CommonDialogs.showFileChooser(Lcom/sun/glass/ui/Window;Ljava/io/File;Ljava/lang/String;Ljava/lang/String;IZLjava/util/List;I)Lcom/sun/glass/ui/CommonDialogs$FileChooserResult;+121 javafx.graphics
j  com.sun.javafx.tk.quantum.QuantumToolkit.showFileChooser(Lcom/sun/javafx/tk/TKStage;Ljava/lang/String;Ljava/io/File;Ljava/lang/String;Lcom/sun/javafx/tk/FileChooserType;Ljava/util/List;Ljavafx/stage/FileChooser$ExtensionFilter;)Lcom/sun/glass/ui/CommonDialogs$FileChooserResult;+72 javafx.graphics
j  javafx.stage.FileChooser.showDialog(Ljavafx/stage/Window;Lcom/sun/javafx/tk/FileChooserType;)Ljava/util/List;+36 javafx.graphics
j  javafx.stage.FileChooser.showOpenDialog(Ljavafx/stage/Window;)Ljava/io/File;+5 javafx.graphics
j  uk.co.caprica.vlcj.javafx.demo.VlcjJavaFxApplication.start(Ljavafx/stage/Stage;)V+173 uk.co.caprica.vlcj.javafx.demo
j  com.sun.javafx.application.LauncherImpl.lambda$launchApplication1$9(Ljava/util/concurrent/atomic/AtomicBoolean;Ljavafx/application/Application;)V+20 javafx.graphics
j  com.sun.javafx.application.LauncherImpl$$Lambda$139.run()V+8 javafx.graphics
j  com.sun.javafx.application.PlatformImpl.lambda$runAndWait$12(Ljava/lang/Runnable;Ljava/util/concurrent/CountDownLatch;)V+1 javafx.graphics
j  com.sun.javafx.application.PlatformImpl$$Lambda$121.run()V+8 javafx.graphics
j  com.sun.javafx.application.PlatformImpl.lambda$runLater$10(Ljava/lang/Runnable;)Ljava/lang/Void;+1 javafx.graphics
j  com.sun.javafx.application.PlatformImpl$$Lambda$124.run()Ljava/lang/Object;+4 javafx.graphics
j  java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base@13.0.2
j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+13 java.base@13.0.2
j  com.sun.javafx.application.PlatformImpl.lambda$runLater$11(Ljava/lang/Runnable;Ljava/security/AccessControlContext;)V+7 javafx.graphics
j  com.sun.javafx.application.PlatformImpl$$Lambda$123.run()V+8 javafx.graphics
j  com.sun.glass.ui.InvokeLaterDispatcher$Future.run()V+4 javafx.graphics
v  ~StubRoutines::call_stub
j  com.sun.glass.ui.gtk.GtkApplication._runLoop(Ljava/lang/Runnable;Z)V+0 javafx.graphics
j  com.sun.glass.ui.gtk.GtkApplication.lambda$runLoop$11(Ljava/lang/Runnable;Z)V+7 javafx.graphics
j  com.sun.glass.ui.gtk.GtkApplication$$Lambda$112.run()V+12 javafx.graphics
j  java.lang.Thread.run()V+11 java.base@13.0.2
v  ~StubRoutines::call_stub
caprica commented 4 years ago

This is the JavaFX source containing the native method that triggers the failure, _showFileChooser:

final class GtkCommonDialogs {

    private static native FileChooserResult _showFileChooser(
                                            long parent,
                                            String folder,
                                            String filename,
                                            String title,
                                            int type,
                                            boolean multipleMode,
                                            ExtensionFilter[] extensionFilters,
                                            int defaultFilterIndex);
caprica commented 4 years ago

The only other thing I vaguely recall about this issue is that something changed in the JVM launcher, at a certain point in time, years ago, this was not a problem.

caprica commented 4 years ago

Interestingly, it is possible to do something here with a Java Agent.

Agents run before the JVM starts the application.

If you create an agent and have its implementation do XInitThreads(), it seems that this issue might go away.

The downside here is that adopting this would mean:

  1. Remove the LibX initialisation code from vlcj-core
  2. For any non-JavaFX application, it would need to be started with a VM option to enable the agent, e.g. "-javaagent:/home/install/vlcj/vlcj-agent-1.0.0.jar"

This is clearly not optimal, but it is comparable with the need for a JavaFX application to set the VLCJ_INITX=no system property.

The JavaFX application then will not crash when opening e.g. a file dialog, and would not require VLCJ_INITX=no, but this seems very contrived as all we're doing is preventing XInitThreads from being invoked because we've moved the initialisation code to an agent that the JavaFX application won't run.

I suspect, but have not tested, that this might resolve #392

But anyway, this is far from a "good" solution IMO.

xinyingho commented 4 years ago

As JavaFX is a separate library since Java 11+, maybe you can initiate XInitThreads() only when you don't detect JavaFX? There is a system property called "javafx.version", which is probably set by the module javafx.base.

caprica commented 4 years ago

That's an interesting idea. It's not foolproof, since I guess somebody might bundle JavaFX but not actually use it.

caprica commented 4 years ago

Isn't there an additional complication here?

Start with -DVLCJ_INITX=no (to address this issue) and then play media without e.g. opening a file chooser dialog. Here the XInitThreads will NOT have been called so VLC will complain (and some plugins will fail to properly initialise meaning sub-optimal rendering performance).

It's a real mess.

caprica commented 4 years ago

On my last comment, it seems to not be a problem for VLC when using JavaFX and the callback video surface (which all JavaFX applications will be using).

caprica commented 4 years ago

Another idea is to defer the XInitThreads() call unless and until an AWT video surface is used.

Since a JavaFX application won't use an AWT video surface, this might be a good approach.

caprica commented 4 years ago

To (hopefully) resolve this, I factored the bespoke Linux initialisation to the video surface rather than the media player factory.

This means the bespoke initialisation code is now only executed the first time when creating an AWT video surface (which will never happen in a JavaFX application).

In early testing it seems to work for both JavaFX and non-JavaFX applications.

fc167d4841188ee91afa41305bd6a530c41daef4

This could potentially be cherry-picked for vlcj-4.x.

caprica commented 4 years ago

This change means the LinuxNativeInit class can be removed from vlcj-natives.

caprica commented 4 years ago

Back-ported to vlcj-4.4.2 (Java 1.6+) and vlcj-4.5.2 (Java 11+/modules).

xinyingho commented 4 years ago

Wahoo, you fixed this issue properly and you even provided a 4.x version fully compatible with Java modules! You really went fast with fully supporting Java 11. Good job!

caprica commented 4 years ago

@xinyingho I'm so awesome :-D

You can thank Covid19 for all the free time I have.