gstreamer-java / gst1-java-core

Java bindings for GStreamer 1.x
GNU Lesser General Public License v3.0
188 stars 72 forks source link

Callback JVM crash #204

Closed honcharenko-roman closed 3 years ago

honcharenko-roman commented 3 years ago

Hi, I am trying to add a format-location callback for my splitmixsink element, but after 30-60 seconds of work, JVM is crashing with different errors like

- double free or corruption (fasttop)
- free(): double free detected in tcache 2
- free(): invalid pointer
- corrupted size vs. prev_size
- corrupted size vs. prev_size in fastbins
- free(): invalid next size (fast)
- free(): double free detected in tcache 2
- malloc_consolidate(): invalid chunk size
- malloc(): smallbin double linked list corrupted
- double free or corruption (out)
- free(): invalid next size (fast)

Working at 0.9.4 I had and it worked well.

splitmuxsink.connect("format-location", new Closure() {
            public String invoke() {
                return String.format(PATH, camera.getCameraIndex())
                        + System.currentTimeMillis() / 1000L
                        + ".mp4";
            }
        });

According to gstreamer documentation format-location have next parameters: Element, id, pointer.

I had the same callback in python works with:

def format_location_callback(splitmux, fragment_id, camera_index):
    return f'{CAMERAS_DIRECTORY}/camera{camera_index}/videos/{int(datetime.timestamp(datetime.now()))}.mp4'

src = pipeline.get_by_name('splitmuxsink0')
src.connect("format-location", format_location_callback, camera.camera_index)

But 1.2 provided a new way to work with callback signals. I came to the interface

interface FormatLocationCallback extends GstAPI.GstCallback {
        String callback(Element splitmux, int fragmentId);
    }

And implementing it in connect:

FormatLocationCallback formatLocationCallback = (splitmux, fragmentId) -> {
            return String.format(PATH, camera.getCameraIndex())
                    + System.currentTimeMillis() / 1000L
                    + ".mp4";
        };

        splitmuxsink.connect("format-location",
                FormatLocationCallback.class,
                formatLocationCallback,
                formatLocationCallback
        );

No matter with Pointer as the last parameter, or without, JVM crashes. I've tried to define an abstract class instead of an interface, but then the program hangs at the connecting stage. Also tried with a different number of parameters or totally without them - the result is the same. While not connecting the callback, the pipeline works well.

neilcsmith-net commented 3 years ago

That API isn't new in 1.0+, just the Closure version was removed. The issue sounds like your callback is getting garbage collected. Without seeing more of the code I'm not sure why, because connect() should keep a reference. Although if you look at how these are implemented in eg. Element you'll notice that the same listener isn't passed to both arguments. You shouldn't have to expose anything in lowlevel (eg. extend GstCallback) in a public API.

Are you keeping a reference to the splitmuxsink? Or where does the reference for that come from?

We might consider a generic signal handler for external use, somewhat replacing what Closure could be used for. The remaining connect() methods are currently undocumented as they're partially an implementation detail, but necessary to achieve a few things. They would ideally be protected not public.

honcharenko-roman commented 3 years ago

I forgot to mention the most important part - I am using this approach in multithreading. But trying it with one thread crashing JVM but a bit later (around 1:30-2:00 minutes). I have a string, I am parsing to pipe. I had collision troubles with another element, that's why I am indexing element

+ " name=splitmuxsink" + camera.getCameraIndex()
+ " muxer=\"mp4mux name=muxxxer reserved-moov-update-period=1000\" max-size-time=10000000000"

Then I am getting element by parse and connecting callback to it. Nothing more. Element splitmuxsink = pipe.getElementByName("splitmuxsink" + camera.getCameraIndex());

If replace format-callback with a template in pipeline like location="video%05d", works well in multithreading.

After setting callback, I am starting a pipeline and waiting for the USB device to be disconnected.

neilcsmith-net commented 3 years ago

Try testing with some regular System.gc() calls, which should prove whether this is being triggered by garbage collection or something else. There shouldn't be a problem with multithreading, although I would question what you're multithreading and why you think it's necessary.

honcharenko-roman commented 3 years ago

I have to run multiple instances of GStreamer pipeline for different cameras, I want simultaneously record from different cameras at the same time.

Using System.gc() once in a second while recording, I am getting the same message. Sometimes JVM crashes with the log like this: gist

Seems like duration before the crash is not connected to the System.gc() calls.

neilcsmith-net commented 3 years ago

But you don't need multiple Java threads to run multiple GStreamer pipelines. They have their own threads. Ideally just use Gst.invokeLater to set up and run the pipelines.

If you can share a fully working simple example that reliably replicates the issue we can take a look.

honcharenko-roman commented 3 years ago

I'm using it at RPI3/RPI4, recording with omxh264enc. But trying on host PC with x264enc gave the same JVM crash. In the separate thread, I'm reading "gst-device-monitor -f" and on new device connected, I am running new separate thread to make connected camera record and stream. I tried different ways of creating separate threads, Gst.invokeLater(), and Gst.getExecutor() also, but the problem remains. I believe it has something to do with my program architecture because running hardcoded instead of monitoring cameras increased time cameras were working. So anyway, I rollbacked for 0.9.4 and Closures, and it works flawlessly. Because of the lack of time to investigate further, I will keep to this version for now. Thanks a lot for your help and fast responses!

neilcsmith-net commented 3 years ago

Fair enough! Without code to reproduce, it's hard to know what's wrong. But 0.9.4 has known memory leaks - I believe you might actually be benefiting from that situation because somewhere the binding is keeping a reference alive that you should be. I'll close this.