meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0

SIGFPE, Arithmetic exception in g_hash_table_remove [1.x] latest master #3089

Closed zevarito closed 2 years ago

zevarito commented 2 years ago

What version of Janus is this happening on? 1.1.1

Have you tested a more recent version of Janus too? Yes

Was this working before? N/A

Is there a gdb or libasan trace of the issue?

Backtrace

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/bin/janus -F /usr/local/etc/janus'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0  0x00007f2a65a176a8 in g_hash_table_remove () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
[Current thread is 1 (Thread 0x7f2a6366b700 (LWP 9))]
(gdb) bt
#0  0x00007f2a65a176a8 in g_hash_table_remove () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#1  0x000055f2925290ad in janus_sctp_association_destroy (sctp=0x7f2a582a6600) at sctp.c:365
#2  0x000055f2924dd8f9 in janus_dtls_srtp_destroy (dtls=0x7f29dc08b170) at dtls.c:994
#3  0x000055f2924f1763 in janus_ice_peerconnection_destroy (pc=0x7f29dc08ae60) at ice.c:1700
#4  janus_ice_peerconnection_destroy (pc=0x7f29dc08ae60) at ice.c:1685
#5  0x000055f2924f1969 in janus_ice_webrtc_free (handle=0x7f2a28007900) at ice.c:1622
#6  0x000055f2924fde7b in janus_ice_outgoing_traffic_handle (handle=0x7f2a28007900, pkt=0x55f292575440 ) at ice.c:4460
#7  0x000055f292500fe1 in janus_ice_outgoing_traffic_dispatch (source=0x7f2a28007aa0, callback=, user_data=) at ice.c:495
#8  0x00007f2a65a29e6b in g_main_context_dispatch () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#9  0x00007f2a65a2a118 in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#10 0x00007f2a65a2a40b in g_main_loop_run () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x000055f2924e18cb in janus_ice_static_event_loop_thread (data=0x55f293069b70) at ice.c:200
#12 0x00007f2a65a530bd in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#13 0x00007f2a6547dea7 in start_thread (arg=) at pthread_create.c:477
#14 0x00007f2a6539daef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
```

Additional context

Source Code sctp.c

```c
void janus_sctp_association_destroy(janus_sctp_association *sctp) {
	if(sctp == NULL || !g_atomic_int_compare_and_exchange(&sctp->destroyed, 0, 1))
		return;
	if(sctp->map_id != 0) {
		usrsctp_deregister_address(GUINT_TO_POINTER(sctp->map_id));
		janus_mutex_lock(&sctp_mutex);
		g_hash_table_remove(sctp_ids, GUINT_TO_POINTER(sctp->map_id));
		janus_mutex_unlock(&sctp_mutex);
	}
	if(sctp->sock != NULL) {
		usrsctp_shutdown(sctp->sock, SHUT_RDWR);
		usrsctp_close(sctp->sock);
	}
	janus_refcount_decrease(&sctp->ref);
}
```

Hi! I just noticed that a couple of servers crashed with the above exception after running for about 24 hours. I haven't noticed anything strange yet, but I'm still looking into the issue.

atoppi commented 2 years ago

There's not much data to work on unfortunately.

Some things that could help:

./configure CFLAGS="$CFLAGS" LDFLAGS="$CFLAGS"


Expect a minor performance hit for that instance.
Chances are that the SIGFPE is just a red herring and the real issue is inconsistent / freed memory access or invalid input data. 
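
To illustrate why the SIGFPE may only be a symptom, here is a minimal sketch. It assumes the hash table derives a bucket index as the hash modulo an internal size field, which is a common design; if that field is read from freed or zeroed memory, the modulo becomes an integer division by zero, which on x86 raises SIGFPE. The names `bucket_index_traps` and `on_sigfpe` are hypothetical and are not part of Janus or GLib.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used to recover after the signal handler fires. */
static sigjmp_buf fpe_recover;

/* Hypothetical handler: jump back to the point saved by sigsetjmp. */
static void on_sigfpe(int signo) {
    (void)signo;
    siglongjmp(fpe_recover, 1);
}

/*
 * Hypothetical helper: computes hash % mod the way a hash table might
 * pick a bucket, and reports whether that raised SIGFPE.
 * Returns 1 if the modulo trapped, 0 if it completed normally.
 * Note: integer division by zero is undefined behavior in C; the trap
 * is what x86 hardware does, other architectures may behave differently.
 */
static int bucket_index_traps(unsigned int hash, unsigned int mod) {
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGFPE, &sa, &old);

    /* volatile keeps the compiler from folding or dropping the modulo. */
    volatile unsigned int h = hash, m = mod;
    volatile unsigned int index = 0;
    volatile int trapped = 0;

    if (sigsetjmp(fpe_recover, 1) == 0) {
        index = h % m;   /* traps on x86 when m == 0 */
    } else {
        trapped = 1;     /* reached via siglongjmp from the handler */
    }
    (void)index;

    /* Restore whatever SIGFPE disposition was installed before. */
    sigaction(SIGFPE, &old, NULL);
    return trapped;
}
```

Calling `bucket_index_traps(12345u, 7u)` completes normally, while `bucket_index_traps(12345u, 0u)` is expected to trap on x86. In a crash like the one above, a zero size field would more plausibly come from a table that was already freed or overwritten than from the arithmetic itself, which is why a libasan build is the more informative next step.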

zevarito commented 2 years ago

Thanks @atoppi for the guidance, I'll try those steps and report back. Before submitting the issue I tried to load the glib symbols the way you describe, without success, but I still have the core, so I can test this again. The issue has not recurred so far. I'll let you know as soon as I find something.

lminiero commented 2 years ago

@zevarito any update on this?

zevarito commented 2 years ago

@lminiero I haven't seen the crash since then, and there was only one server that had the issue. I'll close this issue and re-open if anything new appears, thank you!