Closed phsultan closed 10 years ago
Hi Philippe,
have you done a make clean when changing the nodatachans flag? If you didn't, you actually used the same code.
Not sure about what's causing the crash, since it seems to be failing when accessing stream->handle, right after it checked that stream is actually not null. I'm wondering whether there may be some stack corruption there, even if that shouldn't be the case.
I'm unable to replicate the issue, so the cause may be different. Is there any additional information you can get out of gdb, e.g., whether stream is actually a valid janus_ice_stream instance and not some junk pointer?
Thanks for your kind words, BTW :-)
Yep, I had run make clean before reinstalling. I'm indeed facing a stack corruption here, as shown by gdb:
#0 0x000000000041a74f in janus_ice_cb_nice_recv (agent=0x7fe3d0036620, stream_id=2, component_id=1, len=88, buf=0x7fe39a1ebbd0 "\001\001", ice=0x7fe3d007dde0) at ice.c:536
component = 0x7fe3d007dde0
__FUNCTION__ = "janus_ice_cb_nice_recv"
stream = 0x2020200a2c343332
handle = 0x7fe3d0036620
....
The memory pointer for stream cannot be accessed:
(gdb) x/x 0x2020200a2c343332
0x2020200a2c343332: Cannot access memory at address 0x2020200a2c343332
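One thing worth noting about that value: read lowest byte first (the in-memory order on little-endian x86-64), 0x2020200a2c343332 consists entirely of ASCII characters plus a newline, which is consistent with the pointer having been overwritten by string data rather than by another address. The sketch below shows one way to check this; `ptr_bytes_to_text` is a hypothetical helper written for illustration and is not part of Janus, libnice or gdb.

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical helper (not part of Janus): write the 8 bytes of a 64-bit
 * value into out[9], lowest byte first, replacing every non-printable
 * byte with '.'. Returns the number of printable bytes found. */
int ptr_bytes_to_text(uint64_t value, char out[9]) {
    int printable = 0;
    for (int i = 0; i < 8; i++) {
        unsigned char b = (unsigned char)((value >> (8 * i)) & 0xff);
        if (isprint(b)) {
            out[i] = (char)b;
            printable++;
        } else {
            out[i] = '.';
        }
    }
    out[8] = '\0';
    return printable;
}
```

For the corrupted value this yields the text `234,.` followed by three spaces (the `.` stands for the newline byte 0x0a), while a typical heap address such as 0x7fe3d007d4a0 contains no printable bytes in the default C locale.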
Further up the stack (in component_io_cb), stream has a valid address though, so its value is likely modified somewhere along the path. Here is what stream looks like in component_io_cb:
(gdb) print {janus_ice_stream}0x7fe3d007d4a0
$2 = {handle = 0x0, stream_id = 2, cdone = 0, audio_ssrc = 1, video_ssrc = 0, audio_ssrc_peer = 2617254752, video_ssrc_peer = 32739, payload_type = 0,
dtls_role = JANUS_DTLS_ROLE_SERVER, ruser = 0x6246396b <Address 0x6246396b out of bounds>, rpass = 0x0, components = 0x0, rtp_component = 0x0, rtcp_component = 0x0, noerrorlog = 0,
mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>,
__align = 0}}
(gdb)
Since those functions are exported by libnice and glib, I'd like my versions to match yours. Can you tell me which versions of libnice and glib you are running? I'm running libnice-0.1.7 and glib-2.36.4.
Thanks!
Philippe
libnice is 0.1.4 and glib2 is 2.34.2: my Fedora 18 doesn't have the latest stuff :-) What OS are you using? I'll try to replicate the issue with a VM.
I'm on CentOS 6.5.
Looking at the dump again, it may be some kind of race condition. In fact, I see that the stream id that is passed is 2, which is normally associated with the video stream. When Bundle is involved, though, both audio and video share the same ICE stream (1), and the second stream is removed. So this may be a scenario where the second stream is discarded too late, while already in use by libnice.
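To illustrate the kind of guard that would avoid such a race, here is a minimal sketch, not the actual Janus code: all `demo_*` names and types are hypothetical, and a fixed-size table protected by a pthread mutex is an assumption made for brevity. The receive path looks the stream up under the same lock that the removal path takes, and bails out if the stream is already gone instead of dereferencing it.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_STREAMS 4

/* Hypothetical stand-ins for the real structures. */
typedef struct demo_stream {
    unsigned int stream_id;
    void *handle;
} demo_stream;

typedef struct demo_handle {
    pthread_mutex_t mutex;
    demo_stream *streams[DEMO_MAX_STREAMS]; /* NULL means "removed" */
} demo_handle;

void demo_handle_init(demo_handle *h) {
    pthread_mutex_init(&h->mutex, NULL);
    for (size_t i = 0; i < DEMO_MAX_STREAMS; i++)
        h->streams[i] = NULL;
}

/* Register a stream under its 1-based id; returns -1 on a bad id. */
int demo_add_stream(demo_handle *h, demo_stream *s) {
    if (s->stream_id == 0 || s->stream_id > DEMO_MAX_STREAMS)
        return -1;
    pthread_mutex_lock(&h->mutex);
    h->streams[s->stream_id - 1] = s;
    pthread_mutex_unlock(&h->mutex);
    return 0;
}

/* Drop a stream, e.g. the second one once bundling is in effect. */
void demo_remove_stream(demo_handle *h, unsigned int stream_id) {
    if (stream_id == 0 || stream_id > DEMO_MAX_STREAMS)
        return;
    pthread_mutex_lock(&h->mutex);
    h->streams[stream_id - 1] = NULL;
    pthread_mutex_unlock(&h->mutex);
}

/* Receive callback: returns 0 if the stream is still usable,
 * -1 if it was removed (or never existed), so the packet is dropped. */
int demo_on_recv(demo_handle *h, unsigned int stream_id) {
    int result = -1;
    if (stream_id == 0 || stream_id > DEMO_MAX_STREAMS)
        return -1;
    pthread_mutex_lock(&h->mutex);
    demo_stream *s = h->streams[stream_id - 1];
    if (s != NULL && s->handle != NULL)
        result = 0; /* safe to use s while the lock is held */
    pthread_mutex_unlock(&h->mutex);
    return result;
}
```

With this shape, a packet arriving for stream 2 after it has been removed is simply dropped rather than processed through a stale pointer.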
Are you using Chrome or Firefox for your tests?
I'm using Chrome. Switched to Firefox 30.0 and successfully connected 3 people to the same room without any crash, though my CPU went up to 199% :-D
Were you testing both server and clients on the same machine? The clients' CPU usage can really grow as soon as you start involving more flows at the same time, especially when you have 3-4 clients all handling 3-4 streams! The server side itself shouldn't be affected much with just 3-4 users in the MCU, at least not according to the measurements we did some weeks ago.
If everything went fine with Firefox, I guess the issue is indeed a race condition somewhere in the process of handling the bundle switch. I'm already looking into it and hope to have something ready soon.
I actually meant the CPU on the server that runs janus, which is a cloud instance. But that's a separate matter, I believe. I did not check the CPU on my laptop, from which I indeed connected 3 times to the server.
Thanks a lot, Lorenzo!
I just pushed a commit that should better handle the case when the gateway is offering (which is what happens when a second participant joins the room and you attach to its feed). Let me know if anything improves.
Unfortunately no, it did not help.
Can you launch Janus with a higher debugging level (-d 5) and pass me the log up to the point where it crashes? I don't know if files can be attached on issues so if not, since I'd rather avoid having long dumps of text here, please send me the log privately (lorenzo[at]meetecho.com).
Closing as this should have been fixed, feel free to reopen otherwise.
Hi, I'm getting a segfault running the video MCU test example. Here is my testing environment:
nodatachans install option
My NAT configuration section in janus.cfg:
[nat]
public_ip = 1.2.3.4
stun_server = stun.voip.eutelia.it
stun_port = 3478
The echo test works perfectly, but the video MCU test makes janus crash during ICE negotiation, right after the second participant enters the room (line 536 in ice.c). Here is the backtrace:
Let me know if you need more information, and congrats on the great work, Lorenzo!
Philippe