Closed DarthBo closed 7 months ago
I think there is a possible race condition, as the destructor for IceTransport first disarms the timer and then destroys the agent. However, the agent could change state between those two steps and rearm the timer, resulting in IceTransport::TimeoutCallback being called later for a destroyed object.
Could you please check whether https://github.com/paullouisageneau/libdatachannel/pull/1103 fixes the issue?
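To illustrate the ordering problem described above, here is a minimal single-threaded sketch. It does not use libnice or GLib, and it is not the actual IceTransport code or necessarily what the linked PR does. `FakeTimer`, `FakeAgent`, `Transport`, and `timerArmedAfterLateStateChange` are hypothetical names introduced only for this example. It replays the interleaving "disarm timer, late state change, destroy agent" and shows one possible guard: a stopped flag, checked under a mutex, that refuses to rearm once teardown has begun.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical stand-in for a timeout source owned by an event loop.
struct FakeTimer {
    bool armed = false;
};

// Hypothetical stand-in for the ICE agent; it notifies its owner on state changes.
struct FakeAgent {
    std::function<void()> onStateChange;
    void changeState() {
        if (onStateChange)
            onStateChange();
    }
};

// Hypothetical transport; "guarded" selects whether re-arming is refused after teardown starts.
class Transport {
public:
    Transport(FakeTimer &timer, FakeAgent &agent, bool guarded)
        : mTimer(timer), mAgent(agent), mGuarded(guarded) {
        // A state change (re)arms the timeout.
        mAgent.onStateChange = [this]() { armTimer(); };
    }

    // Teardown step 1: mark the transport as stopped and disarm the timer.
    void disarmTimer() {
        std::lock_guard<std::mutex> lock(mMutex);
        mStopped = true;
        mTimer.armed = false;
    }

    // Teardown step 2: detach from the agent (models destroying it).
    void destroyAgent() { mAgent.onStateChange = nullptr; }

private:
    void armTimer() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mGuarded && mStopped)
            return; // teardown has begun, so do not rearm
        mTimer.armed = true;
    }

    FakeTimer &mTimer;
    FakeAgent &mAgent;
    bool mGuarded;
    std::mutex mMutex;
    bool mStopped = false;
};

// Replays the problematic interleaving and reports whether the timer
// is still armed after the transport has gone away.
bool timerArmedAfterLateStateChange(bool guarded) {
    FakeTimer timer;
    FakeAgent agent;
    {
        Transport transport(timer, agent, guarded);
        transport.disarmTimer();
        agent.changeState(); // state change lands between the two teardown steps
        transport.destroyAgent();
    }
    return timer.armed;
}
```

Without the guard, the timer is left armed after the transport is gone, which corresponds to the callback firing on a destroyed object. With the guard, it stays disarmed. In a real multi-threaded setting the agent callback may also be running concurrently with teardown, which this sketch does not model.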
Oof, I definitely never would have thought of that. Thanks for looking into it!
I'll pull the patch and see if we can reproduce it, will report back :)
Any news about this?
Oh sorry, I didn't realise it has already been 3 weeks :scream:
I haven't seen the issue since applying the patch, but I never really bumped into it before myself either. I was hoping to push a build to our beta testers (who have bumped into it) last week, but unrelated issues delayed that release.
It might be another couple of weeks before I can safely say it's fixed :sweat_smile:
I haven't seen a single libnice/datachannel related crash so far, so it certainly looks good :)
Great, thank you for testing!
I've seen some occasional crashes in the field that usually seem to come from IceTransport::TimeoutCallback.
I've had a look at the implementation of the libnice backend, and I'm assuming it's a lifetime issue with the timer stuff, but I'm afraid my lack of experience with glib makes it hard for me to figure out what's going wrong. I also don't have a way to reproduce it yet, so I'm kind of hoping you'll take one look at the relevant code and say "aha, whoops" :sweat_smile:
libdatachannel version: 0.19.5
libdatachannel build flags: NO_WEBSOCKET, NO_MEDIA, USE_NICE
libnice version: 0.1.21
libglib version: 2.72.3
last log prints:
stacktrace: