Closed. corvus1 closed this 2 years ago
I can reproduce this crash as well. Gdb is blaming https://github.com/andrew-bibb/cmst/blob/master/apps/cmstapp/code/control_box/controlbox.cpp#L2264
I'll try to look into this over the weekend. Something has changed, likely exposing a programming error that was there all along.
Can either of you tell me how your system is configured? I'm not able to duplicate the crash, either by powering ethernet off and then on or by unplugging the cable and then plugging it back in. The "no such file or directory" message is also confusing because at this point in the program it is not reading any file or directory. It is processing a QMap.
One other question, is this only occurring when running under gdb or did you run gdb because you were getting a crash? In the meantime I'll keep poking around looking for something that could be causing this.
> Can either of you tell me how your system is configured?
On my system I can very reliably crash cmst by either pulling the ethernet cable and then plugging it back in or by bringing the interface down and back up: ifconfig eth0 down/up
It crashes cmst when I bring it up every time. @benkohler told me that for him it's less reliable, and sometimes takes a couple of cycles up/down.
> The "no such file or directory" message is also confusing because at this point in the program it is not reading any file or directory. It is processing a QMap.
You can simply ignore that message, I've been told that it's just GDB trying to open the source file, to show the fragment of code that it blames.
> One other question, is this only occurring when running under gdb or did you run gdb because you were getting a crash? In the meantime I'll keep poking around looking for something that could be causing this.
The crash occurs without gdb. GDB was introduced because of the crashes. In fact I only discovered it because my interface has a habit of momentarily going down and back up, probably because of some quirk of the hardware or the driver.
Any additional info about my setup you might need, I'll do my best to provide.
Thank you for the note about file or directory, that helps a great deal.
I just brought the interface up and down using ip and still can't reproduce the issue. Now that I know there are no external files involved it narrows down my search. Since this is a segfault, it sounds like the QMap is somehow being used uninitialized. That is where I'm going to look next.
The only other thing I would mention is that the issue seems to only affect ethernet interfaces and not wireless.
This is hard without being able to reproduce the error, but I may have located it. My hypothesis would explain why sometimes you get it and sometimes you don't. According to the gdb trace this is triggered after the dbsPropertyChanged() function is called. This function is only called when Connman manager issues a property changed (basically online, idle, ready or offline) signal. The segfault is triggered (line 2264) when we try to read the services_list. I think this may be a race where the services list has not been updated before the properties changed signal is sent. A change in properties requires some change in the services list and it appears that is not happening when the crash occurs. That is where I am going to start looking anyway.
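A minimal sketch of the hazard described above, in plain C++ rather than Qt. This is not the cmst code: `Service`, `firstOfType` and the field names are hypothetical and exist only for illustration. The point is that a reader which assumes the list already holds a matching entry is unsafe when the list is stale or empty, while a lookup that can report "not found" is not:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one entry in a services list.
struct Service {
    std::string type;   // e.g. "ethernet" or "wifi"
    std::string state;  // e.g. "online", "ready", "idle"
};

// Return a pointer to the first service of the given type, or nullptr if
// the list holds no such entry (for example because it has not been
// refreshed yet). Callers must check for nullptr before dereferencing.
const Service* firstOfType(const std::vector<Service>& services,
                           const std::string& type) {
    for (const Service& s : services) {
        if (s.type == type) {
            return &s;
        }
    }
    return nullptr;
}
```

A caller would test the returned pointer and skip the display update when it is null, instead of indexing into the list unconditionally.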
Well, I can only promise you that when you need something tested I will be all over it. :+1:
Interestingly, it doesn't seem to crash when wifi goes up/down, even though a bit further in the same function (line 2272) it does the same for wifi interfaces.
If I am right on this, the fix was very simple. I just uploaded my attempt at a fix. Unfortunately I seem to have added a bunch of extraneous files, so I need to find a way to clean them up here. Git is not my strong suit.
Basically I changed it so I don't make a call to updateDisplayWidgets() after a property change. As I mentioned above, a property change must trigger a change in the services list, so I just do the updateDisplayWidgets() after the services list changes. That way the services list should be up to date and hopefully won't crash.
I've tested here with wifi and ethernet (completely removing my VPN configurations to keep it simple) and it seems to work, but on the other hand it was working here before. These "sometimes they crash (ethernet) and sometimes they don't (wifi)" problems are miserable to track down.
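The reordering can be sketched roughly as follows. This is a simplified model in plain C++, not the actual commit: only updateDisplayWidgets() is a name taken from this thread, while `ControlBoxModel`, `onPropertyChanged`, `onServicesChanged` and the members are assumptions made up for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the signal handling; names are illustrative only.
struct ControlBoxModel {
    std::vector<std::string> services;  // stand-in for the services list
    std::string managerState;           // last state reported by the manager
    int refreshCount = 0;               // how many times the display was rebuilt

    // Rebuild the display from the current services list.
    void updateDisplayWidgets() { ++refreshCount; }

    // Property changed: record the new state but leave the display alone,
    // because the services list may not have caught up yet.
    void onPropertyChanged(const std::string& newState) {
        managerState = newState;
    }

    // Services changed: store the fresh list first, then refresh the display.
    void onServicesChanged(const std::vector<std::string>& updated) {
        services = updated;
        updateDisplayWidgets();
    }
};
```

With this ordering the display is only ever rebuilt from a list that has just been replaced, whichever of the two signals happens to arrive first.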
I just rebuilt cmst from git, and I am happy to report that it seems to have fixed it. It'd be cool if @benkohler also tested it to be extra sure, but really it was crashing every time, and now it doesn't. So great job, thanks!
Great news. Wanted to mention that I never would have found the issue had you not posted the gdb printout in the first post. That was absolutely critical information to have for locating the problem.
I'll keep this issue open for a couple of weeks but think I'll do a formal release maybe this weekend. This is fixed (or seems to be) plus a lot of translations have come in. People working on those might like to see them exist someplace other than GitHub.
> Great news. Wanted to mention that I never would have found the issue had you not posted the gdb printout in the first post. That was absolutely critical information to have for locating the problem.
Yeah, I figured it's hard enough as it is, and without backtrace there's just not much that can be done, aside from divination.
So, long story short. What I did is I ran cmst under gdb and brought eth0 down and then back up. Here's what I collected:
I apologize for that. It means "no such file or directory".