Closed: felixfung closed this 1 year ago
ok @felixfung, I will be running this for a while, some days / weeks. Hopefully all is fine. If any segfault occurs, I will try to spot it by following my dmesg output.
Like this!
sudo dmesg -T -p --follow-new | grep -i skippy-xd
And then leave the terminal open for some days and see if any segfault messages get printed (for example):
[id:~/.dev/skippy-xd] master(+6/-3)+* 130 ± sudo dmesg -T -p -w | grep -i skippy-xd
[Fri Mar 31 10:56:06 2023] skippy-xd[2035283]: segfault at 564700000010 ip 0000564796c447c8 sp 00007ffffbf5aac8 error 4 in skippy-xd[564796c3c000+15000] likely on CPU 7 (core 1, socket 0)
[Fri Mar 31 19:10:36 2023] skippy-xd[3189447]: segfault at 565100000010 ip 000056516b69a808 sp 00007ffc89ad7788 error 4 in skippy-xd[56516b691000+16000] likely on CPU 2 (core 2, socket 0)
I got some segfaults... but I am not sure if it is this changeset or if they were always there.
[Fri Mar 31 20:13:10 2023] skippy-xd[3316348]: segfault at 8 ip 000055c6ee352052 sp 00007ffdcf512f60 error 4 in skippy-xd[55c6ee34d000+16000] likely on CPU 3 (core 3, socket 0)
[Fri Mar 31 20:13:10 2023] skippy-xd[3348105]: segfault at 8 ip 000055be08585052 sp 00007fff09f6f230 error 4 in skippy-xd[55be08580000+16000] likely on CPU 10 (core 4, socket 0)
[Fri Mar 31 20:13:10 2023] skippy-xd[3348120]: segfault at 8 ip 000055f951bc3052 sp 00007ffd1b9ced30 error 4 in skippy-xd[55f951bbe000+16000] likely on CPU 6 (core 0, socket 0)
[Fri Mar 31 20:13:11 2023] skippy-xd[3348134]: segfault at 8 ip 000055573839b052 sp 00007ffdb0fa60b0 error 4 in skippy-xd[555738396000+16000] likely on CPU 10 (core 4, socket 0)
[Fri Mar 31 20:13:11 2023] skippy-xd[3348150]: segfault at 8 ip 0000561956d67052 sp 00007ffc94f97120 error 4 in skippy-xd[561956d62000+16000] likely on CPU 1 (core 1, socket 0)
[Fri Mar 31 20:13:12 2023] skippy-xd[3348164]: segfault at 8 ip 00005594af2c9052 sp 00007ffc8caad700 error 4 in skippy-xd[5594af2c4000+16000] likely on CPU 7 (core 1, socket 0)
:(
Any chance you are able to isolate the segfault(s)?
I would not expect too much... but first I need to find where the system put those core files on my disk (because there are several places they can be). Then I can look at the call stack. It is late here today; I will have to get back to you another time.
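For reference, on a systemd-based distro something like coredumpctl is probably the quickest way to locate those core files and peek at the call stack (a sketch, assuming systemd-coredump is handling cores on this machine):

# Where does the kernel currently send core dumps?
cat /proc/sys/kernel/core_pattern

# List recorded skippy-xd crashes (requires systemd-coredump)
coredumpctl list skippy-xd

# Open the most recent matching core in gdb, then print the stack with "bt"
coredumpctl gdb skippy-xd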
But if I test over a longer timeframe (1-2 weeks), then I can revert and run another time window, to compare whether there is any significant difference in the average number of segfaults with and without this changeset, and whether the same type of segfaults occurs. That should give some answer, even if we cannot actually track them down and fix them.
Other matter... please also let me know: are you using any tools like valgrind, or static analysis / linting tools? (Those could at least surface 'new' warnings to look for when adding extra code.)
Yes I used valgrind to find the memory leaks.
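For anyone following along, a typical leak-hunting invocation would look roughly like this (a sketch; adjust the launch flags to however you normally start the daemon):

# Run skippy-xd under valgrind with full leak reporting
valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes ./skippy-xd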
You can compile with debug flags and launch the daemon in gdb; that should give you the stack of the segfaults?
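Roughly along these lines (a sketch; it assumes the Makefile honours CFLAGS, so adjust to the project's actual build setup):

# Rebuild with debug symbols and no optimisation
make clean
make CFLAGS="-g -O0"

# Launch the daemon under gdb; after a crash, "backtrace" prints the call stack
gdb ./skippy-xd
(gdb) run
(gdb) backtrace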
Yes, but I am currently out of time for that (and it would require reproducing the crash again).
I am going to merge this and if we get regression we'll debug and revert.
ok. I have not seen any additional leaks, and I suspect it is more likely the case when spamming keys, which was a scenario previously highlighted. That is pretty much expected behaviour with the way xorg / skippy works and so on. So no worries.
Please test just in case this causes regressions/segfaults.
There is still one leak of 24 bytes that I have not bothered fixing:
==24987== 192 (24 direct, 168 indirect) bytes in 1 blocks are definitely lost in loss record 65 of 89
==24987==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24987==    by 0x116A10: dlist_add (dlist.c:47)
==24987==    by 0x116A10: dlist_find_all (dlist.c:456)
==24987==    by 0x10DB52: daemon_count_clients.constprop.0 (skippy.c:350)
==24987==    by 0x112968: mainloop (skippy.c:905)
==24987==    by 0x10D38C: main (skippy.c:1824)