Closed ashaw596 closed 7 years ago
@justbuchanan Have you seen this?
Hmm, I haven't seen this before but I haven't run many plays lately.
I just tried running Basic122 and it didn't crash after a few minutes of messing with it. Is there anything in particular you can do to make it crash?
No idea. It crashes at random, roughly once every couple of runs, after running for about 5 minutes. On my Mac at least.
— You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub https://github.com/RoboJackets/robocup-software/issues/674#issuecomment-220746787
Steps to replicate:
I think the segfault is just from PyQt trying to access memory once we run out. It's probably our fault, not PyQt's.
This is clearly the best solution; however, plugging the leaks would probably be a good idea too.
I still don't think this is a memory leak (this is my top when I got a segfault):
I have 16 GB of memory though (and soccer is using about 5%, or about a gig). There could be issues with soccer (or PyQt) not handling a malloc correctly. Were you running other memory-intensive programs while you were doing this? (Could you show whether soccer itself was running out of memory, or other programs plus soccer?)
I don't think this is related to running out of memory though, since I have lots of memory left. (It could be two separate issues though...)
Thread 8 "QThread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbffff700 (LWP 24438)]
0x0000000000000021 in ?? ()
(gdb) bt
#0 0x0000000000000021 in ?? ()
#1 0x00007fffd9f6fd11 in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#2 0x00007fffd9b8aa28 in ?? () from /usr/lib/python3/dist-packages/sip.cpython-35m-x86_64-linux-gnu.so
#3 0x00007fffd9b95deb in sip_api_convert_from_type () from /usr/lib/python3/dist-packages/sip.cpython-35m-x86_64-linux-gnu.so
#4 0x00007fffd9e7053f in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#5 0x00007fffd9e706c7 in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#6 0x00007fffd9e70702 in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#7 0x00007fffd9e70f52 in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#8 0x00007ffff423b319 in PyCFunction_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#9 0x00007ffff4343644 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#10 0x00007ffff43435b8 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#11 0x00007ffff43435b8 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#12 0x00007ffff43435b8 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#13 0x00007ffff44039a4 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#14 0x00007ffff43418fe in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#15 0x00007ffff43435b8 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#16 0x00007ffff44039a4 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#17 0x00007ffff43418fe in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#18 0x00007ffff44039a4 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#19 0x00007ffff43418fe in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#20 0x00007ffff43435b8 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#21 0x00007ffff44039a4 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#22 0x00007ffff4403a83 in PyEval_EvalCodeEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#23 0x00007ffff433b38b in PyEval_EvalCode () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#24 0x00007ffff43380af in PyRun_StringFlags () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#25 0x00007ffff77dd377 in Gameplay::GameplayModule::run (this=0x6ecf30) at ../soccer/gameplay/GameplayModule.cpp:397
#26 0x00007ffff7978944 in Processor::run (this=0x6a16a0) at ../soccer/Processor.cpp:419
#27 0x00007ffff6007d78 in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#28 0x00007ffff58d3444 in start_thread (arg=0x7fffbffff700) at pthread_create.c:333
#29 0x00007ffff364b20d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Also, did you run plays while doing this? I was running basic122.
I was running basic122 as well, but yeah, listed by process, soccer was using over 80% of my RAM.
Yeah, I'm seeing it now. I think there is a memory leak in soccer (unfortunately), but I'm not sure if it's related to this PyQt segfault...
Yeah... Our logging system eats RAM continuously too. Probably a couple of memory leaks too.
On Wed, Sep 7, 2016 at 1:38 PM, Jay Kamat notifications@github.com wrote:
Yeah, I'm seeing it now. I think there is a memory leak in soccer (unfortunately), but I'm not sure if it's related to this PyQt segfault...
Running valgrind gives me a ton of these:
==22530== Invalid read of size 4
==22530== at 0x82F57BB: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8473466: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8474C46: PyNode_Free (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x83FEA8A: PyParser_ASTFromStringObject (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x83FED41: Py_CompileStringObject (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x840F75A: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8302318: PyCFunction_Call (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8409974: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84CA9A3: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84088FD: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84CA9A3: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84088FD: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== Address 0x202e5020 is 128 bytes inside a block of size 1,120 free'd
==22530== at 0x4C2CDFB: free (vg_replace_malloc.c:530)
==22530== by 0x8473376: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8474C46: PyNode_Free (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x83FEA8A: PyParser_ASTFromStringObject (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x83FED41: Py_CompileStringObject (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x840F75A: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8302318: PyCFunction_Call (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8409974: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84CA9A3: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84088FD: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84CA9A3: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84088FD: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== Block was alloc'd at
==22530== at 0x4C2DDEF: realloc (vg_replace_malloc.c:785)
==22530== by 0x8474D62: PyNode_AddChild (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84752FB: PyParser_AddToken (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x847F1B7: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x83FEA57: PyParser_ASTFromStringObject (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x83FED41: Py_CompileStringObject (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x840F75A: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8302318: PyCFunction_Call (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x8409974: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84CA9A3: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84088FD: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
==22530== by 0x84CA9A3: ??? (in /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0)
Not sure if this is even in soccer :/
EDIT: I think this is just a 'problem' with the Python interpreter, but it should be fine.
Yeah, that output is very likely normal. http://stackoverflow.com/questions/1519276/is-it-normal-that-running-python-under-valgrind-shows-many-errors-with-memory
Here's another segfault. I'm just posting it here for now until we decide whether to make a new issue for it.
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:37
#1 0x00007ffff79874e3 in google::protobuf::internal::ElementCopier<int, true>::operator() (this=0x7fffffffd057, to=0x0, from=0x7fffb003b800, array_size=5)
at /usr/include/google/protobuf/repeated_field.h:848
#2 0x00007ffff79855f3 in google::protobuf::RepeatedField<int>::CopyArray (this=0x7fffffffd710, to=0x0, from=0x7fffb003b800, array_size=5)
at /usr/include/google/protobuf/repeated_field.h:834
#3 0x00007ffff7982c0d in google::protobuf::RepeatedField<int>::MergeFrom (this=0x7fffffffd710, other=...) at /usr/include/google/protobuf/repeated_field.h:732
#4 0x00007ffff79cac2a in Packet::RadioRx::MergeFrom (this=0x7fffffffd6e0, from=...) at common/protobuf/RadioRx.pb.cc:1186
#5 0x00007ffff79c7488 in Packet::RadioRx::RadioRx (this=0x7fffffffd6e0, from=...) at common/protobuf/RadioRx.pb.cc:596
#6 0x00007ffff78c2841 in MainWindow::updateViews (this=0x928390) at ../soccer/MainWindow.cpp:569
#7 0x00007ffff79bbb7c in MainWindow::qt_static_metacall (_o=0x928390, _c=QMetaObject::InvokeMetaMethod, _id=2, _a=0x7fffffffd8a0) at soccer/moc_MainWindow.cpp:317
#8 0x00007ffff68d8fca in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#9 0x00007ffff68e5878 in QTimer::timerEvent(QTimerEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#10 0x00007ffff68d9e53 in QObject::event(QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#11 0x00007ffff6c5505c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#12 0x00007ffff6c5a516 in QApplication::notify(QObject*, QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#13 0x00007ffff68aa62b in QCoreApplication::notifyInternal(QObject*, QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x00007ffff68ff89d in QTimerInfoList::activateTimers() () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#15 0x00007ffff68ffda1 in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#16 0x00007ffff34231a7 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#17 0x00007ffff3423400 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#18 0x00007ffff34234ac in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#19 0x00007ffff6900a7f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#20 0x00007ffff68a7dea in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#21 0x00007ffff68afe8c in QCoreApplication::exec() () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#22 0x0000000000406165 in main (argc=2, argv=0x7fffffffe138) at ../soccer/main.cpp:196
I'm looking at that one. It happens a lot less often, so it's hard to debug, but I think it's because we need to add a mutex lock to Robot->_radioRx: it's getting corrupted when we copy and read it from different threads.
I implemented a possible small fix on the albert/fixSeg2 branch (https://github.com/RoboJackets/robocup-software/tree/albert/fixSeg2). I think this race condition is so low-probability that it's rather hard to test.
Let me know if you guys get a chance to test those two fixes. I ran it for a while until Ubuntu killed it for taking too much memory on my VM.
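For anyone following along, here is a minimal sketch of the kind of locking described above for `_radioRx`. It is not the code on the fixSeg2 branch: `RadioRxStandIn`, `RobotStandIn`, `_radioRxMutex`, and `snapshotsStayConsistent` are hypothetical names, and a plain struct stands in for the protobuf `Packet::RadioRx` message. The idea is only that every write and every copy takes the same mutex, so a reader never copies a half-written message.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for Packet::RadioRx (the real type is a protobuf message).
struct RadioRxStandIn {
    std::vector<int> motorStatus;  // stand-in for a repeated int field
    int sequence = 0;
};

// Hypothetical robot wrapper: all access to _radioRx goes through one mutex.
class RobotStandIn {
public:
    // Writer side, e.g. the processing thread storing a new packet.
    void setRadioRx(const RadioRxStandIn& rx) {
        std::lock_guard<std::mutex> lock(_radioRxMutex);
        _radioRx = rx;
    }

    // Reader side, e.g. the GUI thread; the copy is made while holding the lock.
    RadioRxStandIn radioRx() const {
        std::lock_guard<std::mutex> lock(_radioRxMutex);
        return _radioRx;
    }

private:
    mutable std::mutex _radioRxMutex;
    RadioRxStandIn _radioRx;
};

// Runs one writer thread against a reader loop and reports whether every
// copied snapshot was internally consistent (all elements match sequence).
bool snapshotsStayConsistent(int iterations) {
    RobotStandIn robot;
    std::thread writer([&robot, iterations] {
        for (int i = 1; i <= iterations; ++i) {
            RadioRxStandIn rx;
            rx.motorStatus.assign(5, i);
            rx.sequence = i;
            robot.setRadioRx(rx);
        }
    });
    bool consistent = true;
    for (int i = 0; i < iterations; ++i) {
        RadioRxStandIn copy = robot.radioRx();
        for (int value : copy.motorStatus) {
            if (value != copy.sequence) consistent = false;
        }
    }
    writer.join();
    return consistent;
}
```

Returning a copy from `radioRx()` rather than a reference is deliberate in this sketch: a reference would let the caller read the message after the lock is released, which would reintroduce the race.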
Just ran on albert/fixSeg2. This looks like the segfault Jay found earlier.
Thread 14 "QThread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbd091700 (LWP 7794)]
0x0000000000000021 in ?? ()
(gdb) bt
#0 0x0000000000000021 in ?? ()
#1 0x00007fffcdef1581 in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#2 0x00007fffcdbb03dc in ?? () from /usr/lib/python3/dist-packages/sip.cpython-35m-x86_64-linux-gnu.so
#3 0x00007fffcdbb87a3 in sip_api_convert_from_type () from /usr/lib/python3/dist-packages/sip.cpython-35m-x86_64-linux-gnu.so
#4 0x00007fffcdf4fb0f in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#5 0x00007fffcdf4fcbb in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#6 0x00007fffcdf4fcf2 in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#7 0x00007fffcdf52812 in ?? () from /usr/lib/python3/dist-packages/PyQt5/QtCore.cpython-35m-x86_64-linux-gnu.so
#8 0x00007ffff5e1e1b9 in PyCFunction_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#9 0x00007ffff5f38085 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#10 0x00007ffff5f38509 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#11 0x00007ffff5f38509 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#12 0x00007ffff5f38509 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#13 0x00007ffff5fc8c0c in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#14 0x00007ffff5f36e09 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#15 0x00007ffff5f38509 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#16 0x00007ffff5fc8c0c in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#17 0x00007ffff5f36e09 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#18 0x00007ffff5fc8c0c in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#19 0x00007ffff5f36e09 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#20 0x00007ffff5f38509 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#21 0x00007ffff5fc8c0c in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#22 0x00007ffff5fc8ce3 in PyEval_EvalCodeEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#23 0x00007ffff5f3089b in PyEval_EvalCode () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#24 0x00007ffff5f4dd3f in PyRun_StringFlags () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#25 0x00007ffff77b8e31 in Gameplay::GameplayModule::run (this=0x85eba0) at ../soccer/gameplay/GameplayModule.cpp:397
#26 0x00007ffff797b872 in Processor::run (this=0x830590) at ../soccer/Processor.cpp:419
#27 0x00007ffff66c784e in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#28 0x00007ffff640c6fa in start_thread (arg=0x7fffbd091700) at pthread_create.c:333
#29 0x00007ffff521ab5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
I've been testing on master now for a couple hours and it seems that Albert's fixes have patched up all the segfaults I was seeing.
I also want to note that Albert said the fix on fixSeg1 is more of a temporary fix. By running findChild only on setup, we avoid the segfaults that method runs into, but that method shouldn't ever cause a segfault (unless we're corrupting the Qt window tree somewhere by freeing something that's still needed).
This may come back to bite us later on; we should review the Qt bits to see if this is an issue.
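To make the "findChild only on setup" idea concrete, here is a rough sketch of the pattern, not the actual fixSeg1 change. It does not use the Qt or PyQt API: `Node`, `findChildByName`, and `CachedLookup` are hypothetical stand-ins for a widget tree, a recursive name search, and the object that remembers the result.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a widget tree; not the Qt or PyQt API.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// Recursive name search, similar in spirit to a findChild call.
Node* findChildByName(Node& root, const std::string& name) {
    for (auto& child : root.children) {
        if (child->name == name) return child.get();
        if (Node* hit = findChildByName(*child, name)) return hit;
    }
    return nullptr;
}

// Looks the node up once during setup and reuses the pointer afterwards.
class CachedLookup {
public:
    CachedLookup(Node& root, const std::string& name)
        : _node(findChildByName(root, name)) {}

    // Per-frame access: no tree walk, just the cached pointer.
    Node* get() const { return _node; }

private:
    Node* _node;
};
```

The cached pointer is only valid while the node stays in the tree. If something frees that node later, the cache dangles, which is why this pattern hides the symptom rather than explaining the corruption.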
I'm going to close this issue in 2 months if no one replies saying they are having any form of Qt segfault issues. (We'll probably know pretty quickly, as all the freshmen will be running the software soon.)
I get this segfault occasionally, at random, when running Basic122. It takes a while. Not really sure what's causing it.