jamulussoftware / jamulus

Jamulus enables musicians to perform real-time jam sessions over the internet.
https://jamulus.io
Other
986 stars 221 forks source link

Memory leak: Investigate valgrind output after #1073 #1083

Open hoffie opened 3 years ago

hoffie commented 3 years ago

Describe the bug

Even after #1073, valgrind shows some possible leaks. These should be investigated.

Note: I'm not entirely sure if they are real or relevant leaks. I suppose there may be some ressources which are not destroyed explicitly when Jamlus shuts down. This would be not relevant in the real world, but would probably still be shown as a possible leak by valgrind.

So, I'm not saying that all of the outputs are worth fixing, I'm just proposing to investigate their validity and relevance.

To Reproduce

$ valgrind --leak-check=full ./Jamulus -s -n -p 10001 -e localhost:10000 -l /tmp/log -R /tmp/recordings -o 'Private Server;Stockholm;205' -m /tmp/jam.dat -u 40 -T -w 'Welcome!'

$ for x in {1..12}; do Jamulus -c localhost:10001 --nogui --inifile /tmp/x &; done

# leave running for some minutes
# Ctrl+C and wait for exit (takes a rather long time, possibly due to valgrind overhead + recording finishing?)

Expected behavior No leaks.

Screenshots

==3121== 
==3121== HEAP SUMMARY:
==3121==     in use at exit: 8,560,244 bytes in 1,231 blocks
==3121==   total heap usage: 11,163,830 allocs, 11,162,599 frees, 3,097,366,260 bytes allocated
==3121== 
==3121== 352 bytes in 1 blocks are possibly lost in loss record 110 of 156
==3121==    at 0x483CB65: calloc (vg_replace_malloc.c:760)
==3121==    by 0x40142EB: _dl_allocate_tls (in /usr/lib/ld-2.32.so)
==3121==    by 0x48C913C: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.32.so)
==3121==    by 0x58D79A2: QThread::start(QThread::Priority) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x15C7D2: CServer::CServer(int, QString const&, unsigned short, QString const&, QString const&, QString const&, QString const&, QString const&, QString const&, QString const&, bool, bool, bool, bool, ELicenceType) (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x13CD8F: main (in /home/christian/Projects/jamulus/Jamulus)
==3121== 
==3121== 352 bytes in 1 blocks are possibly lost in loss record 111 of 156
==3121==    at 0x483CB65: calloc (vg_replace_malloc.c:760)
==3121==    by 0x40142EB: _dl_allocate_tls (in /usr/lib/ld-2.32.so)
==3121==    by 0x48C913C: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.32.so)
==3121==    by 0x58D79A2: QThread::start(QThread::Priority) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x156F4F: CServer::Start() (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x5AEC4BE: QObject::event(QEvent*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x5ABFA4F: QCoreApplication::notifyInternal2(QObject*, QEvent*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x5AC2572: QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x5B190A3: ??? (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x7147A83: g_main_context_dispatch (in /usr/lib/libglib-2.0.so.0.6600.4)
==3121==    by 0x719B9B0: ??? (in /usr/lib/libglib-2.0.so.0.6600.4)
==3121==    by 0x71462B0: g_main_context_iteration (in /usr/lib/libglib-2.0.so.0.6600.4)
==3121== 
==3121== 1,408 bytes in 4 blocks are possibly lost in loss record 135 of 156
==3121==    at 0x483CB65: calloc (vg_replace_malloc.c:760)
==3121==    by 0x40142EB: _dl_allocate_tls (in /usr/lib/ld-2.32.so)
==3121==    by 0x48C913C: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.32.so)
==3121==    by 0x58D79A2: QThread::start(QThread::Priority) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x58DA754: QThreadPoolPrivate::startThread(QRunnable*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x58DBB13: QThreadPoolPrivate::tryStart(QRunnable*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x58DC327: QThreadPool::start(QRunnable*, int) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x19276F: QFuture<void> QtConcurrent::run<void, CServer, int, int, int, int, int, int>(CServer*, void (CServer::*)(int, int, int), int const&, int const&, int const&) (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x15A393: CServer::OnTimer() (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x5AEC581: QObject::event(QEvent*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x5ABFA4F: QCoreApplication::notifyInternal2(QObject*, QEvent*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x5AC2572: QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121== 
==3121== 1,824 (16 direct, 1,808 indirect) bytes in 1 blocks are definitely lost in loss record 138 of 156
==3121==    at 0x483ADEF: operator new(unsigned long) (vg_replace_malloc.c:342)
==3121==    by 0x155F3A: recorder::CJamController::SetRecordingDir(QString, int, bool) (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x15B799: CServer::CServer(int, QString const&, unsigned short, QString const&, QString const&, QString const&, QString const&, QString const&, QString const&, QString const&, bool, bool, bool, bool, ELicenceType) (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x13CD8F: main (in /home/christian/Projects/jamulus/Jamulus)
==3121== 
==3121== 9,504 bytes in 27 blocks are possibly lost in loss record 153 of 156
==3121==    at 0x483CB65: calloc (vg_replace_malloc.c:760)
==3121==    by 0x40142EB: _dl_allocate_tls (in /usr/lib/ld-2.32.so)
==3121==    by 0x48C913C: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.32.so)
==3121==    by 0x58D79A2: QThread::start(QThread::Priority) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x58DA754: QThreadPoolPrivate::startThread(QRunnable*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x58DBB13: QThreadPoolPrivate::tryStart(QRunnable*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x58DC327: QThreadPool::start(QRunnable*, int) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x19276F: QFuture<void> QtConcurrent::run<void, CServer, int, int, int, int, int, int>(CServer*, void (CServer::*)(int, int, int), int const&, int const&, int const&) (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x15A015: CServer::OnTimer() (in /home/christian/Projects/jamulus/Jamulus)
==3121==    by 0x5AEC581: QObject::event(QEvent*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x5ABFA4F: QCoreApplication::notifyInternal2(QObject*, QEvent*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121==    by 0x5AC2572: QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) (in /usr/lib/libQt5Core.so.5.15.2)
==3121== 
==3121== LEAK SUMMARY:
==3121==    definitely lost: 16 bytes in 1 blocks
==3121==    indirectly lost: 1,808 bytes in 10 blocks
==3121==      possibly lost: 11,616 bytes in 33 blocks
==3121==    still reachable: 8,546,804 bytes in 1,187 blocks
==3121==         suppressed: 0 bytes in 0 blocks
==3121== Reachable blocks (those to which a pointer was found) are not shown.

Operating system

Version of Jamulus

3.6.2dev-a26ff711 (= #1073)

@softins Are you already planning to look further into this? I can try my luck, but you seem way more proficient in tracking such things down. :)

softins commented 3 years ago

Yes, I am planning to continue the investigation. I'll look at SetRecordingDir first, as it relates to my active PR. The thread ones appear to be related to the multithreading. Could you repeat the test without -T?

softins commented 3 years ago
$ valgrind --leak-check=full ./Jamulus -s -n -p 10001 -e localhost:10000 -l /tmp/log -R /tmp/recordings -o 'Private Server;Stockholm;205' -m /tmp/jam.dat -u 40 -T -w 'Welcome!'

$ for x in {1..12}; do Jamulus -c localhost:10001 --nogui --inifile /tmp/x &; done

# leave running for some minutes
# Ctrl+C and wait for exit (takes a rather long time, possibly due to valgrind overhead + recording finishing?)

I think it would be better to kill the clients first, then leave a significant time for things to settle before terminating the server. If the server is killed with active sessions, we might get false positives, reporting memory that would have been freed on disconnect.

I think the one in SetRecordingDir() might be a case in point.

hoffie commented 3 years ago

I think it would be better to kill the clients first

I think I did, will make sure in my next test tomorrow.

Could you repeat the test without -T?

Will do.

Independent of valgrind I have run a test with 33 clients and no recording/-T. Memory was stable over hours. When adding multithreading there was a sudden increase after some seconds/minutes, but it then seemed stable. I think some more thorough tests should be done, but it would not be high on my list. I will do the repeated valgrind tests though.

ann0see commented 3 years ago

Is this still in progress?

hoffie commented 3 years ago

I did not get around to working on it recently. I still think this should be analyzed properly. Keeping it open, but moving it to Backlog for that reason. Feel free to take over.

ann0see commented 3 years ago

Ok.

genesisproject2020 commented 3 years ago

I have some memory logging on my Jamulus processes and below are three of them from last week. I thought it was a correlation between memory allocation and users. The first one is an open server and the two others are private servers but the config is the same. Something is not ok but I have a hard time to figure out what. I'm running 3.7.0 on the first one and 3.7.0dev-664fd9ed on the other two.

Skärmavbild 2021-04-14 1 Skärmavbild 2021-04-14 30 Skärmavbild 2021-04-14 36
ann0see commented 1 year ago

Tagging as 3.11.0 just to look at it again.

pljones commented 1 year ago

We'll need to validate for each target, too, if we're going to clean up fully. Maybe break it down per target for actual investigation, as it might need different people working on it - make it easier to target fixes coming into different future releases.

pljones commented 1 year ago

@ann0see are you okay to assign yourself to this?

ann0see commented 1 year ago

No. I'm not confident enough with the topic.

pljones commented 1 year ago

OK, maybe we could drop it from 3.11.0 unless someone can pick it up -- it's not something I'd be comfortable with, either.