slsdetectorgroup / slsDetectorPackage

SLS Detector Package

systemd script segv on hostname command #804

Closed thattil closed 9 months ago

thattil commented 1 year ago
Describe the bug

From Filip Leonarski: @fleon-psi

The systemd script fails while creating/deleting shared memory segments. It does not fail when the same command is run from the console.

Sep 18 12:09:23 mx-nextgendcu.psi.ch sls_detector_put[4366]: - 12:09:23.190 INFO: Adding module 10.10.10.201

Sep 18 12:09:23 mx-nextgendcu.psi.ch sls_detector_put[4366]: - 12:09:23.191 WARNING: This shared memory should have been deleted before! /slsDetectorPackage_detector_0_module_0. Freeing it again

Sep 18 12:09:23 mx-nextgendcu.psi.ch sls_detector_put[4366]: - 12:09:23.191 INFO: Shared memory deleted /slsDetectorPackage_detector_0_module_0

Sep 18 12:09:23 mx-nextgendcu.psi.ch sls_detector_put[4366]: - 12:09:23.191 INFO: Shared memory created /slsDetectorPackage_detector_0_module_0

Sep 18 12:09:23 mx-nextgendcu.psi.ch systemd-coredump[4370]: [🡕] Process 4366 (sls_detector_pu) of user 0 dumped core.

Sep 18 12:09:23 mx-nextgendcu.psi.ch systemd[1]: jfjoch_detector.service: Main process exited, code=dumped, status=11/SEGV

Sep 18 12:09:23 mx-nextgendcu.psi.ch systemd[1]: jfjoch_detector.service: Failed with result 'core-dump'.

Sep 18 12:09:23 mx-nextgendcu.psi.ch systemd[1]: jfjoch_detector.service: Scheduled restart job, restart counter is at 4.

Sep 18 12:09:23 mx-nextgendcu.psi.ch systemd[1]: Stopped jfjoch_receiver.

It is running as root in systemd, so a permission issue is unlikely.

If I run it from the console, all is OK:

[root@mx-nextgendcu bin]# LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/jfjoch/lib64 ./sls_detector_put hostname 10.10.10.201+

hostname [10.10.10.201]

Temporary workaround they use: running the code in a screen session. A long-term solution is needed.

fleon-psi commented 9 months ago

The issue is not related to shared memory.

The problem is in Module.cpp:3335:

strcpy_safe(shm()->settingsDir, getenv("HOME"));

Under systemd the HOME variable is not set, so getenv returns nullptr, and strcpy_safe segfaults when it tries to read from that pointer.

thattil commented 9 months ago

@fleon-psi Ah wow! That should be an easy fix: we can check whether HOME exists and, if not, use '/' as the default. One can overwrite it with a command if needed. This will go into developer, and we can make a branch of 8.0.1 for you.
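
A minimal sketch of the check described above, assuming a small helper is acceptable. The name `homeDirOrDefault` is hypothetical and not part of slsDetectorPackage; the actual fix in the repository may look different.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not from slsDetectorPackage): returns the value of
// HOME, or `fallback` when HOME is not set in the environment, as can
// happen for a process started by systemd.
std::string homeDirOrDefault(const char *fallback = "/") {
    // std::getenv returns nullptr for an unset variable; reading from
    // that pointer is what caused the segfault reported here.
    const char *home = std::getenv("HOME");
    return home != nullptr ? std::string(home) : std::string(fallback);
}
```

The result could then be copied into the settings directory field in place of the raw `getenv("HOME")` pointer, so the copy never reads from a null pointer.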

fleon-psi commented 9 months ago

No worries 🙂. I solved it in my own fork, so for me it is perfectly fine to have it solved in the next release. I've made a pull request back to the main repo, if you'd like to test it at some point on your side.



thattil commented 9 months ago

Sounds good! Thanks a lot for resolving this! :)