sipcapture / captagent

100% Open-Source Packet Capture Agent for HEP
https://sipcapture.org
GNU Affero General Public License v3.0

Problem with systemd captagent #249

perrfect closed 2 years ago

perrfect commented 2 years ago

Hello. A few days ago I noticed that captagent on one server was not capturing traffic, although the systemd service stayed in active status. In the logs I see:

kernel: captagent[3606426]: segfault at 7f7ff385b199 ip 00007f7ef6b6ae04 sp 00007f7ef337ab08 error 4 in libc-2.28.so[7f7ef6a9b000+1bc000]             
kernel: Code: 7f 07 c5 fe 7f 4f 20 c5 fe 7f 54 17 e0 c5 fe 7f 5c 17 c0 c5 f8 77 c3 48 39 f7 0f 87 ab 00 00 00 0f 84 e5 fe ff ff c5 fe 6f 26 <c5> fe 6f 6c 16 e0 c5 fe 6f 74 16 c0 c5 fe 6f 7c 16 a0 c5 7e 6f 44
systemd[1]: Created slice system-systemd\x2dcoredump.slice.                                                                                           
systemd[1]: Started Process Core Dump (PID 3376071/UID 0).                                                                                            
systemd-coredump[3376072]: Resource limits disable core dumping for process 3606420 (captagent).                                                      
systemd-coredump[3376072]: Process 3606420 (captagent) of user 0 dumped core.                                                                         
systemd[1]: systemd-coredump@0-3376071-0.service: Succeeded.                                                                                          
kernel: device capture left promiscuous mode                                                                                                          
kernel: device enp131s0f1np3 left promiscuous mode 

I only monitor this service via systemd.unit.is-active. Are there any tools or health checks that could be used to monitor this service additionally?
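
One possible way to go beyond a plain is-active check is to also look at the unit's sub-state, main PID and restart counter as reported by `systemctl show`. The following is only a minimal sketch: the function names and the unit name `captagent.service` are assumptions for illustration, and it is not confirmed that this check would have caught the crash reported above. `NRestarts` is only reported by newer systemd versions, so it is treated as 0 when missing.

```python
import subprocess


def parse_show_output(text):
    """Parse KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def is_healthy(props, max_restarts=0):
    """Return True only if the unit is active/running, has a main PID,
    and has not been restarted more often than max_restarts."""
    running = (
        props.get("ActiveState") == "active"
        and props.get("SubState") == "running"
    )
    has_main_pid = props.get("MainPID", "0") != "0"
    restarts = int(props.get("NRestarts", "0") or 0)
    return running and has_main_pid and restarts <= max_restarts


def check_unit(unit="captagent.service"):
    """Query systemd for the unit (unit name is an assumption) and evaluate it."""
    out = subprocess.run(
        ["systemctl", "show", unit,
         "--property=ActiveState,SubState,MainPID,NRestarts"],
        capture_output=True, text=True, check=True,
    ).stdout
    return is_healthy(parse_show_output(out))
```

A monitoring agent could call `check_unit()` periodically and alert when it returns False. Once a restart policy is in place, a rising restart counter is a useful hint that the process keeps crashing even though the unit looks active.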

lmangani commented 2 years ago

@perrfect what is the Restart policy on the service?

perrfect commented 2 years ago

> @perrfect what is the Restart policy on the service?

Hello. Thank you for your reply. The policy was Restart=no. We have changed it to Restart=always. Hopefully everything will be OK now.

adubovikov commented 2 years ago

@perrfect can you run gdb on the core file and send us the bt full output?

https://www.thegeekstuff.com/2014/01/gdb-backtrace/

adubovikov commented 2 years ago

or if you use systemd-coredump, just run

coredumpctl gdb

and after bt full
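
If an interactive gdb session is inconvenient, the same backtrace can probably be collected non-interactively by first exporting the core with `coredumpctl dump` and then running gdb in batch mode. This is a sketch only: the binary path and the temporary core path below are assumptions and must be adjusted to the actual installation, and the helper names are hypothetical.

```python
import subprocess


def backtrace_commands(match="captagent",
                       binary="/usr/local/captagent/sbin/captagent",
                       core_path="/tmp/captagent.core"):
    """Build the two commands: export the newest matching core, then print `bt full`.

    `binary` and `core_path` are assumed locations, not confirmed by the thread.
    """
    dump_cmd = ["coredumpctl", f"--output={core_path}", "dump", match]
    gdb_cmd = ["gdb", "-batch", "-ex", "bt full", binary, core_path]
    return dump_cmd, gdb_cmd


def collect_backtrace(**kwargs):
    """Run both commands and return gdb's output as text."""
    dump_cmd, gdb_cmd = backtrace_commands(**kwargs)
    subprocess.run(dump_cmd, check=True)
    result = subprocess.run(gdb_cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

The returned text can be pasted into the issue as is. Installing debug symbols for captagent and libc beforehand usually makes the backtrace far more readable.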

perrfect commented 2 years ago

> or if you use systemd-coredump, just run
>
> coredumpctl gdb
>
> and after bt full

Hello. Unfortunately, core dumps were not enabled on this server, so we only have the logs I posted above.

adubovikov commented 2 years ago

Without a backtrace it's hard to fix :-(

perrfect commented 2 years ago

I understand. We have now enabled core dumps on this server, and if the problem recurs I will send you the dump.
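
To confirm that the running process can actually dump core (the log above shows "Resource limits disable core dumping"), one option is to read the "Max core file size" row from `/proc/<pid>/limits`. A minimal sketch follows; the helper names are hypothetical, and it assumes that a soft limit of 0 is what causes the systemd-coredump message.

```python
from pathlib import Path

CORE_ROW = "Max core file size"


def core_soft_limit(limits_text):
    """Return the soft core-size limit ('0', 'unlimited', or a byte count)
    from the text of /proc/<pid>/limits, or None if the row is missing."""
    for line in limits_text.splitlines():
        if line.startswith(CORE_ROW):
            fields = line[len(CORE_ROW):].split()
            if fields:
                return fields[0]
    return None


def core_dumps_enabled(limits_text):
    """Assume core dumps are possible when the soft limit is present and not 0."""
    soft = core_soft_limit(limits_text)
    return soft is not None and soft != "0"


def process_can_dump_core(pid):
    """Read the limits of a live process, e.g. the MainPID of the captagent unit."""
    return core_dumps_enabled(Path(f"/proc/{pid}/limits").read_text())
```

If this reports False for the captagent process, raising the limit in the unit (for example with systemd's `LimitCORE=` setting) and restarting the service is the usual remedy.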