sipwise / rtpengine

The Sipwise media proxy for Kamailio
GNU General Public License v3.0
763 stars 360 forks source link

Using the rtpengine + rtpengine-recording modules for recording. Only the caller side sends flows to the recording module, and the callee side does not. #1788

Closed pengwang7 closed 5 months ago

pengwang7 commented 5 months ago

rtpengine version the issue has been seen with

rtpengine version 10.4.0.0

Used distribution and its version

CentOS Linux release 7.8.2003 (Core)

Linux kernel version used

3.10.0-1127.18.2.el7.x86_64

CPU architecture issue was seen on (see uname -m)

None

Expected behaviour you didn't see

Only the caller side sends flows to the recording module, and the callee side does not.

Unexpected behaviour you saw

Description: 
    Using the rtpengine + rtpengine-recording modules for recording. Only the caller side sends flows to the recording module, and the callee side does not.

I have checked the code and found that when recording starts, the callee side's flows do not reach the stream_packet stage, but are forwarded directly in the kernel. However, from the logs, I can see that the recording request has been opened.
Why does one side's flow not reach the application layer but is forwarded directly by the kernel?
Has anyone encountered a similar problem? How can I continue to investigate it? Please give me some ideas. Thank you.
I haven't encountered it in the intranet, but it appears on cloud servers and the probability is very low.

Steps to reproduce the problem

It's a probability problem. It's hard to reproduce

Additional program output to the terminal or logs illustrating the issue

rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Received command 'offer' from 192.168.0.58:52794
rtpengine[253]: NOTICE: [TMRC.31.1.5235]: [core] Creating new call
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Replying to 'offer' from 192.168.0.58:52794 (elapsed time 0.000296 sec)
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Received command 'answer' from 192.168.0.58:52794
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Replying to 'answer' from 192.168.0.58:52794 (elapsed time 0.000166 sec)
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Received command 'answer' from 192.168.0.58:52794
rtpengine[253]: INFO: [TMRC.31.1.5235 port 22168]: [core] Peer address XXX.138.221.104:47642, ret=0
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] stream XXX.138.221.104:47642
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Kernelizing media stream: XXX.138.221.104:47642 -> 192.168.0.9:22168 | 192.168.0.9:22150 -> 36.138.97.198:22152
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Removing media stream from kernel: local 192.168.0.9:22168
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Replying to 'answer' from 192.168.0.58:52794 (elapsed time 0.000247 sec)
rtpengine[253]: INFO: [TMRC.31.1.5235 port 22168]: [core] Peer address XXX.138.221.104:47642, ret=0
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] stream XXX.138.221.104:47642
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Kernelizing media stream: XXX.138.221.104:47642 -> 192.168.0.9:22168 | 192.168.0.9:22150 -> XXX.232.219.241:9308
rtpengine[253]: INFO: [TMRC.31.1.5235 port 22150]: [core] Peer address XXX.232.219.241:9308, ret=0
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] stream XXX.232.219.241:9308
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Kernelizing media stream: XXX.232.219.241:9308 -> 192.168.0.9:22150 | 192.168.0.9:22168 -> XXX.138.221.104:47642
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Received command 'start recording' from 192.168.0.58:52794
rtpengine[253]: NOTICE: [TMRC.31.1.5235]: [core] Turning on call recording.
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Removing media stream from kernel: local 192.168.0.9:22150
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Removing media stream from kernel: local 192.168.0.9:22168
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Replying to 'start recording' from 192.168.0.58:52794 (elapsed time 0.000456 sec)
rtpengine[253]: INFO: [TMRC.31.1.5235 port 22168]: [core] Peer address XXX.138.221.104:47642, ret=0
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] stream XXX.138.221.104:47642
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Kernelizing media stream: XXX.138.221.104:47642 -> 192.168.0.9:22168 | 192.168.0.9:22150 -> XXX.232.219.241:9308
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Received command 'delete' from 192.168.0.58:52794
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Scheduling deletion of entire call in 3 seconds
rtpengine[253]: INFO: [TMRC.31.1.5235]: [control] Replying to 'delete' from 192.168.0.58:52794 (elapsed time 0.000021 sec)
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Final packet stats:
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] --- Tag '(5235-1-3-2-5-41)_1' (label 'caller'), created 0:16 ago for branch ''
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] ---     subscribed to '(5235-1-3-2-5-41)_2'
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] ---     subscription for '(5235-1-3-2-5-41)_2'
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] ------ Media #1 (audio over RTP/AVP) using PCMA/8000
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] --------- Port     192.168.0.9:22168 <> XXX.138.221.104:47642, SSRC 69cedeb0, 220 p, 37840 b, 0 e, 2 ts
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] --------- Port     192.168.0.9:22169 <> XXX.138.221.104:47643 (RTCP), SSRC 0, 0 p, 0 b, 0 e, 16 ts
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] --- Tag '(5235-1-3-2-5-41)_2' (label 'called'), created 0:16 ago for branch ''
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] ---     subscribed to '(5235-1-3-2-5-41)_1'
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] ---     subscription for '(5235-1-3-2-5-41)_1'
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] ------ Media #1 (audio over RTP/AVP) using PCMA/8000
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] --------- Port     192.168.0.9:22150 <> XXX.232.219.241:9308 , SSRC 42bfcab1, 321 p, 55212 b, 0 e, 6 ts
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] --------- Port     192.168.0.9:22151 <> XXX.232.219.241:9309  (RTCP), SSRC 0, 0 p, 0 b, 0 e, 16 ts
rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Removing media stream from kernel: local 192.168.0.9:22168

Anything else?

No response

rfuchs commented 5 months ago

I have checked the code and found that when recording starts, the callee side's flows do not reach the stream_packet stage, but are forwarded directly in the kernel. However, from the logs, I can see that the recording request has been opened. Why does one side's flow not reach the application layer but is forwarded directly by the kernel?

Recording does not preclude kernel forwarding. The kernel module exports the packets to the recording daemon via the /proc/rtpengine/.../calls interface.

How can I continue to investigate it?

Inspect the contents of /proc/rtpengine/.../list and look for the "intercept" flag to be set. Also inspect /proc/rtpengine/.../calls and check against the logs produced by the recording daemon. You should also enable debug logging.

Closing this as this isn't an actionable bug report.

pengwang7 commented 5 months ago

Thank you for your reply. First of all, I don't think it has anything to do with rtpengine-recording. From the logs I provided above, it seems that the intercept flag has not been set. It is only possible to set the intercept flag when the data packet reaches the application layer. The recording_start function will call __unkernelize, after which the data packet should be able to reach the application layer, and then the intercept flag will be set.

rtpengine[253]: NOTICE: [TMRC.31.1.5235]: [core] Turning on call recording. rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Removing media stream from kernel: local 192.168.0.9:22150 rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Removing media stream from kernel: local 192.168.0.9:22168

After starting recording, only one side of the stream reached the application layer, while the other side did not. I don't know much about xt_RTPENGINE. Should I turn on the debug mode? maybe the problem lies in xt_RTPENGINE?

rtpengine[253]: INFO: [TMRC.31.1.5235 port 22168]: [core] Peer address XXX.138.221.104:47642 rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] stream XXX.138.221.104:47642 rtpengine[253]: INFO: [TMRC.31.1.5235]: [core] Kernelizing media stream: XXX.138.221.104:47642 -> 192.168.0.9:22168 | 192.168.0.9:22150 -> XXX.232.219.241:9308

rfuchs commented 5 months ago

From the logs I provided above, it seems that the intercept flag has not been set.

No, you wouldn't see the intercept flag in the log. At least not without debugging enabled.

It is only possible to set the intercept flag when the data packet reaches the application layer.

No, the intercept flag would be set when the stream is pushed to the kernel module.

The recording_start function will call __unkernelize, after which the data packet should be able to reach the application layer, and then the intercept flag will be set.

No, the intercept flag is set while the stream is being handled by the kernel module.

After starting recording, only one side of the stream reached the application layer, while the other side did not.

There are two application layers. The rtpengine daemon is only involved while the stream is not in the kernel. The recording daemon is the one which needs to see the media packets, and they would always come from the kernel module, either through the intercept flag being set, or through a forwarding mechanism from rtpengine.