OpenVisualCloud / Media-Transport-Library

A real-time media transport(DPDK, AF_XDP, RDMA) stack for both raw and compressed video based on COTS hardware.
BSD 3-Clause "New" or "Revised" License
158 stars 46 forks source link

Losing packets/frames with FFmpeg plugin #870

Closed PanKaker closed 1 month ago

PanKaker commented 2 months ago

Context

Hi!

We would you like to use your library with ffmpeg plugin, but we noticed that there are a few frames drops in RX side.

We've tried to use different duration of videos: 5, 10, 30, 400 seconds and the amount of dropping frames was always around 3. We have analyzed the frames that we're missing and it was the last frames from the sequence of a video.

Please find logs from TX and RX side below.

Thoughts on the problem

The issues is likely connected with finishing the session from TX/RX side and no proper flushing/burst packets from the buffers at the end of the session.

Possible workaround

Please be aware that this is not the perfect solution and we would like to get your expertise/thoughts on that.

As a workaround: tx_video_session_get/ tx_video_session_putare called to get the number of session inflight packets and we are calling usleep untill all the inflight packets are sent.

We added a new function st_tx_wait_for_inbound in mtl_st20p_write_close to achieve this:

static int mtl_st20p_write_close(AVFormatContext* ctx) {
   dbg("%s(%d), start\n", __func__, s->idx);
   // Destroy tx session
   if (s->tx_handle) {
+    st_tx_wait_for_inbound(s->tx_handle->impl);
int st_tx_wait_for_inbound(struct mtl_main_impl* impl){
  struct mtl_sch_impl *sch;
  struct st_tx_video_sessions_mgr* tx_mgr;
  int ret;

  for (int sch_idx = 0; sch_idx < MT_MAX_SCH_NUM; sch_idx++) {
    sch = mt_sch_instance(impl, sch_idx);
    tx_mgr = &sch->tx_video_mgr;
    for (int j = 0; j < tx_mgr->max_idx; j++) {
      ret = st_tx_session_wait_for_inbound(tx_mgr, j);
      if(ret)
        return ret;
    }
  }
  return 0;
}
static int st_tx_session_wait_for_inbound(struct st_tx_video_sessions_mgr* tx_mgr,
                                           int sch_idx) {
  struct st_tx_video_session_impl* s = tx_video_session_get(tx_mgr, sch_idx);
  int num_port = s->ops.num_port;

  for (int i = 0; i < num_port; i++) {
    for (int j = 0; j < ST_TX_VIDEO_WAIT_FOR_INBOUND_RETRY; j++) {
      if (s->trs_inflight_num[i] || s->trs_inflight_num2[i]) {
        tx_video_session_put(tx_mgr, sch_idx);
        notice("%s(%d), Retry %d out of %d \n", __func__, sch_idx, j, ST_TX_VIDEO_WAIT_FOR_INBOUND_RETRY);
        usleep(10000);
        tx_video_session_get(tx_mgr, sch_idx);
      } else {
        tx_video_session_put(tx_mgr, sch_idx);
        break;
      }
    }
  }

  return 0;
}

Logs

FFmpeg st20 TX

Test-video is a 10 seconds video with 25 fps = 125 frames.

Command

ffmpeg -video_size 1920x1080 -f rawvideo -pix_fmt yuv422p10le -i test_video.yuv -filter:v fps=25 -p_port 0000:b1:00.1 -p_sip 192.168.96.3 -p_tx_ip 239.168.85.20 -udp_port 20000 -payload_type 112 -f mtl_st20p -

Output

[mtl_st20p @ 0x605ffa104c40] mtl_st20p_write_header(0), tx_handle 0x3202c07440
Output #0, mtl_st20p, to 'pipe:':
  Metadata:
    encoder         : Lavf60.16.100
  Stream #0:0: Video: rawvideo (Y3[10][10] / 0xA0A3359), yuv422p10le(progressive), 1920x1080, q=2-31, 1036800 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc60.31.102 rawvideo
MTL: 2024-05-20 10:30:01, st20p_tx_free(0), start.56 bitrate=N/A speed=0.539x
MTL: 2024-05-20 10:30:01, st20_tx_free(0,0), start
MTL: 2024-05-20 10:30:01, TX_VIDEO_SESSION(0,0:st20p_ffmpge): fps 24.600933 frames 122 pkts 505218:0 inflight 30282:30305
MTL: 2024-05-20 10:30:01, TX_VIDEO_SESSION(0,0): throughput 1077.382235 Mb/s: 0.000000 Mb/s, cpu busy 0.358991
MTL: 2024-05-20 10:30:01, TX_VIDEO_SESSION(0,0): mismatch epoch troffset 1
MTL: 2024-05-20 10:30:01, TX_VIDEO_SESSION(0,0): epoch drop 1
MTL: 2024-05-20 10:30:01, TX_VIDEO_SESSION(0,0): busy as no ready frame from user 1
MTL: 2024-05-20 10:30:01, mt_ring_dequeue_clean, count 508 for ring TV_M0S0P0
MTL: 2024-05-20 10:30:01, mt_dpdk_flush_tx_queue(0), queue 1 burst_pkts 1024
MTL: 2024-05-20 10:30:01, mt_dev_put_tx_queue(0), q 1
MTL: 2024-05-20 10:30:01, mt_mempool_free, free mempool TV_M0S0_CHAIN_0_11
MTL: 2024-05-20 10:30:01, mt_mempool_free, free mempool TV_M0S0P0_HDR_0_10
MTL: 2024-05-20 10:30:01, sch_free_quota(0), quota 1080 total now 0
MTL: 2024-05-20 10:30:01, st20_tx_free(0,0), succ
MTL: 2024-05-20 10:30:02, st20p_tx_free(0), succ
[mtl_st20p @ 0x605ffa104c40] mtl_instance_put, handle 0x3200b81140 ref cnt 0
[mtl_st20p @ 0x605ffa104c40] mtl_instance_put, ref cnt reach zero, uninit mtl device
MTL: 2024-05-20 10:30:02, video_trs_tasklet_stop(0), succ
MTL: 2024-05-20 10:30:02, sch_tasklet_func(0), end with 3 tasklets
MTL: 2024-05-20 10:30:02, cni_traffic_thread, start
MTL: 2024-05-20 10:30:02, mt_sch_put_lcore, succ on shm lcore 36 for lib_sch
MTL: 2024-05-20 10:30:02, sch_stop(0), succ
MTL: 2024-05-20 10:30:02, mt_sch_stop_all, succ
MTL: 2024-05-20 10:30:02, _mt_stop, succ
MTL: 2024-05-20 10:30:02, admin_thread, stop
MTL: 2024-05-20 10:30:02, mtl_sch_unregister_tasklet(0), tasklet cni(0) unregistered
MTL: 2024-05-20 10:30:02, cni_traffic_thread, stop
MTL: 2024-05-20 10:30:02, mt_dev_put_rx_queue(0), q 0
MTL: 2024-05-20 10:30:02, mt_cni_uinit, succ
MTL: 2024-05-20 10:30:02, sch_free_quota(0), quota 0 total now 0
MTL: 2024-05-20 10:30:02, mt_sch_put(0), ref_cnt now zero
MTL: 2024-05-20 10:30:02, Warn: sch_stop(0), not started
MTL: 2024-05-20 10:30:02, mtl_sch_unregister_tasklet(0), tasklet video_transmitter(2) unregistered
MTL: 2024-05-20 10:30:02, st_video_transmitter_uinit(0), succ
MTL: 2024-05-20 10:30:02, mtl_sch_unregister_tasklet(0), tasklet tx_video_sessions_mgr(1) unregistered
MTL: 2024-05-20 10:30:02, tv_mgr_uinit(0), succ
MTL: 2024-05-20 10:30:02, sch_free(0), start to free sch: sch_0
MTL: 2024-05-20 10:30:02, mt_dpdk_flush_tx_queue(0), queue 0 burst_pkts 1024
MTL: 2024-05-20 10:30:02, mt_dev_put_tx_queue(0), q 0
MTL: 2024-05-20 10:30:02, dev_stop_port(0), succ
MTL: 2024-05-20 10:30:02, mt_dev_free, succ
MTL: 2024-05-20 10:30:02, mt_main_free, succ
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q0_MBUF_1
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q1_MBUF_2
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q2_MBUF_3
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q3_MBUF_4
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q4_MBUF_5
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q5_MBUF_6
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q6_MBUF_7
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q7_MBUF_8
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool R_P0Q8_MBUF_9
MTL: 2024-05-20 10:30:02, mt_mempool_free, free mempool T_P0_SYS_0
MTL: 2024-05-20 10:30:02, dev_close_port(0), succ
MTL: 2024-05-20 10:30:02, stat_thread, stop
MTL: 2024-05-20 10:30:02, mt_dev_uinit, succ
MTL: 2024-05-20 10:30:02, mtl_uninit, succ
[mtl_st20p @ 0x605ffa104c40] mtl_st20p_write_close(0), frame_counter 125
[out#0/mtl_st20p @ 0x605ffa104480] video:1012500kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
frame=  125 fps= 13 q=-0.0 Lsize=N/A time=00:00:04.96 bitrate=N/A speed=0.532x

FFmpeg st20 RX

Command

ffmpeg -p_port 0000:b1:00.0 -p_sip 192.168.96.2 -p_rx_ip 239.168.85.20 -udp_port 20000 -payload_type 112 -fps 25 -pix_fmt yuv422p10le -video_size 1920x1080 -f mtl_st20p -i "k" -f rawvideo /dev/null -y

Output

rame=  117 fps= 22 q=-0.0 size=  939520kB time=00:00:04.68 bitrate=1644561.        [mtl_st20p @ 0x60c475184c40] mtl_st20p_read_packet(0), st20p_rx_get_frame ti        meout
[in#0/mtl_st20p @ 0x60c475184b00] Error during demuxing: Input/output error
[in#0/mtl_st20p @ 0x60c475184b00] Error retrieving a packet from demuxer: In        put/output error
frame=  121 fps= 19 q=-0.0 size=  971776kB time=00:00:04.80 bitrate=1658497.        [out#0/rawvideo @ 0x60c475153ac0] video:988200kB audio:0kB subtitle:0kB othe        r streams:0kB global headers:0kB muxing overhead: 0.000000%
frame=  122 fps= 19 q=-0.0 Lsize=  988200kB time=00:00:04.84 bitrate=1672589        .8kbits/s speed=0.755x
MTL: 2024-05-20 10:30:02, st20p_rx_free(0), start
MTL: 2024-05-20 10:30:02, st20_rx_free(0,0), start
MTL: 2024-05-20 10:30:02, RX_VIDEO_SESSION(0,0:st20p_rx_ffmpeg): fps 16.3982        93 frames 122 pkts 504703
MTL: 2024-05-20 10:30:02, RX_VIDEO_SESSION(0,0): throughput 534.291256 Mb/s,         cpu busy 0.478655
MTL: 2024-05-20 10:30:02, RX_VIDEO_SESSION(0,0): succ burst max 128, avg 1.0        00420
MTL: 2024-05-20 10:30:02, mt_mcast_leave(0), delete group 1455a8ef
MTL: 2024-05-20 10:30:02, mcast_membership_report_on_action(0), send leave p        kt, mb_report_len 16
MTL: 2024-05-20 10:30:02, mt_mcast_leave(0), leave group 239.168.85.20
MTL: 2024-05-20 10:30:02, mt_dev_put_rx_queue(0), q 1
MTL: 2024-05-20 10:30:02, sch_free_quota(0), quota 1620 total now 0
MTL: 2024-05-20 10:30:02, st20_rx_free, succ on sch 0 session 0
MTL: 2024-05-20 10:30:02, st20p_rx_free(0), succ
[mtl_st20p @ 0x60c475184c40] mtl_instance_put, handle 0x3200b81140 ref cnt 0
[mtl_st20p @ 0x60c475184c40] mtl_instance_put, ref cnt reach zero, uninit mt        l device
MTL: 2024-05-20 10:30:02, sch_tasklet_func(0), end with 2 tasklets
MTL: 2024-05-20 10:30:02, cni_traffic_thread, start
MTL: 2024-05-20 10:30:03, mt_sch_put_lcore, succ on shm lcore 37 for lib_sch
MTL: 2024-05-20 10:30:03, sch_stop(0), succ
MTL: 2024-05-20 10:30:03, mt_sch_stop_all, succ
MTL: 2024-05-20 10:30:03, _mt_stop, succ
MTL: 2024-05-20 10:30:03, admin_thread, stop
MTL: 2024-05-20 10:30:03, mtl_sch_unregister_tasklet(0), tasklet cni(0) unre        gistered
MTL: 2024-05-20 10:30:03, cni_traffic_thread, stop
MTL: 2024-05-20 10:30:03, mt_dev_put_rx_queue(0), q 0
MTL: 2024-05-20 10:30:03, mt_cni_uinit, succ
MTL: 2024-05-20 10:30:03, sch_free_quota(0), quota 0 total now 0
MTL: 2024-05-20 10:30:03, mt_sch_put(0), ref_cnt now zero
MTL: 2024-05-20 10:30:03, Warn: sch_stop(0), not started
MTL: 2024-05-20 10:30:03, mtl_sch_unregister_tasklet(0), tasklet rvs_pkt_rx(        1) unregistered
MTL: 2024-05-20 10:30:03, rvs_mgr_uinit(0), succ
MTL: 2024-05-20 10:30:03, sch_free(0), start to free sch: sch_0
MTL: 2024-05-20 10:30:03, mt_dpdk_flush_tx_queue(0), queue 0 burst_pkts 1024
MTL: 2024-05-20 10:30:03, mt_dev_put_tx_queue(0), q 0
MTL: 2024-05-20 10:30:03, sch_lcore_shm_uinit, remove shared memory as we ar        e the last user
MTL: 2024-05-20 10:30:03, dev_stop_port(0), succ
MTL: 2024-05-20 10:30:03, mt_dev_free, succ
MTL: 2024-05-20 10:30:03, mt_main_free, succ
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q0_MBUF_1
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q1_MBUF_2
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q2_MBUF_3
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q3_MBUF_4
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q4_MBUF_5
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q5_MBUF_6
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q6_MBUF_7
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q7_MBUF_8
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool R_P0Q8_MBUF_9
MTL: 2024-05-20 10:30:03, mt_mempool_free, free mempool T_P0_SYS_0
MTL: 2024-05-20 10:30:03, dev_close_port(0), succ
MTL: 2024-05-20 10:30:03, stat_thread, stop
MTL: 2024-05-20 10:30:03, mt_dev_uinit, succ
MTL: 2024-05-20 10:30:03, mtl_uninit, succ
[mtl_st20p @ 0x60c475184c40] mtl_st20p_read_close(0), frame_counter 122
frankdjx commented 2 months ago

It appears that we must wait for the last frame to complete its transmission processing before we can close all resources. Thank you; let's work on a solution for this issue.

By the way, you mentioned that 3 frames were dropped on the RX side. Could you specify which frames, other than the last one, were affected also?

PanKaker commented 2 months ago

As a proposition, maybe it's possible to implement in API of IMTL/DPDK a func that waits for all packets to be bursted/flushed and then we can close/free the resources. It will be really helpful.

About 3 frames: After receiving a video in RX side, we saved it to the file and broke by frames with ffmpeg, it appeared that only last 3 frames (123,124,125) were missing. I will try to run this again to confirm it and will back to you.

UPD: Yes, I can confirm losing the last 3 frames from the sequence.

frankdjx commented 2 months ago

Can you rebase the IMTL code to latest? https://github.com/OpenVisualCloud/Media-Transport-Library/pull/871 should fix this problem.

PanKaker commented 2 months ago

Hi! Thank you for the fast fix! We will try to test it and I will reach out to you!

P.S. I think the problem also occurs in st22 proto/pipeline

PanKaker commented 2 months ago

Hi! We've tried to run st20 again with the last fix and we can confirm that all frames were delivered correctly.

frankdjx commented 1 month ago

st22p and st30p is fixed also. Close this. Open a new one if it has any other issue.