szatmary / libcaption

Free open-source CEA608 / CEA708 closed-caption encoder/decoder
MIT License

Restreaming RTMP to RTMP with captions #67

Open ramiel opened 1 year ago

ramiel commented 1 year ago

This is not an issue; I'm just looking for help. Sorry if this is not the correct channel. I'm trying to take an input video coming from an RTMP endpoint and restream it to another endpoint after adding captions from an SRT file. I'm using a command like this one:

ffmpeg -i rtmp://localhost:1935/live/test -codec copy -f flv - | ./flv+srt - ../../sub.srt - | ffmpeg -i - -codec copy -y -f flv rtmp://a.rtmp.youtube.com/live2/aabbcc

The restreaming works fine if I omit the flv+srt part, but it fails if I use the command above. The RTMP endpoint never receives the stream, and this is printed in the console:

Loaded new SRT at time 0.000000=       1kB time=00:00:00.00 bitrate=N/A speed=   0x    
T: 0.07: [CAPTIONS CLEARED]
T: 5.40 (10.00s):41 q=-1.0 size=    1582kB time=00:00:04.86 bitrate=2665.4kbits/s speed=1.36x    
This is an example of
a subtitle.

T: 15.53: [CAPTIONS CLEARED]ize=    4917kB time=00:00:15.08 bitrate=2670.7kbits/s speed=1.09x    
[flv @ 0x7face391d140] Packet mismatch 216678557 6386798 6386798ate=2668.7kbits/s speed=1.07x    
T: 20.40 (99.00s):2 q=-1.0 size=    6607kB time=00:00:20.18 bitrate=2682.2kbits/s speed=1.07x    
This is an example of
a subtitle - 2nd subtitle.
[flv @ 0x7face391d140] Packet mismatch -527437488 5557771 11944569e=2697.8kbits/s speed=1.04x    
[NULL @ 0x7facc2f05f40] missing picture in access unit with size 10178
[extract_extradata @ 0x7facd3006680] No start code is found.
pipe:: could not find codec parameters
Input #0, flv, from 'pipe:':
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream #0:0: Audio: aac, 44100 Hz, stereo
  Stream #0:1: Video: h264, none, 1k tbn
[flv @ 0x7facd3006800] dimensions not setB time=00:00:35.98 bitrate=2694.2kbits/s speed=1.04x    
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:1 -- 
Stream mapping:
  Stream #0:1 -> #0:0 (copy)
  Stream #0:0 -> #0:1 (copy)
    Last message repeated 1 times
av_interleaved_write_frame(): Broken pipe
[flv @ 0x7fcda1304c40] Failed to update header with correct duration.
[flv @ 0x7fcda1304c40] Failed to update header with correct filesize.
Error writing trailer of pipe:: Broken pipe
frame= 1083 fps= 31 q=-1.0 Lsize=   11837kB time=00:00:36.03 bitrate=2691.3kbits/s speed=1.03x    
video:11093kB audio:704kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.343468%
Error closing file pipe:: Broken pipe
Conversion failed!

Do you have any idea about what the cause can be?

ramiel commented 1 year ago

I'm realising this may be a duplicate of #55, but that issue has no answer yet.

ErikBahena commented 5 days ago

Hey @ramiel did you ever come across a solution?

ramiel commented 5 days ago

Not really. I looked elsewhere but without any luck. What about you?

ErikBahena commented 5 days ago

@ramiel I had a bit of success using gst to produce a video. The insert times are all off, but there is something about this profile that flv+srt can work with:

gst-launch-1.0 videotestsrc pattern=ball num-buffers=300 ! \
    videoscale ! \
    video/x-raw, width=1920, height=1080 ! \
    videoconvert ! \
    x264enc tune=zerolatency bitrate=3000 key-int-max=30 ! \
    flvmux streamable=true ! \
    filesink location=output_10s.flv

Video profile of output_10s.flv:

erikbahena@IMMM tests % mediainfo output_10s.flv    
General
Complete name                            : output_10s.flv
Format                                   : Flash Video
File size                                : 205 KiB
Duration                                 : 9 s 999 ms
Overall bit rate                         : 168 kb/s
Frame rate                               : 30.000 FPS
Encoded date                             : 2024-11-22 00:04:56
Writing application                      : GStreamer 1.24.9 FLV muxer
Tagging application                      : GStreamer 1.24.9 FLV muxer
AspectRatioX                             : 1.000
AspectRatioY                             : 1.000

Video
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High 4:4:4 Predictive@L4
Format settings                          : CABAC / 3 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 3 frames
Format settings, GOP                     : M=1, N=30
Format settings, Slice count             : 8 slices per frame
Codec ID                                 : 7
Duration                                 : 9 s 999 ms
Bit rate                                 : 3 000 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 30.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:4:4
Bit depth                                : 10 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.048
Stream size                              : 3.58 MiB
Writing library                          : x264 core 164 r3108 31e19f9
Encoding settings                        : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=4 / threads=8 / lookahead_threads=8 / sliced_threads=1 / slices=8 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=0 / weightp=2 / keyint=30 / keyint_min=3 / scenecut=40 / intra_refresh=0 / rc_lookahead=0 / rc=cbr / mbtree=0 / bitrate=3000 / ratetol=1.0 / qcomp=0.60 / qpmin=0 / qpmax=81 / qpstep=4 / vbv_maxrate=3000 / vbv_bufsize=1800 / nal_hrd=none / filler=0 / ip_ratio=1.40 / aq=1:1.00
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709

Then I added some logs and changes to the flv+srt file:

#include "flv.h"
#include "mpeg.h"
#include "srt.h"

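#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/select.h> // select(), fd_set, FD_ZERO, FD_SET, FD_ISSET
#include <sys/types.h>  // ssize_t
#include <unistd.h>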

#define MAX_SRT_SIZE (1 * 1024 * 1024)
#define CLEAR_TIMESTAMP_TOLERANCE 0.01

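// Non-blocking read: returns bytes read, 0 if no data is ready (or on EOF, with *eof set),
// or a negative value on error. The return type is signed so errors are not lost.
ssize_t fd_read(int fd, uint8_t* data, size_t size, int* eof) {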
    fd_set rfds;
    struct timeval tv;
    int retval;

    (*eof) = 0;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = 0;
    tv.tv_usec = 1;
    retval = select(fd + 1, &rfds, NULL, NULL, &tv);

    if (retval < 0) {
        return retval;
    }

    // Not ready
    if (!(retval && FD_ISSET(fd, &rfds))) {
        return 0;
    }

    retval = read(fd, data, size);

    if (retval == 0) {
        (*eof) = 1;
    }

    return retval;
}

size_t g_srt_size = 0;
utf8_char_t g_srt_data[MAX_SRT_SIZE];
srt_t* srt_from_fd(int fd) {
    int eof;
    uint8_t c;

    while (1) {
        int ret = fd_read(fd, &c, 1, &eof);

        if (eof || (ret == 1 && c == 0)) {
            srt_t* srt = srt_parse(&g_srt_data[0], g_srt_size);
            g_srt_size = 0;
            return srt;
        }

        if (ret == 1) {
            if (g_srt_size >= MAX_SRT_SIZE - 1) {
                fprintf(stderr, "Warning: MAX_SRT_SIZE reached. Clearing buffer.\n");
                g_srt_size = 0;
            }

            g_srt_data[g_srt_size] = c;
            g_srt_size += 1;
        } else {
            return NULL;
        }
    }
}

int main(int argc, char** argv) {
    flvtag_t tag;
    srt_t* old_srt = NULL;
    srt_cue_t* next_cue = NULL;
    double timestamp, offset = 0, clear_timestamp = -1;
    int has_audio, has_video;

    // Require three arguments before touching argv[1..3]
    if (argc < 4) {
        fprintf(stderr, "Usage: %s <input.flv> <captions.srt> <output.flv>\n", argv[0]);
        return EXIT_FAILURE;
    }

    FILE* flv = flv_open_read(argv[1]);
    int fd = open(argv[2], O_RDWR);
    FILE* out = flv_open_write(argv[3]);

    if (!flv || fd < 0 || !out) {
        fprintf(stderr, "Error: Unable to open files.\n");
        return EXIT_FAILURE;
    }

    flvtag_init(&tag);

    if (!flv_read_header(flv, &has_audio, &has_video)) {
        fprintf(stderr, "%s is not a valid FLV file.\n", argv[1]);
        return EXIT_FAILURE;
    }

    flv_write_header(out, has_audio, has_video);

    fprintf(stderr, "Reading FLV from %s\n", argv[1]);
    fprintf(stderr, "Reading captions from %s\n", argv[2]);
    fprintf(stderr, "Writing FLV to %s\n", argv[3]);

    while (flv_read_tag(flv, &tag)) {
        timestamp = flvtag_pts_seconds(&tag);
        fprintf(stderr, "Processing tag at timestamp %.02f seconds\n", timestamp);

        // Check for new SRT file updates
        srt_t* cur_srt = srt_from_fd(fd);
        if (cur_srt) {
            fprintf(stderr, "Loaded new SRT at time %.02f seconds\n", timestamp);
            if (old_srt) {
                srt_free(old_srt);
            }
            old_srt = cur_srt;
            offset = timestamp;
            clear_timestamp = -1; // Reset clearing timestamp
            next_cue = cur_srt->cue_head;
        }

        // Handle video tags with NALU packets
        if (flvtag_avcpackettype_nalu == flvtag_avcpackettype(&tag)) {
            fprintf(stderr, "NALU packet detected at %.02f seconds\n", timestamp);

            // Add a caption if it's time
            if (next_cue && (offset + next_cue->timestamp) <= timestamp) {
                fprintf(stderr, "Adding caption at %.02f (%.02fs duration): %s\n",
                        (offset + next_cue->timestamp), next_cue->duration, srt_cue_data(next_cue));
                clear_timestamp = (offset + next_cue->timestamp) + next_cue->duration;
                flvtag_addcaption_text(&tag, srt_cue_data(next_cue));
                next_cue = next_cue->next;
            }
            // Clear captions if needed
            else if (clear_timestamp > 0 && (clear_timestamp - timestamp) <= CLEAR_TIMESTAMP_TOLERANCE) {
                fprintf(stderr, "Clearing captions at %.02f seconds\n", timestamp);
                flvtag_addcaption_text(&tag, NULL);
                clear_timestamp = -1;
            }
        }

        // Log keyframes (diagnostic only; no caption is added or moved here)
        if (flvtag_frametype_keyframe == flvtag_frametype(&tag)) {
            fprintf(stderr, "Keyframe detected at %.02f seconds\n", timestamp);
        }

        // Write the processed tag to the output
        fprintf(stderr, "Writing tag at timestamp %.02f seconds\n", timestamp);
        if (!flv_write_tag(out, &tag)) {
            fprintf(stderr, "Error writing tag at timestamp %.02f seconds.\n", timestamp);
        }
    }

    // Cleanup
    fprintf(stderr, "Finished processing FLV file.\n");
    if (old_srt) {
        srt_free(old_srt);
    }
    flvtag_free(&tag);
    flv_close(flv);
    flv_close(out);

    return EXIT_SUCCESS;
}
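
Since the insert times are reported as off, it may help to isolate the timing decision from the loop above so it can be checked without any FLV or SRT input. The following is only a sketch of that decision under stated assumptions: `cue_timing_t`, `caption_action_t`, and `caption_decide` are hypothetical names introduced here for illustration and are not part of libcaption. It mirrors the two comparisons in the loop: a cue is shown once `offset + cue start <= timestamp`, and the caption is cleared once the frame timestamp comes within the tolerance of the clear time.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cue; NOT libcaption's srt_cue_t. */
typedef struct {
    double start;    /* cue start in seconds, relative to when the SRT was loaded */
    double duration; /* cue duration in seconds */
} cue_timing_t;

typedef enum { CAPTION_NONE, CAPTION_SHOW, CAPTION_CLEAR } caption_action_t;

/*
 * Decide what to do for a video frame at `timestamp` (seconds).
 *   cue       - next pending cue, or NULL if none is left
 *   offset    - stream time at which the SRT was loaded
 *   tolerance - how close to the clear time a frame must be to trigger a clear
 *   clear_at  - in/out: time at which the current caption should be cleared,
 *               or -1 when nothing is scheduled
 */
static caption_action_t caption_decide(const cue_timing_t* cue, double offset,
                                       double timestamp, double tolerance,
                                       double* clear_at)
{
    /* A pending cue is due: show it and schedule its clear time. */
    if (cue && offset + cue->start <= timestamp) {
        *clear_at = offset + cue->start + cue->duration;
        return CAPTION_SHOW;
    }

    /* No cue due, but a clear is scheduled and this frame has reached it. */
    if (*clear_at > 0 && (*clear_at - timestamp) <= tolerance) {
        *clear_at = -1;
        return CAPTION_CLEAR;
    }

    return CAPTION_NONE;
}
```

Two things become easier to see in this form. Cue times are measured from `offset`, which the loop sets to the timestamp of the tag being processed when the SRT is loaded, so a stream whose first tags already carry a large timestamp shifts every cue by that amount. Also, because the clear check is in the `else` branch, a frame that shows a new cue never clears the previous one in the same step.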