obsproject / obs-studio

OBS Studio - Free and open source software for live streaming and screen recording
https://obsproject.com
GNU General Public License v2.0

SendCaptions not working #5191

Open alexfrench opened 3 years ago

alexfrench commented 3 years ago

Operating System Info

Windows 10

Other OS

No response

OBS Studio Version

27.0.1

OBS Studio Version (Other)

No response

OBS Studio Log URL

https://obsproject.com/logs/oeSC2zsH3QUec5sK

OBS Studio Crash Log URL

No response

Expected Behavior

I am using websockets to send captions. Websockets uses SendCaptions to send the caption.

The caption builds one word at a time.

I have set the system to send captions at long intervals for test purposes.

I would expect the captions to appear on the video correctly.

Current Behavior

Captions should appear in video output. After a few seconds the output fails, as follows:

Sending the following text works: This is an automated test caption from your captione

Sending the following text works: This is an automated test caption from your captione

Sending the following text does not work: This is an automated test caption from your captioner

etc

The problem appears to be linked to the number of characters in the line and is fully reproducible. I have tested with YouTube and Facebook.

Steps to Reproduce

  1. Using SendCaptions, sending the following text works: This is an automated test caption from your captione

  2. Sending the following text does not work: This is an automated test caption from your captioner
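For reference, a quick way to compare the two test strings. This is only a minimal sketch that measures the strings; it makes no claim about where OBS or the streaming services impose a limit, and the variable names are just for illustration.

```python
# The two captions from the steps above, copied verbatim.
working = "This is an automated test caption from your captione"
failing = "This is an automated test caption from your captioner"

# The strings differ only by the final "r", i.e. by one character.
print(len(working), len(failing))
```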

Anything else we should know?

No response

Fenrirthviti commented 3 years ago

Did you mean to open this against the obs-websockets repo? It's currently a third party plugin that is not part of the main project.

willlllllio commented 3 years ago

OBS CEA captions being pretty broken on Youtube and FB is a known issue, was also discussed in the discord a while ago https://github.com/obsproject/obs-studio/issues/4006

So this should be unrelated to the OBS websockets plugin specifically, which basically just wraps the OBS captions function; it is a duplicate of the OBS issue above.

alexfrench commented 3 years ago

Thanks. Yes, I noted that I was using websockets as a method of reproducing the issue. There's nothing in websockets that I could see that would be causing the problem. Is anyone working on fixing this issue?

alexfrench commented 3 years ago

To clarify - I opened the issue here because it is a problem with obs-studio not the websockets plugin. If this is not being worked on, I would be happy to look at it.

WizardCM commented 3 years ago

There currently isn't anyone actively looking at caption issues in OBS as far as I know. If you have the drive, please do!

VodBox commented 3 years ago

Looking in more detail at captions is in my list, but that won't be for a while. If you decide to look at it, you won't be stepping on anyone's toes.

tvadi commented 2 years ago

Hi, I'm trying to get good captions out of OBS, as reported here: https://github.com/obsproject/obs-studio/issues/4006 and here: https://github.com/ratwithacompiler/OBS-captions-plugin/issues/16. I'm wondering if it is an issue with the captions plugin or with OBS/libcaption, as suggested here: https://github.com/ratwithacompiler/OBS-captions-plugin/issues/51. CEA-compliant captions are pretty crucial; is there any way to get good captions out of OBS? I have ffmpeg built with libklvanc and can load the SRT stream, but it continually gives this error: Illegal cc_count received.

tvadi commented 2 years ago

Hi, looks like the fix could be FIFO and vf_ccrepack ?

https://github.com/obsproject/obs-studio/issues/4006

tvadi commented 2 years ago

Copied from emails looking into this problem, in case anyone is interested. Thanks to Devon at libklvanc.

With a 50 second SRT raw TS file:

  1. Only 105 out of the 2802 frames of video contained caption data. For 720p59, every frame is supposed to contain 10 tuples of data (i.e. a cc_count of 10, translating to 30 bytes of data).

  2. In the frames that did contain caption data, the cc_count was a very large value (on the order of 53), and there appears to have been no effort to pack only one 608 tuple per packet.

  3. There appear to be no A53 padding bytes in the stream at all.

He further estimates that the implementation is not doing any rate control. The result is that the caption writer receives a sentence's worth of text, generates a series of 608 pairs, and then just inserts the entire series into the next available frame. The correct behavior would be to queue out the 608 pairs one per frame, adding padding as needed to reach the expected number of bytes per frame.
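The queueing behavior described above could be sketched roughly as follows. This is an illustration only, not OBS or libcaption code: `pace_608_pairs` is a hypothetical helper, it assumes 59.94 fps video (cc_count of 10), and it assumes all caption pairs belong to field 1, with field 2 frames carrying a null pair (80,80) as in the idle frames of the broadcast dump further down.

```python
from collections import deque

PADDING = bytes([0xFA, 0x00, 0x00])  # filler tuple, as seen in the broadcast dump
NULL_PAIR = bytes([0x80, 0x80])      # 608 null pair (0x00 with odd parity)


def pace_608_pairs(pairs, cc_count=10):
    """Yield one cc_data payload per frame, with at most one 608 pair each.

    `pairs` is an iterable of 2-byte, already parity-encoded 608 pairs
    (assumed to come from a 608 encoder such as libcaption).
    """
    queue = deque(pairs)
    frame = 0
    while queue:
        if frame % 2 == 0:
            # Even frames carry the next queued field-1 pair (0xfc).
            cc_tuple = bytes([0xFC]) + queue.popleft()
        else:
            # Odd frames carry field 2 (0xfd), which is idle in this sketch.
            cc_tuple = bytes([0xFD]) + NULL_PAIR
        # Pad so that every frame holds exactly cc_count three-byte tuples.
        yield cc_tuple + PADDING * (cc_count - 1)
        frame += 1


frames = list(pace_608_pairs([b"\x94\x20", b"\x94\x20", b"\xc8\xe9"]))
```

Each yielded payload is 30 bytes, so a sentence's worth of pairs is spread over many frames instead of landing in a single one.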

To summarize the spec- for 720p59 video you should have 30 bytes of data per frame (i.e. a cc_count of 10), and each frame should contain exactly one 608 tuple (alternating between CC1/CC3 and CC2/CC4 on each frame). For 1080i59 it's 60 bytes per frame (cc_count=20), and each frame should contain exactly two 608 tuples (i.e. each frame carries both the CC1/CC3 tuple and the CC2/CC4 tuple).

The MPV cc parser is pretty forgiving, as it just extracts the bytes as they arrive and feeds them to be rendered during playback. But both VLC and ffmpeg fail to detect the presence of captions at all (probably because those apps properly expect them to be on every frame and they aren't). And definitely any broadcast quality hardware decoder is not going to play this content.

Here is more info on the cc_count and padding. It's pretty interesting; I never dug into it before. The cc_count field dictates the number of three-byte tuples that are found in the frame. The framerate determines the appropriate cc_count (i.e. 20 for 29.97 and 10 for 59.94 FPS).

That said, it's not permitted to use all of the tuples within a given frame for CEA-608 packets. The standard only permits you to insert a fixed number per frame (2 CEA-608 tuples per frame for 29.97 and 1 CEA-608 tuple per frame for 59.94). The rest of the space is reserved for CEA-708, or for padding tuples if needed. Given you don't have any CEA-708 caption data, you should expect the following:

For 59.94 FPS, each frame should contain one CEA-608 tuple and nine padding tuples. For 29.97 FPS, each frame should contain two CEA-608 tuples and eighteen padding tuples.
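The per-frame arithmetic can be written out as a small helper. This is a sketch using only the two frame rates and the cc_count values quoted in this thread; `cc_budget` is a hypothetical name, and other frame rates are deliberately rejected rather than guessed.

```python
def cc_budget(fps):
    """Return (cc_count, cea608_tuples, padding_tuples, bytes_per_frame).

    Assumes 608-only captions, so all remaining tuples are padding.
    """
    nominal = round(fps)
    if nominal == 60:      # 59.94 fps, e.g. 720p59
        cc_count, cea608 = 10, 1
    elif nominal == 30:    # 29.97 fps, e.g. 1080i59
        cc_count, cea608 = 20, 2
    else:
        raise ValueError(f"frame rate {fps} is not covered by this sketch")
    padding = cc_count - cea608
    return cc_count, cea608, padding, cc_count * 3  # 3 bytes per tuple


print(cc_budget(59.94))
print(cc_budget(29.97))
```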

Here's a quick example that I dumped out of a 720p59 broadcast feed I have here:

fc,80,80,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00

fd,80,80,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00

fc,ce,cb,ff,03,22,fe,4c,45,fe,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00

fd,4a,45,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00

fc,54,c8,fd,80,80,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00

Note that it alternates between 0xfc and 0xfd with each frame (0xfc is CC1/CC3, 0xfd is CC2/CC4), and that there is never more than a single 608 packet within a given frame. Also note that the cc_count is always 10 even if there is no caption data to be rendered at a given moment. Padding was inserted as needed to ensure the cc_count is constant (those are the 0xfa,0x00,0x00 tuples).

Looking at the example he gives, it seems each line is a frame, and each frame must have exactly 10 tuples. In an idle frame that is one 608 tuple (e.g. fd,80,80) and nine padding tuples (fa,00,00). Only the fc and fd tuples are 608 caption data; the ff and fe tuples are 708 data. So the following line seems to hold one CEA-608 tuple, three CEA-708 tuples (ff,03,22, fe,4c,45 and fe,00,00) and six padding tuples:

fc,ce,cb,ff,03,22,fe,4c,45,fe,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00

It sounds like libcaption is working and the problem is in OBS. Here is what he says on that:

I think libcaption is doing its job: given a string of text, it's producing a series of CEA-608 byte pairs that are the converted output. It's the responsibility of OBS to insert those byte pairs into the MPEG-TS stream at the proper rate and do any padding necessary.
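For anyone who wants to check a dump like the one above, here is a minimal sketch of a tuple classifier. It assumes the cc_data layout from ATSC A/53, where the first byte of each tuple holds five marker bits, a cc_valid bit and two cc_type bits; `classify_tuples` and the label strings are hypothetical names chosen for this example.

```python
from collections import Counter

CC_TYPES = {
    0: "608 field 1",  # 0xfc when valid
    1: "608 field 2",  # 0xfd when valid
    2: "708 data",     # 0xfe when valid
    3: "708 start",    # 0xff when valid
}


def classify_tuples(line):
    """Label each three-byte tuple in a comma-separated hex dump line."""
    data = bytes(int(part, 16) for part in line.split(","))
    labels = []
    for i in range(0, len(data), 3):
        head = data[i]
        cc_valid = bool(head & 0x04)  # bit 2 of the first byte
        cc_type = head & 0x03         # low two bits of the first byte
        labels.append(CC_TYPES[cc_type] if cc_valid else "padding")
    return labels


line = ("fc,ce,cb,ff,03,22,fe,4c,45,fe,00,00,fa,00,00,"
        "fa,00,00,fa,00,00,fa,00,00,fa,00,00,fa,00,00")
counts = Counter(classify_tuples(line))
print(counts)
```

Under that reading of the bits, fa,00,00 counts as padding because its cc_valid bit is clear, while fe,00,00 counts as 708 data because its cc_valid bit is set.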