CCExtractor / ccextractor

CCExtractor - Official version maintained by the core team
https://www.ccextractor.org
GNU General Public License v2.0

CCExtractor won't extract subtitles from TS with no PAT/PMT #805

Open cfsmp3 opened 6 years ago

cfsmp3 commented 6 years ago

[forwarded from a user email, reposted with permission]

The company Channel Master has produced a video recorder named DVR+. It records over-the-air television broadcasts in the USA. It has some internet channel capability, but does not record cable or satellite broadcasts. It is gaining in popularity in the U.S. market.

It records a single program, demuxing it from a multi-program channel stream and saving it as a *.ts file. The recording appears to be complete for the audio and video streams. However, it does not record a PAT/PMT in the file. MediaInfo indicates that the closed captions are present in the video stream. The DVR+ does display closed captions upon file playback. However, it is uncertain whether these are digitally encoded CEA-608 captions or CEA-708 captions. Indications are that it is displaying the CEA-708 style and calling them digital CC1 through CC4 in the system menu.
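[Editor's sketch, not part of the original email] One way to confirm that a recording carries no PAT is to count the packets per PID and check whether PID 0x0000, where the PAT is always carried, ever appears. The snippet below is a minimal sketch, assuming plain 188-byte transport packets aligned at the start of the file; the function names are illustrative and are not part of ccextractor.

```python
from collections import Counter

TS_PACKET_SIZE = 188   # standard MPEG-TS packet length
SYNC_BYTE = 0x47       # first byte of every transport packet
PAT_PID = 0x0000       # the PAT is always carried on PID 0


def count_pids(data: bytes) -> Counter:
    """Count packets per PID, assuming packets are aligned at offset 0."""
    counts = Counter()
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            continue  # skip anything that is not a valid packet start
        # The PID is 13 bits: low 5 bits of byte 1, then all of byte 2.
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        counts[pid] += 1
    return counts


def has_pat(data: bytes) -> bool:
    """True if at least one packet on the PAT PID is present."""
    return count_pids(data)[PAT_PID] > 0
```

For example, `has_pat(open("dracula.ts", "rb").read())` would be expected to return False for an original DVR+ recording if the description above is accurate. For large recordings, reading only the first few megabytes should be enough, since a PAT is normally repeated many times per second.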

I am using ccextractor version 0.85, compiled on Linux using the Makefile without the OCR option. It successfully extracts the captions from your website sample files. However, it does not extract any closed captions from the DVR+ recordings. I have tried multiple combinations of command-line switches, including -datapid and -haup as well as others, without success. I used the Avidemux video editing software to copy both the video and audio streams to a new *.ts file. Avidemux changes the stream ID numbers and adds a PAT/PMT to the file. It does not appear to make any other changes to the file. Now, when using ccextractor with the -svc switch, I obtain both CEA-608 and CEA-708 style captions. It has been reported on the internet that the video editing program VideoPro gives the same results.

I have attached three original DVR+ recordings and the Avidemux modified versions [copied here]:

https://drive.google.com/drive/folders/1RxXtp8gBiRfOuCysy9A1wTKeYsST-Bgs

As you can see from the above examples, the television and ccextractor appear to agree on the caption streams. The DVR+ is recording all caption types but appears to display only the CEA-708 captions, calling them digital CC1-4. MediaInfo reports multiple caption streams in the *.ts files. These may be real streams for which the broadcast sends only an empty placeholder, or they may simply be an artifact of how MediaInfo interprets the streams.

What I would like to know is why ccextractor does not work with the original files. Does it need a PAT/PMT? If it does, should the -datapid switch, when given the PID of the video stream, not bypass the PAT/PMT and access the captions directly? Can ccextractor be modified to work with the original files, or do we need to provide a technique to add a PAT/PMT to the file with an external program?
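[Editor's sketch, not part of the original email] Without a PMT, the PID of the video stream has to be found by inspecting the packets themselves. The snippet below is a rough sketch of one approach: look at packets that start a new PES packet and read the PES stream_id, where 0xE0 through 0xEF denotes video. It assumes aligned 188-byte packets, and the function names are hypothetical. Whether ccextractor then accepts that PID through -datapid is the open question of this report.

```python
from typing import Dict, List

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
PES_START_CODE = b"\x00\x00\x01"


def pes_stream_ids(data: bytes) -> Dict[int, int]:
    """Map each PID to the first PES stream_id seen on it."""
    found: Dict[int, int] = {}
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            continue
        if not pkt[1] & 0x40:
            continue  # payload_unit_start_indicator not set
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        afc = (pkt[3] >> 4) & 0x3  # adaptation_field_control
        if afc in (0, 2):
            continue  # reserved value, or adaptation field only
        start = 4
        if afc == 3:
            start += 1 + pkt[4]  # skip length byte plus adaptation field
        payload = pkt[start:]
        if len(payload) >= 4 and payload[:3] == PES_START_CODE:
            found.setdefault(pid, payload[3])
    return found


def video_pids(data: bytes) -> List[int]:
    """PIDs whose PES stream_id is in the video range 0xE0-0xEF."""
    ids = pes_stream_ids(data)
    return sorted(pid for pid, sid in ids.items() if 0xE0 <= sid <= 0xEF)
```

Calling `video_pids(open("dracula.ts", "rb").read())` would list candidate PIDs to try with -datapid. Audio would normally show up with stream_id 0xC0 through 0xDF, or 0xBD for private streams.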

If you want better information on this equipment or files, there are branches for the DVR+ and software development by users on the AVS Forum website. Search the internet for DVR+ Lister which will give you a forum that is monitored by several knowledgeable people that are developing software to interact with the DVR+ and save the stored programs.

cfsmp3 commented 6 years ago

Related: https://github.com/CCExtractor/ccextractor/issues/713

Adityak9 commented 5 years ago

@cfsmp3 I am able to extract subtitles from the dracula.ts and stanley.ts files linked above, but with mismatched timings. I haven't tried football.ts. Built using cmake with OCR, hardsubx and ffmpeg enabled.

[Screenshot from 2019-02-09 10-57-03]

cfsmp3 commented 5 years ago

What about without ffmpeg?

Adityak9 commented 5 years ago

Not working without ffmpeg. No captions found in input.

cfsmp3 commented 5 years ago

OK, so the job is clear :-)
