CCExtractor / ccextractor

CCExtractor - Official version maintained by the core team
https://www.ccextractor.org
GNU General Public License v2.0

[BUG] Odd output when inputting an MKV file #1371

Open Southpaw1496 opened 3 years ago

Southpaw1496 commented 3 years ago

CCExtractor version:

CCExtractor detailed version info
    Version: 0.91
    Git commit: Unknown
    Compilation date: 2021-07-26
    File SHA256: Could not open file
Libraries used by CCExtractor
    Tesseract Version: 4.1.1
    Leptonica Version: leptonica-1.81.1
    libGPAC Version: 1.0.1
    zlib: 1.2.11
    utf8proc Version: 2.6.1
    protobuf-c Version: 1.4.0
    libpng Version: 1.6.37
    FreeType 
    libhash
    nuklear
    libzvbi

Necessary information

Video links

Additional information

The MKV was ripped from a disc using MakeMKV. I've included the output of running ccextractor [filename] in the attached Output.zip file. Please let me know if any more information is required.

sheharyaar commented 3 years ago

Hi @Southpaw1496, I would like to look into the issue. Could you please send the password for the archive to sheharyaar48@gmail.com?

Southpaw1496 commented 2 years ago

Here is another file, a Bugs Bunny short, unencrypted this time: https://drive.google.com/file/d/1cmntXqJFZGRdoNGLljPBYqFBgeckJlO7/view?usp=sharing

Output for this file: Output-bugs.zip

canihavesomecoffee commented 2 years ago

Bugs file has been made available here: https://sampleplatform.ccextractor.org/sample/178

Logs from output zip above:

CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: bugs.mkv
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: bugs.mkv
File seems to be a Matroska/WebM container
Analyzing data in Matroska mode

Document type: matroska
Timecode scale: 1000000
Muxing app: libmakemkv v1.16.4 (1.3.10/1.5.2) darwin(x64-release)
Writing app: MakeMKV v1.16.4 darwin(x64-release)

Track entry:
    Track number: 1
    UID: 1
    Type: video
    Codec ID: V_MPEG2

Track entry:
    Track number: 2
    UID: 2
    Type: audio
    Codec ID: A_AC3
    Language: eng
    Name: Stereo

Track entry:
    Track number: 3
    UID: 3
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng

Track entry:
    Track number: 4
    UID: 4
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng
 99%  |  06:50
100%  |  06:50
Output file: bugs_eng.(null)
Output file: bugs_eng_1.(null)

Found no AVC track. 

Total frames time:    00:00:00:000  (0 frames at 29.97fps)
Done, processing time = 1 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

It creates an empty .srt and two files for the VOBSUB tracks (albeit with a "(null)" extension?), but neither of those files has any content.
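For anyone triaging similar reports, the "Track entry:" blocks in the log already show which subtitle codecs are involved. Below is a rough Python sketch that pulls them out of a saved log. The function name `parse_tracks` and the `SAMPLE` text are made up for illustration and are not part of CCExtractor. It assumes the log layout shown above: a "Track entry:" line followed by indented "Key: value" lines.

```python
# Hypothetical helper, not part of CCExtractor: parse "Track entry:" blocks
# from a CCExtractor Matroska log into dictionaries.

SAMPLE = """Track entry:
    Track number: 1
    UID: 1
    Type: video
    Codec ID: V_MPEG2

Track entry:
    Track number: 3
    UID: 3
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng
 99%  |  06:50
"""


def parse_tracks(log_text):
    """Return one dict per "Track entry:" block found in log_text."""
    tracks = []
    current = None  # dict for the block being read, or None outside a block
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "Track entry:":
            current = {}
            tracks.append(current)
        elif current is not None and line.startswith((" ", "\t")) and ": " in stripped:
            # Indented "Key: value" line belonging to the current block.
            key, value = stripped.split(": ", 1)
            current[key] = value
        elif stripped:
            # Any other non-empty line (progress output, etc.) ends the block.
            current = None
    return tracks


def subtitle_codecs(log_text):
    """List the Codec ID of every track whose Type is "subtitle"."""
    return [t.get("Codec ID") for t in parse_tracks(log_text) if t.get("Type") == "subtitle"]
```

Run against the full log above, `subtitle_codecs` should report both subtitle tracks as S_VOBSUB, which lines up with the two "(null)" output files.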

PunitLodha commented 2 years ago

The issue here is that we don't support VOBSUB subtitles. Supporting them requires writing two files, an .idx and a .sub. We currently generate the .idx file (with incorrect contents) but no .sub file at all.

Current .idx file:

# VobSub index file, v7 (do not modify this line!)
Headers...

+ timestamp: 00:00:01:101, filepos: 000000000
+ timestamp: 00:00:08:708, filepos: 000001000

We also need to write correct data to the .sub file.
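To make the link between the two files concrete, here is a minimal Python sketch of how index entries could be generated so that each `filepos` matches where the corresponding packet starts in the .sub file. This is not CCExtractor code. The names `format_idx_entry` and `build_index` are hypothetical, and it assumes (as in the VobSub index files I have seen) that `filepos` is a zero-padded 9-digit hexadecimal byte offset and that the timestamp is `HH:MM:SS:mmm`.

```python
# Hypothetical sketch, not CCExtractor's implementation.
# Assumption: filepos is a 9-digit hexadecimal byte offset into the .sub file.

def format_idx_entry(start_ms, filepos):
    """Format one index line from a start time in ms and a byte offset."""
    if start_ms < 0 or filepos < 0:
        raise ValueError("start_ms and filepos must be non-negative")
    hours, rem = divmod(start_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return (
        f"timestamp: {hours:02d}:{minutes:02d}:{seconds:02d}:{millis:03d}, "
        f"filepos: {filepos:09x}"
    )


def build_index(subtitles):
    """Build index lines for (start_ms, packet_bytes) pairs.

    Each entry's offset is the total size of all packets written before it,
    so the index stays consistent with the .sub data written in the same order.
    """
    lines = []
    offset = 0
    for start_ms, packet in subtitles:
        lines.append(format_idx_entry(start_ms, offset))
        offset += len(packet)
    return lines
```

The point of `build_index` is that the offsets are derived from the data actually written rather than chosen independently, so the .idx cannot be correct until the .sub packets exist.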

Reference,