lisamelton / other_video_transcoding

Other tools to transcode videos.
MIT License

other_transcode and forced subtitles #148

Open muesic opened 2 years ago

muesic commented 2 years ago

I have an mkv file ripped from a DVD that contains all 8 original subtitle tracks. If I play that mkv file in VLC then I can see all 8 subtitle tracks - only one of which (track 1) is the English movie subtitle track. However, and this continues to be an annoying VLC bug to this day, if I select subtitle track 1 it shows subtitles for that track all the time - for BOTH English and foreign speech i.e. VLC is unable to only show the forced subtitles.

I now use that same source mkv file as the input for HandBrake. In the Subtitles tab I select Foreign Audio Search and check both the Forced and Default boxes. The resultant compressed movie nicely contains only ONE subtitle track (selected to play by default) and when viewing the movie in VLC it shows subtitles ONLY where the foreign speech segments are spoken. Perfect! (aside from the fact that the video compression isn't as good as other_transcode).

Now if I use that same source mkv file as the input to other_transcode: other-transcode --hevc --crop auto --target 480p=1500 --add-subtitle auto filename.mkv then I end up with NO subtitle tracks at all :-(.

Then I tried: other-transcode --hevc --crop auto --target 480p=1500 --add-subtitle 1=forced filename.mkv and I ended up with one subtitle track (English) but it showed all the time i.e. even during the English bits.

Am I doing something wrong? I'm after the same good subtitle output that HandBrake makes. For all the conversions I need to do there is no way for me to figure out which subtitle track is the right one (so I need to specify the auto option in other_transcode), but HandBrake can figure that out. My eventual intent is to make a script and throw thousands of files at it.

Here is a Dropbox link to the source file if you're interested: https://www.dropbox.com/s/hncmlbf2tj53dml/my_movie.mkv?dl=0

lisamelton commented 2 years ago

@muesic Thank you for including a link to the source! And I've examined it carefully with MediaInfo and mpv. That first subtitle contains 372 elements. The second subtitle, also in English, contains 454 elements. So, it's obvious that the first subtitle is the full English version and the second subtitle is the SDH English version, which is why it has more elements. Those extra elements include the lyrics to some music and descriptions of various sound effects. There is no forced-only subtitle track in that source file. Nada.

Because you've ripped several episodes of the series as one file, it's entirely possible that MakeMKV has simply combined or ignored the actual forced-only subtitle required in one of those episodes. Most of the episodes don't actually require one.

I have no idea what HandBrake is doing. Perhaps there's some additional metadata that it's using from the stream to extract only the non-English elements? As an experiment, you could try re-transcoding the output from other-transcode to determine if HandBrake treats that just like the original. I suspect it might.

But other-transcode is behaving correctly because the subtitle you're adding is exactly like the original.

I'm sorry I can't be more helpful.

muesic commented 2 years ago

I just thought I'd clarify that I didn't rip several episodes of the series as one file - what you have is just one episode. I created the mkv file from a single episode DVD rip (i.e. VOB structure) using MakeMKV and selecting all available tracks.

Regarding your comment "Perhaps there's some additional metadata that it's using from the stream to extract only the non-English elements" - I kind of thought that's the way things have always been. If I play the original ripped TV episode (VOB structure) in VLC it plays with no subtitles at all unless I select track 1 in which case it plays the subtitles all the time (English and Spanish). I've complained to them about this bug many times, but the VLC crew can do very little about VOB and, as near as I can tell, will never fix it. When I play exactly the same VOB ripped episode in Apple's DVD Player I don't have to do anything at all - it just automatically selects only the Spanish subtitles i.e. I reckon there MUST be metadata attached to ONLY the Spanish subtitles on the English track 1 (and it's been that way forever). I believe the mkv file you downloaded retained that metadata.

So HandBrake apparently knows about this metadata and I'm guessing therefore ffmpeg does too. Now the easiest thing would be to just output the ffmpeg command line from HandBrake, however years ago it dropped that feature :-(.

I'd just use HandBrake, but I get all sorts of video banding effects in dimly lit scenes using all of their video compression buttons & dials etc. I do NOT get that with other-transcode (which is great and why I want to use it!) but it's not robust with subtitles (which is bad). Curiously, I would have to think you have been bitten by this in all of the DVD/Blu-ray transcoding you've done (or maybe you don't know you have because you haven't rewatched everything or noticed it)...

I will try your experiment and get back to you shortly.

lisamelton commented 2 years ago

@muesic I strongly recommend that you do not ever rip multiple TV episodes into a single MKV file. Rip the episodes individually. Otherwise MakeMKV may combine tracks or do other weird things. This is because there is no guarantee whatsoever that individual episodes will contain the same track layout.

muesic commented 2 years ago

Reiterating the first sentence of my second post: I did NOT rip multiple TV episodes into a single MKV file - I only ripped one episode into an mkv! The file you downloaded is just one uncompressed episode.

So onto your experiment, I took the file output from the second other-transcode command line in my original post (the only one that actually yielded a subtitle track, but it disappointingly showed all the time) and I reconverted it using HandBrake with the options also mentioned in my first post. Interestingly, the HandBrake output still shows subtitles for only the Spanish parts, which I take as confirmation that there is metadata associated with only those subtitles and that HandBrake/ffmpeg can detect them and output only them. I'm hopeful you might consider adding this feature to other-transcode given (I believe) your work is a front end to ffmpeg.

Going back to my first post, do you know why my first command: other-transcode --hevc --crop auto --target 480p=1500 --add-subtitle auto filename.mkv didn't yield any subtitles at all? When I first tried it, I was going by the documentation that described the option as: "enable automatic addition of forced subtitle" which I took to mean doing the same thing as HandBrake.

If I can get these subtitles working I'd be over the moon being able to use your video algorithms :-)

lisamelton commented 2 years ago

@muesic My mistake, you're correct, that's a single episode. My apologies.

lisamelton commented 2 years ago

@muesic As to getting only the Spanish language portion of that subtitle, I don't know what to tell you. HandBrake must be doing something that FFmpeg is, AFAIK, not capable of.

muesic commented 2 years ago

Hmm - I'll see what I can do about asking the ffmpeg developers (they don't have a forum). Are you in close contact with those guys?

So I guess my only remaining questions for now are:

  1. why didn't the --add-subtitle auto option yield any subtitles at all?
  2. Using your own tools, it stands to reason your DVD/Blu-ray compressions must have lost subtitles for shows like the one I have provided - or did you resolve that issue some other way?

lisamelton commented 2 years ago

@muesic No, I don't have any contacts with either the HandBrake or FFmpeg teams. That's intentional so I can maintain my sanity. :)

The --add-subtitle auto option and argument didn't add any subtitles because no subtitle track within your source has its "forced" flag set. That's how the code decides which track to automatically select.
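To check this on a given source file, the following is a minimal Python sketch that lists each subtitle stream together with its track-level "forced" flag. It assumes ffprobe is installed and on the PATH, and it reads the JSON that ffprobe prints with -show_streams -of json. The helper names subtitle_streams and probe_file are made up for illustration; this is not how other-transcode itself is implemented.

```python
import json
import subprocess


def subtitle_streams(probe):
    """Return (index, language, forced) for each subtitle stream in ffprobe JSON."""
    result = []
    for stream in probe.get("streams", []):
        if stream.get("codec_type") != "subtitle":
            continue
        # Missing language tags are reported as "und" (undetermined).
        language = stream.get("tags", {}).get("language", "und")
        # ffprobe reports the track-level flag as disposition.forced = 0 or 1.
        forced = stream.get("disposition", {}).get("forced", 0) == 1
        result.append((stream["index"], language, forced))
    return result


def probe_file(path):
    """Run ffprobe (assumed to be installed) and return its parsed JSON output."""
    output = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(output)
```

Calling subtitle_streams(probe_file("my_movie.mkv")) would show whether any track carries the flag. Note that this only inspects the flag on the whole track; it says nothing about individual subtitle entries that are marked as forced inside a full track.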

Honestly, your issue here is the first time I've ever encountered this problem. I and my contributors in the HiveMind™ have never "lost" subtitles this way. Perhaps this is unique to the mastering of Better Call Saul on DVD. I don't know.

muesic commented 2 years ago

I once spent months and months compressing everything I owned only to find out afterwards that many of my videos showed that awful banding that other-transcode nicely avoids, and also that I lost very many forced subtitles. I only found that out by rewatching some stuff, realizing something was wrong and then ending up where I am now - looking down the barrel of reconverting everything all over again :-/.

I can assure you that Better Call Saul is not alone in handling its subtitles this way - in fact I have many that do the same thing. That's why I thought you surely must have "corrupted" compressed files after compressing all of your library.

On a separate note, I searched the "ffmpeg -h full" text for potential subtitle commands/options and dredged up the following that don't show up in the usual "ffmpeg -h long" text:

dvdsubdec AVOptions: **** -forced_subs_only .D...S..... Only show forced subtitles (def=false)

PGS subtitle decoder AVOptions: **** -forced_subs_only .D...S..... Only show forced subtitles (def=false)

OK - I'm being called for dinner. Gotta run :-)

lisamelton commented 2 years ago

@muesic You could try using --dry-run to get the actual ffmpeg command that other-transcode generates, copy it and then insert -forced_subs_only into that string in the appropriate place.

I'll try it myself later.

lisamelton commented 2 years ago

@muesic BTW, MakeMKV is able to extract forced subtitles from "full" subtitle tracks. It does this by default for Blu-ray rips. I'm not sure about DVD rips since I rarely do those anymore.

muesic commented 2 years ago

This is just a test. When I previously pasted in those additional ffmpeg options some key text vanished. This is just to see if it does it again in case somehow I screwed things up the first time:

(The following lines have been edited to include the boolean argument) dvdsubdec AVOptions: **** -forced_subs_only \<boolean> .D...S..... Only show forced subtitles (def=false)

PGS subtitle decoder AVOptions: **** -forced_subs_only \<boolean> .D...S..... Only show forced subtitles (def=false)

muesic commented 2 years ago

Well, looks like pasting text into this window where I type my comments gets parsed and edited somehow. After each -forced_subs_only word there is supposed to be a boolean flag indicator " " i.e. if you wanted to use that option you'd actually type: -forced_subs_only true Let's see if my typing makes its way through this time.

muesic commented 2 years ago

Hmm - every time I type "<bo" + "ol>" i.e. the word "bool" in angle brackets, it gets deleted

lisamelton commented 2 years ago

@muesic That's because <bool> is being interpreted as an HTML tag. You need to use Markdown to "escape" it as code.

muesic commented 2 years ago

I just tried the experiment of adding "-forced_subs_only true" to the ffmpeg command line and it did NOT yield subtitles only during the Spanish parts - they were always there, even during the English parts :-(.

So we're back to trying to figure out how to get "forced" English subtitles. I'll pick this up tomorrow...

samhutchins commented 2 years ago

You can use mkvextract to get the subtitle file: mkvextract.exe .\my_movie.mkv tracks 5:subtitle.sub

You can then load the 2 files it makes (.sub and .idx) into BdSup2Sub: https://github.com/mjuhasz/BDSup2Sub

From there, click OK on the Conversion Options dialog that comes up, and in the log window at the bottom you should see "detected 30 forced captions", or something. The number will change depending on the source.

Go to File -> Export, make sure "Export only forced" is ticked, then save it somewhere.

You can then use mkvmerge (or mkvtoolnix-gui, if you prefer a GUI) to mux those subtitles back into the video file, and set the global forced flag on that track.
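The command-line ends of that workflow can be sketched in Python as two argument builders. This is only an illustration under assumptions: the file names are placeholders, the function names are invented, the BDSup2Sub export in the middle still has to be done by hand, and --forced-track is the older spelling of the mkvmerge option that newer MKVToolNix releases call --forced-display-flag.

```python
def extract_command(source, track_id, basename):
    """mkvextract arguments that write one VobSub track as basename.sub and basename.idx."""
    return ["mkvextract", source, "tracks", f"{track_id}:{basename}.sub"]


def remux_command(source, forced_idx, output, language="eng"):
    """mkvmerge arguments that add a forced-only VobSub file and flag it as forced."""
    return [
        "mkvmerge", "-o", output,
        source,
        # Options placed before a file apply to that file; its only track has ID 0.
        "--language", f"0:{language}",
        "--forced-track", "0:yes",
        forced_idx,
    ]
```

Each list can be passed to subprocess.run(..., check=True) once the tools are installed. The original full subtitle tracks stay in the output unless they are excluded separately.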

lisamelton commented 2 years ago

@samhutchins Thanks! And if that proves too complicated then perhaps @muesic can try re-ripping one of the problem episodes with MakeMKV and make sure the option is set to extract forced subtitles for that track.

muesic commented 2 years ago

Thanks very much Sam. While I'm sure what you describe will likely work, as Don also suspected, it is quite complicated - especially after being multiplied thousands of times in a script.

I've been making some further headway in the ffmpeg domain. Ideally I would have liked to get the ball rolling using: other-transcode --dry-run --hevc --crop auto --target 480p=1500 --add-subtitle auto ../my_movie.mkv However that doesn't yield any subtitles in the output at all! Don says it's because none of my subtitles have the forced flag set (more on that in a moment). Now I can never be sure which specific subtitle NUMBER contains the right English subtitles, so I have no choice now but to start with: other-transcode --dry-run --hevc --crop auto --target 480p=1500 --add-subtitle eng ../my_movie.mkv

That yields:

ffmpeg -loglevel error -stats -i ../my_movie.mkv -vsync cfr -map 0:0 -c:v hevc_videotoolbox -b:v 1500k -color_primaries:v smpte170m -color_trc:v bt709 -colorspace:v smpte170m -metadata:s:v title\= -disposition:v default -map 0:1 -c:a:0 copy -metadata:s:a:0 title\= -disposition:a:0 default -map 0:5 -c:s:0 copy -disposition:s:0 0 -map 0:6 -c:s:1 copy -disposition:s:1 0 -map 0:10 -c:s:2 copy -disposition:s:2 0 -metadata:g title\= -default_mode passthrough my_movie.mkv

i.e. it found THREE English subtitle tracks. The relevant ffmpeg parts are:

  1. -map 0:5 -c:s:0 copy -disposition:s:0 0 (the actual English subtitle track)
  2. -map 0:6 -c:s:1 copy -disposition:s:1 0 (the English SDH subtitle track)
  3. -map 0:10 -c:s:2 copy -disposition:s:2 0 (an English commentary subtitle track)

Of those three, only the first contains the forced Spanish subtitles that are needed to properly view the episode.

So here's what I've found that makes ffmpeg do the right thing, so that subtitles appear ONLY for the Spanish parts:

  1. Before the "-i" input option, add "-forced_subs_only 1"
  2. In each subtitle codec option ("-c:s:0", "-c:s:1", "-c:s:2"), replace the "copy" keyword with "dvd_subtitle"

This modifies the ffmpeg command line to:

ffmpeg -loglevel error -stats -forced_subs_only 1 -i ../my_movie.mkv -vsync cfr -map 0:0 -c:v hevc_videotoolbox -b:v 1500k -color_primaries:v smpte170m -color_trc:v bt709 -colorspace:v smpte170m -metadata:s:v title\= -disposition:v default -map 0:1 -c:a:0 copy -metadata:s:a:0 title\= -disposition:a:0 default -map 0:5 -c:s:0 dvd_subtitle -disposition:s:0 0 -map 0:6 -c:s:1 dvd_subtitle -disposition:s:1 0 -map 0:10 -c:s:2 dvd_subtitle -disposition:s:2 0 -metadata:g title\= -default_mode passthrough my_movie.mkv

When that runs you end up with a movie that has 3 subtitle tracks. The first (the most important) now ONLY contains subtitles for the Spanish spoken parts. The other two contain nothing (presumably because the filter that only kept the forced subtitles threw everything in those two tracks away)
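Since the goal is to script this for many files, the two manual changes to the --dry-run output could be automated along these lines. This is a rough Python sketch, not part of other-transcode: the function name keep_forced_subtitles_only is invented, it assumes the command has a single "-i" input and DVD (VobSub) subtitles, and it does nothing about the empty tracks that are left behind.

```python
import re
import shlex


def keep_forced_subtitles_only(command):
    """Apply the two manual edits to an ffmpeg command line and return the new command."""
    rewritten = []
    input_seen = False
    previous = ""
    for arg in shlex.split(command):
        if arg == "-i" and not input_seen:
            # The decoder option has to come before the input it applies to.
            rewritten += ["-forced_subs_only", "1"]
            input_seen = True
        if arg == "copy" and re.fullmatch(r"-c:s(:\d+)?", previous):
            # Re-encode subtitles instead of copying, so the decoder can drop
            # the entries that are not marked as forced. Audio "copy" is untouched.
            arg = "dvd_subtitle"
        rewritten.append(arg)
        previous = arg
    return shlex.join(rewritten)
```

Feeding it the line printed by other-transcode --dry-run should give back a command with "-forced_subs_only 1" in front of "-i" and "dvd_subtitle" after every "-c:s:N". shlex.join needs Python 3.8 or later.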

So we're very close! Either we need a way to ditch empty subtitle tracks, or we need to better understand why other-transcode's "--add-subtitle auto" failed when clearly there was a subtitle track that contained forced subtitles. I have to go shortly, but what I do see at the start of many lengthy ffmpeg commands is the following:

-probesize 50M -analyzeduration 100M

It's my understanding that these options allow ffmpeg to scan the input a little deeper to actually find those forced subtitle tracks, and I'm hoping that if Don tries this, maybe his "--add-subtitle auto" will yield a positive result instead of nothing. OK, my wife is brow-beating me to get out of here. I haven't proofread this, but hopefully it makes sense. I'll be back in 5-6 hours.

lisamelton commented 2 years ago

@muesic As I mentioned before:

The --add-subtitle auto option and argument didn't add any subtitles because no subtitle track within your source has its "forced" flag set. That's how the code decides which track to automatically select.

So, you need to set the "forced" flag on the original track. You can do that with mkvpropedit.
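As a sketch of what that could look like when scripted from Python (assuming mkvpropedit from MKVToolNix is installed; the function names are invented for this example):

```python
import subprocess


def forced_flag_command(path, subtitle_number, forced=True):
    """Build the mkvpropedit arguments that set or clear a subtitle track's forced flag."""
    return [
        "mkvpropedit", path,
        # "track:s1" selects the first subtitle track, "track:s2" the second, and so on.
        "--edit", f"track:s{subtitle_number}",
        "--set", f"flag-forced={1 if forced else 0}",
    ]


def set_forced_flag(path, subtitle_number, forced=True):
    """Run mkvpropedit (assumed to be installed); this modifies the file in place."""
    subprocess.run(forced_flag_command(path, subtitle_number, forced), check=True)
```

For example, set_forced_flag("my_movie.mkv", 1) would flag the first subtitle track. This marks the whole track as forced, so it affects which track gets selected; it does not remove the non-forced entries inside that track.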

Adding -forced_subs_only 1 and -c:s:0 dvd_subtitle won't change the behavior of --add-subtitle auto. However, that might be a useful mode to add in the future for other reasons.

muesic commented 2 years ago

Thanks for considering adding the new options for forced subtitles - that would be awesome!

Regarding the "--add-subtitle auto" option, I kind of feel like I'm treading on eggshells by belaboring this.

The following is what I think to be the case. There are two dominant ways DVDs are authored with forced subtitles. The first is to just have a fully dedicated forced subtitle track that contains the English subtitles only when foreign languages are spoken. You can simply give that entire track a forced flag and you're done. The second way is to have a full English subtitle track for ALL spoken words BUT put a special forced flag (or metadata tag or whatever) on only the parts where a foreign language is spoken so those parts can then be separately extracted as forced subtitles if desired. The example my_movie.mkv file I gave you is the latter. Your "--add-subtitle auto" almost certainly works for the first case, but it's apparently not useful for, or intended for, the second case and I'm hoping there's a remedy for that with some additional ffmpeg magic.

When selecting a forced subtitle track in HandBrake, you can select the "Foreign Audio Search" option and also check the Forced Only checkbox. HandBrake knows your preferred language is English and it's being instructed to traverse the English subtitle tracks and discover the existence of a "forced" subtitle somewhere within them. HandBrake can automatically identify the one qualifying subtitle track for the mkv file I gave you, but other-transcode can't find any using "--add-subtitle auto". Because HandBrake can do it, I suspect that ffmpeg can do it too, so I'll look further into that and get back to you. I'm hopeful "--add-subtitle auto" can be tweaked to live up to its full potential :-). Also, it stands to reason that HandBrake's "Foreign Audio Search" technique should correctly identify the right subtitle track for both types of authored DVDs.

Anyway, I'll see what I can find out tomorrow.