mdhiggins / sickbeard_mp4_automator

Automatically convert video files to a standardized format with metadata tagging to create a beautiful and uniform media library

MIT License


AAC and DTS audio and couple other questions. #190

Closed cayars closed 8 years ago

cayars commented 9 years ago

This is very nice indeed. I have a couple of questions.

1) Is it possible to keep ALL English sound tracks already present in the input file but only create an AAC/256 audio track if not present? The main reason is to keep DTS intact. Ideally the script would do a COPY (regardless of bitrate) of all tracks and if one isn't AAC at the default bitrate (i.e. 256) then create that AAC track first (then copy each additional audio track).

2) Is it possible to strip out any subtitles if they are present in the source file?

3) Have you given any thought to using HandBrake CLI or making it an option when NOT doing a remux? I've found HB to be faster/better quality than ffmpeg conversions. Also depending on hardware the use of HB could allow much faster transcoding with the use of Intel QuickSync.

Thanks, Carlo

mdhiggins commented 9 years ago

I don't 100% know if the exact behavior you're describing is possible, mainly because there are extra variables that I think you aren't considering. It's impossible for the script to reliably tell whether a specific audio track is commentary or some extra feature vs the actual audio track, so in a lot of cases there are multiple audio tracks, some surround sound, and the stereo AAC track is just commentary. I think the closest thing to what you'd want behavior wise here is to enable the iOS audio option, then add DTS or any other codecs to your audio-codecs option so they will be copied instead of converted. This will force the creation of the AAC audio tracks for any audio track that isn't already AAC stereo, and will leave the remaining audio tracks untouched.
Simply set your subtitle languages to 'nil' and none will be carried over. Additionally there are options to externalize any embedded subtitles into external .srt files if you're more interested in that. Check the readme.
No plans for handbrake any time in the future. The lack of ability to remux video/audio tracks that are in the correct format was a deal breaker when the script was started. Most MKV files found today are already in H264 and no matter how good handbrake was at converting, directly copying the video track with no conversion was always faster, more reliable, and had no quality loss. At this point everything is so tied to FFMPEG that it won't likely change. However if anyone wanted to fork the script and adapt it for handbrake it would be possible and probably not incredibly difficult for someone who is experienced with handbrake CLI.

cayars commented 9 years ago

That was quick! 1) Actually from reading how this works I think I found a bug. The "test" file I was trying that was DTS had a mono 768 track. I don't think I've seen this before. Sort of defeats the purpose of using DTS but this was an old classic released on blu ray. I tried a different video and it seems to be working correctly.

However, regardless of mono/stereo/bitrate anything other than AAC should probably just be copied if it's on the "audio-codec" line.

2) Thanks, that worked great

3) Yes I completely agree over control in regards to remux etc. I've been d/ling some classic tv shows that I can only find in AVI format (not h264) and I'm just used to using HB. Doesn't mean I can't/won't switch. I could just process anything like AVI one way and MKV/MP4 a different way which would be rather simple to setup.

Question: With the default you use for ffmpeg how would the quality compare to HB if you know? Would it be like the "normal" or the "high profile" settings in HB?

Another reason I asked about HB is that I do something rather unique for my Plex system. I share with a LOT of people and a good portion of them have limits of 720/3mbit in Rokus or Now TV boxes. Since the large portion of my library is 1080p this causes a lot of transcoding for the CPU for these players. So what I do is create the HIGH output file as normal along with a 2nd file that is 720P and max bitrate of 3mb. The combination of these two files will keep Plex from transcoding on the fly 95%+ of the time. I don't have my whole library setup this way. I just keep X number of 720 files around for the newest X releases (what most people play) and some always popular blockbuster films.

When I create these 2nd files I use HB with the QuickSync setting which is a lot faster!!! I don't worry about quality of QS as these files won't be around forever as they are just part of X cache files. Anyway you cut it the HB QS created files are way better QUALITY wise than what Plex will do in real time.

But again I can do this outside of this script. Just thought I'd share why in case it gives anyone any ideas for their own systems.

cayars commented 9 years ago

Hmm, doesn't seem to be working correctly. I have the following in the ini file:

audio-codec=aac,ac3,mp3,dts
ios-audio=True
max-audio-channels=
audio-language=eng
audio-default-language=eng
audio-channel-bitrate=256

I start out with a file that contains an MP3 128kbps 2 channel audio file and end up with this:

AAC 434kbps 2 Channel audio file
No MP3 audio was copied.

So why did I not end up with 2 audio files and why was the AAC created with more than 256kbps?

Do I have the options wrong? I want the audio streams copied over regardless of bitrate, so should audio-channel-bitrate be left empty, since the ios-audio setting of true should create the AAC stream at 256kbps?

cayars commented 9 years ago

Using the same settings as above with a file containing only an English DTS 1536kbps 6 channel audio file I end up with:

English AAC 255kbps 2 channel
English AAC 832kbps 6 channel

mdhiggins commented 9 years ago

The default behavior of the script is to create as few audio channels as possible while still maintaining maximum compatibility with all devices. With the iOS audio option enabled, if a stereo audio channel is detected as the source it will not copy the original audio track, since then you'd be left with 2 stereo audio channels that are indistinguishable from each other in theory. In this unique scenario, the script forgoes the default 256 kbps for the iOS channel and instead obeys the other settings (max channels * audio bitrate) in an attempt to not compromise quality since you are losing your primary audio source channel (in this case the mp3). This only happens in this particular scenario where 1 stereo audio source is encountered with the iOS audio option enabled. Setting your audio-channel-bitrate to 128 (instead of the default 256) will get the bitrate down.

mdhiggins commented 9 years ago

And your problem with the DTS 6 channel audio is that FFMPEG recognizes DTS as dca. Just change dts to dca in your options.

That's my bad, forgot about that.
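
A minimal sketch of how the `dts`/`dca` mismatch could be absorbed when the option is read. The `CODEC_ALIASES` table and the `normalize_codecs` helper are hypothetical names used only for illustration and are not part of the script; the only fact taken from the thread is that `dca` is the name to use.

```python
# Hypothetical alias table: name a user might type -> name FFmpeg expects.
CODEC_ALIASES = {"dts": "dca"}


def normalize_codecs(option_value):
    """Split a comma-separated audio-codec option and apply the aliases."""
    names = [name.strip().lower() for name in option_value.split(",")]
    return [CODEC_ALIASES.get(name, name) for name in names if name]


print(normalize_codecs("aac,ac3,mp3,dts"))
```

With an alias step like this, the `audio-codec=aac,ac3,mp3,dts` line from the earlier comment would match DTS streams without the user having to know the internal name.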

cayars commented 9 years ago

There is weird stuff going on and I tracked it down to class MkvtoMp4. As you stated, it does create an AAC audio channel for each audio track. This isn't needed for IOS as only the 1st track needs this conversion. So a simple flag that says the IOS track has been done once will stop this from happening again, as only the 1st track needs to be AAC.

So if first track != AAC then create AAC track. Copy all other tracks if permitted

But there are several bugs in the logic in the audio section of this object. The biggest thing is that it doesn't "copy" the existing audio track but instead tries to recalculate the rates for some files.

            # Audio channel adjustments
            if self.maxchannels and a.audio_channels > self.maxchannels:
                audio_channels = self.maxchannels
                if acodec == 'copy':
                    acodec = self.audio_codec[0]
                abitrate = self.maxchannels * self.audio_bitrate
            else:
                audio_channels = a.audio_channels
                abitrate = a.audio_channels * self.audio_bitrate

The problem is the else clause. If you start with a 128kbit MP3 you end up with 512kbit AAC, which is 2 channels * 256. This obviously won't work on IOS.
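
The arithmetic is easier to see in a standalone form. This is a sketch that mirrors the branch quoted above, not the script's actual code: `plan_bitrate` and its parameter names are made up for illustration, and `fallback_codec` stands in for `self.audio_codec[0]`.

```python
def plan_bitrate(source_channels, per_channel_bitrate,
                 max_channels=None, codec="copy", fallback_codec="aac"):
    """Return (codec, channels, bitrate) the way the quoted branch computes them."""
    if max_channels and source_channels > max_channels:
        channels = max_channels
        if codec == "copy":
            # A downmixed track cannot be a stream copy, so pick a real codec.
            codec = fallback_codec
    else:
        channels = source_channels
    # In both branches the bitrate is channels * per-channel rate,
    # regardless of what the source bitrate was.
    return codec, channels, channels * per_channel_bitrate


print(plan_bitrate(2, 256))                  # stereo source, no channel cap
print(plan_bitrate(6, 256, max_channels=2))  # 5.1 source capped to stereo
```

A stereo source with `audio-channel-bitrate=256` comes out at 2 * 256 = 512, which is the figure complained about above; the source track's own bitrate never enters the calculation.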

As far as processing time: I'm pretty sure I could speed up the conversion by 200% for the files I'm testing. I don't want subtitles and have that option left blank. However the script first wants to pull 3 different subtitles from the MKV file. If this option is blank then there should be no need to do this as the subtitles aren't being used (unless this is needed for a different feature?). If so maybe add a flag to ignore subtitles altogether.

Next it's creating an extra audio track that isn't needed nor wanted.

Third, after remuxing the file it then re-orders it (MOOV). This step can take a while as it's basically copying the whole file again. You can pass a command line option (-movflags faststart) to ffmpeg and build it correctly from the start and avoid this costly step.
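
For reference, a sketch of what such a command line could look like. The `remux_command` helper and the file names are placeholders invented for this example; only `-c copy` and `-movflags faststart` are real FFmpeg options, and the command is built but not executed here.

```python
def remux_command(source, destination, faststart=True):
    """Build (but do not run) an FFmpeg stream-copy command as an argument list."""
    cmd = ["ffmpeg", "-i", source, "-c", "copy"]
    if faststart:
        # Ask the MP4 muxer to place the moov atom at the front of the file.
        cmd += ["-movflags", "faststart"]
    cmd.append(destination)
    return cmd


print(" ".join(remux_command("input.mkv", "output.mp4")))
```

The list form can be handed to `subprocess.run` as-is; whether this actually saves time compared with a separate relocation pass is exactly what the rest of the thread argues about.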

mdhiggins commented 9 years ago

I'll try to address all of these issues here one by one since you hit me with a lot:

Assuming only the first audio track needs the conversion is a bold assumption. It may work for some use cases but is definitely not ideal for all use cases. The purpose of creating this audio track is so that that particular track can be natively played on mobile devices (it's poorly named but the name has stuck). I agree looking at the code that I should force the bitrate to 256 for consistency's sake, but AAC stereo 512 does play natively just fine. If you only do the first audio track then you won't be able to play back any of the other included audio tracks on a mobile platform (this applies to things like the Chromecast which really only play nice with AAC stereo or AC3 5.1 without needing on the fly transcoding).

As far as the subtitle issue, I'm not sure I follow 100% what you're getting at here. The script will copy any subtitles from the source material that match the language criteria set in the autoProcess.ini. If you don't want to copy any subtitles setting your subtitle language to 'nil' will have it ignore all subtitles. Leaving the subtitle option blank has it blanket copy everything. You can argue which default behavior would be better for more users but the option is there to ignore all subtitles. Some people simply want everything copied from their source files while others want to strip out as much as possible, but I implemented both options to cover both scenarios.

Finally the MOOV atom. If you dug through some of the old commits you would see I actually tried implementing this when FFMPEG added that faststart flag. There are 2 issues with this. The first is simply that your assumption that it speeds anything up is wrong. FFMPEG does the full conversion and moves the information around just like the script currently does, except it's all done within FFMPEG instead of manually. The conversion time actually almost doubles with this flag enabled, so no real time is saved. The other bigger issue, and the reason I had to revert back to using QTFaststart, is that on a certain percentage of large files writing the metadata tags would break playback if the MOOV atom was already moved to the front of the file (likely due to running out of space for this data). So the solution is to make the mp4 file, write the tags at the end where space isn't an issue, then move everything around with all the pieces in place so the size can be properly calculated. It ultimately ends up taking almost the exact same amount of time.

Update: Latest commit forces 256 bitrate (or 128 for a mono channel) in the event of a stereo or mono audio source with the iOS-audio option enabled.

cayars commented 9 years ago

I agree it may be bold but I can only think of one situation where you would want to duplicate each and every track as AAC and that would be for old devices that can only play AAC and this only makes sense for commentaries. Most mobile devices these days play more than just AAC.

So I was being "narrow minded" and only thinking of my own use of course. For me and for most Plex users I think they would want something more like how Handbrake does it. In a nutshell it will copy all tracks of your choice but will add an AAC track as the first just to have a failsafe track that will play on anything. Only making an AAC copy of the first true track (never seen commentary as first track) is normally a safe bet.

Yes it should be forced to 256 UNLESS the first real track is less than that.

I think I misunderstood what you were saying for the subtitles. I "read" nil as null as in leave it blank or empty. I never saw a program use "nil" as a value before. Empty values are normally treated like null while a value of "all" or similar is usually used to express the intent to use them all. Neither here nor there so I'll try 'nil'. :)

As for the MOOV, what you explained isn't what I experience when using ffmpeg for mkv to mp4 conversions. But let's just let this rest. I'll try and play with it and see if I get any difference between the two methods with changes to the code. If so I'll open a new ticket or shoot you an email. This one isn't that important as it's only a minor speed difference at best and doesn't change any functionality.

I appreciate you making the code change for 256/128 which was quick.

Please also keep in mind I'm "nit-picking" functionality and can make the changes I need/want for myself. This is a truly great set of python scripts and is super useful for the community so KUDOS to you for that.

With that said. :) Would you consider adding a new option like FirstTrackAacOnly? It could be set to FALSE and have the same functionality it does now and convert/create a new AAC track for each track, or if set to TRUE would only create one AAC as the first track and then just copy any additional tracks that are listed in the audio-codec section (what I've described). This would be fairly trivial from a code standpoint but would open this script up to even more people that currently like/want the functionality/method that handbrake uses, which is to copy all audio tracks (that are chosen) but make the audio compatible with a stereo track that works for any device.

Again THANK YOU as this is a fantastic set of scripts, Carlo

mdhiggins commented 9 years ago

Hey

I appreciate all the feedback, sorry if I came off in a way that suggested I didn't. I just like to explain my reasoning before making any changes because a lot of people have different ideal setups. That being said, I definitely agree that there should be an option so that only the first track gets dual audio streams. Rather than make a FirstTrackAacOnly option, I've gone ahead and added an additional option called ios-first-track-only which is tied to the ios-audio option. With ios-first-track-only set to True the iOS-audio option will only be applied to the first audio track in the source file before being switched off. This allows the flexibility of the iOS audio option (which can actually be modified via autoProcess.ini to use any codec, not just aac, the default) without having a whole bunch of redundant code. As of the latest commit (4e6f3443950b664d23e3ebb1d1deda67f45aa15f) this feature is added but undocumented. If you look I think I was able to make the changes here rather elegantly with only a few lines. If you could test this on your files and let me know if it's behaving how you'd expect then I'll add it to the readme as an official feature.

As far as the MOOV atom goes, I misspoke when I said doubling the time, as this is really tied to how long the overall conversion takes and how fast the drive you're working on is. On my SSD it really adds very little time since you're essentially just moving data around, but on a regular HDD it takes much longer, and how long it takes compared to just the conversion depends on a lot of factors. That being said FFMPEG definitely does the exact same thing, it's just less obvious because it's all sort of invisible to the user, but either way it's still not gonna work because of the tagging issue. If you can find a way around all this that would be excellent but I think it's technically not possible due to how the MOOV atoms are written.

Anyway let me know how the first track option works for you, and how the 256/128 option is working as well. And I do appreciate your input I just like to present my reasoning before making any changes. But thanks I think this option makes the script more useful for a number of users.

Edit: Just to clarify, you should have both the ios-audio option and the ios-first-track-only option set to true to use this feature.
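
A simplified model of how the two options interact, written from the description in this thread rather than from the commit itself. `Track`, `plan_audio` and all parameter names are hypothetical; the real script works on FFmpeg stream objects and handles bitrates, which this sketch ignores.

```python
from collections import namedtuple

# Hypothetical, minimal description of a source audio stream.
Track = namedtuple("Track", "codec channels language")


def plan_audio(tracks, audio_codecs, ios_codec="aac",
               first_track_only=True, languages=("eng",)):
    """Return (action, codec, channels) tuples for the output file."""
    plan = []
    ios_enabled = ios_codec is not None
    for t in tracks:
        if t.language not in languages:
            continue  # unwanted language, dropped entirely
        if ios_enabled and t.channels > 2:
            # Surround source: add a stereo compatibility track first.
            plan.append(("convert", ios_codec, 2))
            if first_track_only:
                ios_enabled = False
        elif ios_enabled:
            # Mono/stereo source: one compatibility track replaces the source.
            action = "copy" if t.codec == ios_codec else "convert"
            plan.append((action, ios_codec, t.channels))
            if first_track_only:
                ios_enabled = False
            continue
        if t.codec in audio_codecs:
            plan.append(("copy", t.codec, t.channels))
        else:
            plan.append(("convert", audio_codecs[0], t.channels))
    return plan


source = [Track("ac3", 6, "pol"), Track("ac3", 6, "eng"), Track("dca", 6, "eng")]
for entry in plan_audio(source, ["aac", "ac3", "mp3", "dca"]):
    print(entry)
```

With `first_track_only=False` the same input would get a stereo AAC track in front of every kept surround track, which is the original behavior.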

cayars commented 9 years ago

I totally get where you were coming from and why you explained it the way you did. I'm a programmer also as you probably guessed and had a "slightly" alternate way of using your stuff which is why I too went into detail also. :)

I'm off to download and run through the tests. I'll also test it with ios-first-track-only set to false to make sure it works as it did previously. Be back in probably an hour or so.

mdhiggins commented 9 years ago

Keep me posted

cayars commented 9 years ago

First file had a 128kbps MP3 which got changed to 256kbps AAC with no 2nd MP3 track. I understand the reason for not copying the MP3 as it's redundant so this was a success.

Second file started with:
Italiano DTS 768 kbps 6 channel
English AC3 640 kbps 6 channel
OUTPUT:
English AAC 256 kbps 6 channel
English AC3 640 kbps 6 channel

Third file started with:
English DTS 768 1 Channel
OUTPUT:
English AAC 128 kbps 6 channel <-- problem
Didn't get the English AC3 640 kbps 1 channel <-- problem
So this file didn't get a 256 AAC but instead got 128 and the DTS channel didn't make it to the output file.

Fourth file started with:
Polish AC3 640 6 Channel
English AC3 640 6 Channel
English DTS 1536 6 Channel
OUTPUT:
English AAC 256 2 Channel
English AC3 640 6 Channel
English DTS 1536 6 Channel

I had to make a couple of changes:

            if self.iOS:
                if a.audio_channels > 2 or a.audio_bitrate > 256 or a.codec != 'aac':  # <--here
                    print "Creating dual audio channels for iOS compatability for this stream"
                    audio_settings.update({l: {
                        'map': a.index,
                        'codec': self.iOS,
                        'channels': 2,
                        'bitrate': 256,
                        'language': a.language,
                    }})
                    l += 1
                    self.iOS = False  # <--here

            # If the iOS audio option is enabled and the source audio channel is only stereo, the additional iOS channel will be skipped and a single AAC 2.0 channel will be made regardless of codec preference to avoid multiple stereo channels
            if self.iOS and a.audio_channels <= 2 and a.audio_bitrate <= 256:  # <--here
                acodec = 'copy' if a.codec == 'aac' else self.iOS
                audio_channels = a.audio_channels
                abitrate = 256  # <--here
            else:

With those changes everything works as expected for the audio.

HOWEVER, I'm still having issues with the subtitles being pulled and stored on the drive. My settings look like this:

subtitle-language=nil
subtitle-default-language=nil

When it runs it shows: Ripping nil subtitle from file

I tried searching the code for 'nil' but couldn't find this anywhere so I'm not sure how it's supposed to skip this section if set to 'nil'. This is rather easy to work around so I'm not too worried about it.

I got side tracked with winter storm prep as we are expecting about a foot of snow over the next 1.5 days. :( I'm just getting back to playing with this. I haven't had the chance to look yet but when doing a transcode is this using VBR or QR?

Thanks, Carlo

mdhiggins commented 9 years ago

First easy fix - make your default subtitle language blank. Setting it to nil sort of negates everything and will cause any undefined subtitles to be copied. Just leave that blank with the subtitle-language still set to nil.
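
One way to picture why `nil` as the default language backfires. This sketch assumes (it is not confirmed by the thread) that streams with an undefined language are relabelled with the default language before the filter runs, and that the options live in an `[MP4]` section; `load_subtitle_options` and `keep_subtitle` are hypothetical helpers.

```python
import configparser

# Hypothetical excerpt of autoProcess.ini; the [MP4] section name is an assumption.
SAMPLE_INI = """
[MP4]
subtitle-language = nil
subtitle-default-language =
"""


def load_subtitle_options(ini_text):
    """Return (wanted_languages, default_language) from the ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    raw = parser.get("MP4", "subtitle-language", fallback="")
    wanted = [lang.strip().lower() for lang in raw.split(",") if lang.strip()]
    default = parser.get("MP4", "subtitle-default-language", fallback="")
    return wanted, default.strip().lower()


def keep_subtitle(stream_language, wanted, default):
    """Decide whether one embedded subtitle stream is carried over."""
    if stream_language == "und" and default:
        # Assumed behavior: untagged streams inherit the default language.
        stream_language = default
    # An empty filter copies everything; otherwise only exact matches pass.
    return not wanted or stream_language in wanted


wanted, default = load_subtitle_options(SAMPLE_INI)
print(keep_subtitle("eng", wanted, default))  # filtered out: "eng" is not "nil"
print(keep_subtitle("und", wanted, "nil"))    # relabelled to "nil", so it passes
```

Under that assumption, `nil` works as a filter only because no real stream is tagged `nil`, and setting the default to `nil` creates exactly such streams.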

With regards to the other issue, it looks like it's a result of you still not allowing DTS to be copied. I just want to verify you added "dca" to your audio codecs before we troubleshoot any further.

cayars commented 9 years ago

Yes, I did add "dca" and it worked for all >2 channel DTS files.

OK will try with the blank subtitle-default-language.

Formatting sucked but I did give you the code changes that fix the problem. I myself would rewrite it a different way. In other words for IOS, I would just check the first audio track to make sure it's AAC, 2 channel or lower and 256kbit or lower. However, I realize you may want to create AAC tracks for every track but it could be similar code.

The problem is caused because of the 768 DTS 1 CHANNEL track that slipped by the existing IF checks. If you look at my previous message each line changed/added is marked with "<--here" somewhere on the line.

Do you happen to know if this is using VBR or QR and what part of the code that could be "tinkered" with?

mdhiggins commented 9 years ago

The script is using VBR I believe, mainly because this is what FFMPEG defaults to. I had tried to set up a branch that used QR but had lots of problems with consistency. I wasn't able to reliably predict what sort of different formats could be thrown at the script, often resulting in huge disparities of quality, so I ended up scrapping it.

I think the issue here is that I assumed if you were throwing 2 or less channel audio at the script that it would simply make the iOS track and scrap the source track, because for mono or stereo audio at 128/256 kbit for AAC there really wouldn't be any noticeable quality difference (at least in my opinion). It sounds like this behavior is what you don't want: if the source audio is a higher bitrate, you'd want 2 mono (or stereo) audio tracks, one created following the iOS (AAC in this case) parameters and another created following the main parameters (in this case copied as DTS). My hesitation to do it based on bitrate is because bitrate varies quite a bit depending on which codec the source is, and in most situations where DTS isn't being copied you'd end up converting that same audio source twice, once for AAC via the iOS option and again for the primary options (AC3 by default), leaving you with 2 mono tracks.

That being said I think the best solution, rather than what you proposed, would be to change line 214 on mkvtomp4.py to if a.audio_channels > 2 or a.bitrate > 256

So currently with the following default settings:

ios-audio = true
audio-codec=ac3

and English DTS 768 1 Channel as your source, the expected output would be:

AAC 128 1 Channel

with your proposed changes, you'd get (again default settings)

AAC 128 1 Channel
AC3 256 1 Channel

With your settings:

ios-audio = true
audio-codec=aac,dca

you'd get, with the same input:

AAC 128 1 Channel
DTS 768 1 Channel (copied)
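
The cases above can be reproduced with a small sketch of the proposed condition. `ios_bitrate` and `plan_tracks` are illustrative names, not functions from mkvtomp4.py, and the sketch leaves out the case where the source is already AAC stereo and could simply be copied.

```python
def ios_bitrate(channels):
    """256 for stereo, 128 for mono, as forced by the earlier commit."""
    return 128 if channels == 1 else 256


def plan_tracks(codec, channels, bitrate, audio_codecs,
                per_channel_bitrate=256, ios_codec="aac"):
    """Return (codec, bitrate, channels) tuples expected for one source track."""
    ios_channels = min(channels, 2)
    out = [(ios_codec, ios_bitrate(ios_channels), ios_channels)]
    if channels > 2 or bitrate > 256:
        # Proposed condition: keep a second track next to the iOS one.
        if codec in audio_codecs:
            out.append((codec, bitrate, channels))  # stream copy
        else:
            out.append((audio_codecs[0], channels * per_channel_bitrate, channels))
    return out


print(plan_tracks("dca", 1, 768, ["ac3"]))         # default audio-codec
print(plan_tracks("dca", 1, 768, ["aac", "dca"]))  # DTS allowed to be copied
print(plan_tracks("mp3", 2, 128, ["ac3"]))         # low-bitrate stereo source
```

The first two calls match the two-track outputs listed above; the third shows that a low-bitrate stereo source still ends up with a single AAC track.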

cayars commented 9 years ago

I had a couple free hours this morning so I played around with the code for a bit. I've got the audio working quite well and can have it look for the best audio track to use to create the 2 channel AAC. I've also added sample rate settings since the defaults are pretty low. I'd think you would want to at least use 48,000 minimum. I added support for FAAC since it's better than the conventional built in encoder and also offers VBR audio in addition to CR.

I personally don't ever want to remove an English track since I might want to go back and re-encode the AAC later again based on some new code change. By keeping all the original tracks I can always do this. Just a personal thing.

I dove in and found it's using CR and not VBR but that the defaults are pretty weak. Using the defaults it's creating something in between a Handbrake "normal" and a handbrake "iPad" type encode. I've set this up on my side to be able to configure/tune the 264 engine such as profile, x.264 level, preset, CRF or VBR with max bitrates and added more control over threading. In a nutshell this will allow it to replace Handbrake for all the normal encode profiles it uses and then some.
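
To make the kind of tuning described here concrete, a sketch of how such settings map onto FFmpeg's libx264 options. This is not the code referred to above (which was never posted in this thread); `x264_args` and its defaults are illustrative, and only the option names (`-preset`, `-crf`, `-profile:v`, `-level`, `-maxrate`, `-bufsize`) are real.

```python
def x264_args(preset="medium", crf=23, profile="high", level="4.1",
              max_bitrate_kbps=None):
    """Return FFmpeg video arguments for a libx264 encode (illustrative values)."""
    args = ["-c:v", "libx264", "-preset", preset, "-crf", str(crf),
            "-profile:v", profile, "-level", level]
    if max_bitrate_kbps:
        # Cap the peak rate; a buffer of twice the cap is an arbitrary choice.
        args += ["-maxrate", "%dk" % max_bitrate_kbps,
                 "-bufsize", "%dk" % (2 * max_bitrate_kbps)]
    return args


print(" ".join(x264_args(preset="slow", crf=20, max_bitrate_kbps=3000)))
```

A capped encode like the last call is the sort of setting that would suit the 3 Mbit secondary files mentioned earlier in the thread.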

I also added faststart and removed the MOOV step, and when doing remuxing it's quite a bit faster this way. I've run about 50 different types of files through it and all look good so far. If you can remember what type of problem you had with it before I'll check it out.

I noticed if you transcode a file that's not 29.97 frame rate that it doesn't do the best job as you get a "transfer" as the frame rates are different from source to destination. Maybe this is what you experienced previously when playing with CR. I'll tackle that one in a bit along with a few other things that could/should be automated for better transcode results.

Later tonight I want to throw an hour or so at it and set it up to do automatic distributed processing so multiple computers if available can share the load. A "Cluster" if you will. This will then allow me to throw it into production on my side and be able to QA the results well and "tune/fix" any problems.

I'll play around for a couple of days and give you back the code and you can look it over and decide if you want to use any of it. You may or may not depending on the particular change (and why).

Thanks again for all your help, Carlo

PS Feel free to close this issue out any time you wish as all the "original" stuff was handled nicely and what I'm doing now is beyond this issue. :)

mdhiggins commented 9 years ago

I'll keep it open as I'm curious to see how you implement the changes you're talking about. I know some of the things you've discussed (such as using FAAC) can already be done with a properly configured autoProcess.ini, but a lot of it is stuff I haven't bothered to deal with since I think the vast majority of users aren't looking to fine tune things that much. But definitely share once you get it up and running so I can take a look.

cayars commented 9 years ago

Sounds good Michael.

zybeon commented 9 years ago

I'm interested in seeing the changes you made if you want to share.