starkdmi / MediaToolSwift

Advanced Swift library for media conversion and manipulation
https://starkdmi.github.io/MediaToolSwift/documentation/mediatoolswift
Mozilla Public License 2.0
80 stars 8 forks source link

Skip audio compression on bitrate <64kbit/s #1

Closed twilightDD closed 11 months ago

twilightDD commented 11 months ago

Thank you for your work and the MediaToolSwift. It helps me a lot to transcode videos, user import into my app.

As it turns out I have a use-case where the check for a passthrough audio track works different, as a user would expect it.

Sometime, user will import a video that has an audio track already encoded with aac and a bitrate on 32kbit/s. While there is a limitation on the Apple frameworks (does not allow a bitrate lower than 64kbit/s) it seems appropriate to passthrough the source audio stream and skip the compression.

In the method initAudio() in Video.swift there is a check to compare the audio settings.

 // Compare source audio settings with output to possibly skip the compression
let defaultSettings = CompressionAudioSettings()
if audioFormatID == codec.formatId, 
   audioSettings.bitrate == defaultSettings.bitrate, 
   !((audioFormatID == kAudioFormatMPEG4AAC || audioFormatID == kAudioFormatFLAC) 
     && audioSettings.quality != defaultSettings.quality),
   audioSettings.sampleRate == defaultSettings.sampleRate
   {
      variables.hasChanges = false
   }

The thing is, that the check for audioFormatID == kAudioFormatMPEG4AAC will force MediaToolSwift to re-encode the audio track, even the source is already aac but has a lower bit rate than audioSettings.bitrate.

Also, the check for audioSettings.bitrate == defaultSettings.bitrate will (almost every time) fail as default.bitrate is .auto and in most use cases a user will give a bitrate value.


For myself I use something like that as a work around:

// To avoid re-encoding if source has a lower audio bitrate than selected target audio bitrate, check it.
let sourceAudioBitrate = audioTrack!.estimatedDataRate
let dummySourceAudioSettings = CompressionAudioSettings(codec: audioSettings.codec,
                                                           bitrate: .value(Int(sourceAudioBitrate)),
                                                           quality: audioSettings.quality,
                                                           sampleRate: audioSettings.sampleRate)
if audioFormatID == codec.formatId,
   dummySourceAudioSettings.bitrate <= audioSettings.bitrate,
   audioSettings.quality == defaultSettings.quality,
   audioSettings.sampleRate == defaultSettings.sampleRate
   {
      variables.hasChanges = false

      sourceFormatHint = audioDescription
   }

Be aware, that sourceFormatHint = audioDescription is needed in my case.

In AudioBitrate.swift I added an Equatable for <=, too.

public static func <= (lhs: CompressionAudioBitrate, rhs: CompressionAudioBitrate) -> Bool { 
   switch (lhs, rhs) {
      case (.auto, .auto):
         return false
      case (.value(let lhsValue), .value(let rhsValue)):
         return lhsValue <= rhsValue
      default:
         return false
   }
}

Edit: correct check is dummySourceAudioSettings.bitrate <= audioSettings.bitrate,

starkdmi commented 11 months ago

I see, there is a room for improvements. Is it correct what in your case audio track with 128 Kbps will not be re-encoded to 192 Kbps when requested to?

twilightDD commented 11 months ago

Is it correct what in your case audio track with 128 Kbps will not be re-encoded to 192 Kbps when requested to?

Yes, that was my goal and in such a case it will just pass through the audioTrack.

Extension: Right now I try to figure out to do the same thing for the videoTrack. a) Video 900x500 with a bitrate of 500kBit/s and a target resolution of 1280x720 and a target bitrate of 4.000kBit/s. => videoTrack can be passed through. b) Video 1920x1080 with a bitrate of 500kBit/s and a target resolution of 1280x720 and a target bitrate of 4.000kBit/s. => videoTrack has to be scaled, but transcoding should use 500kBit/s as bitrate.

But I will open a new issue for this if I have a "solution", if this is okay for you.

starkdmi commented 11 months ago

The video bitrate could normally be scaled though, when converting from HEVC to H.264 with the same quality you will increase bitrate by about 40%. I decided to not scale up the video resolution or frame rate when requested too, but it seems more logical to me to not apply it here.

@twilightDD, I guess yours A and B options are for the same codec?

Note: Video bitrate is used only for HEVC and H.264, audio bitrate applied only to AAC and OPUS.

twilightDD commented 11 months ago

The video bitrate could normally be scaled though, when converting from HEVC to H.264 with the same quality you will increase bitrate by about 40%. I decided to not scale up the video resolution or frame rate when requested too, but it seems more logical to me to not apply it here.

Yes, I agree with you. The handling on "changing codecs" (i.e. from hevc to h264) should be considered, too.

A scale up of a video track (from 960x540 to 1280x720) could be an option but I'm not sure about a typical use case.

@twilightDD, I guess yours A and B options are for the same codec?

At the moment: Yes.

Note: Video bitrate is used only for HEVC and H.264, audio bitrate applied only to AAC and OPUS.

Thank for pointing that out; at the moment I only offer to choose transcoding to hevc or h264 for the video track and will always use aac as audio codec.

starkdmi commented 11 months ago

I added redunant compression flag for video track based on bitrate, but didn't include source bitrate usage when target is higher.

twilightDD commented 11 months ago

I will try it - tonight or on friday. You will get a feedback!

Thank you so far!

twilightDD commented 11 months ago

Comment removed

twilightDD commented 11 months ago

Tests with target settings: mp4, 1280x1280, 2 Mbit video, 128 kbit audio

  • 2.1. sourceVideo: mp4, 3840x2160, 5,2 Mbit video, 74,58 kbit audio (same file as 1.4) -> MediaCompressor Error: Couldn't add audio to writer This I don't understand at the moment.

Now I remember - I had a the same problems on my very first try with your library!

In Video.swift method initAudio() (last lines):

      variables.audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioParameters,
                                                sourceFormatHint: sourceFormatHint)
    }
    return variables

sourceFormatHintis empty at this moment (and with these settings). So I replaced it by audioDescriptionand it works!

      variables.audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioParameters,
                                                sourceFormatHint: audioDescription)
    }
    return variables

transcoder tests with settings: mp4, 1280x1280, 2Mbit video, 128kbit audio

twilightDD commented 11 months ago

Ha! I just saw your commit: "added a commit that referenced this issue 1 hour ago"

I will try that now ;)

twilightDD commented 11 months ago

Tests with target settings: mp4, 1280x1280, 2 Mbit video, 64 kbit audio

Tests with target settings: mp4, 1280x1280, 2 Mbit video, 128 kbit audio

twilightDD commented 11 months ago

I'm just thinking about the error MediaCompressor Error: Compression is redunant - video and audio won't be modified during the compression.

Is this really an error? Or maybe something like an indication that no encoding took place because it didn't make sense?

Is it a good idea to extend the CompressionState to include a case like "compressionRedundant"?

starkdmi commented 11 months ago

@twilightDD, first of all, thank you for you descriptive responses) I'd like to get those test files if possible, so I could add them to tests and iterate from there (I'll try to use just specs anyway).

In 1.1 the progress based on video frames proceed, but I didn't get why this can get over 1.0 when video isn't re-encoded as it still writes video to file as it is.

In 1.4 the file succeed encoding video and audio but in 2.1 fails, interesting to see that file as minimum. Setting sourceFormatHint in some cases to nil was intended, need to re-check your case.

I don't see the CompressionRedundant as a state, but more like indicator of compression what make no sense. Importantly the compression/copying is not done, so file at destination is not created - more like an error to me, could be changed in future though.

twilightDD commented 11 months ago

@twilightDD, first of all, thank you for you descriptive responses) I'd like to get those test files if possible, so I could add them to tests and iterate from there (I'll try to use just specs anyway).

Well, the test videos I use are from different sources and (to be honest) not really free from copyrights or private issues 😶‍🌫️. I will send you an e-mail to discuss this, okay?

I don't see the CompressionRedundant as a state, but more like indicator of compression what make no sense. Importantly the compression/copying is not done, so file at destination is not created - more like an error to me, could be changed in future though.

Good point. It was just a glimpse.

starkdmi commented 11 months ago

the test videos I use are from different sources and (to be honest) not really free from copyrights or private issues

No problem, I have a lot video files already, so I'll use them first, then we'll see.

starkdmi commented 11 months ago

The progress is based on proceed frames / source frames, and the source frames amount is calculated via:

var totalFrames = Int64(ceil(durationInSeconds * Double(nominalFrameRate)))

There might be a little difference in those. You could actually compare the ffmpeg response:

ffprobe -v error -select_streams v:0 -count_frames -show_entries stream=nb_read_frames -print_format csv video.mp4

with calculated value in progress callback:

print(progress.totalUnitCount)

For now I see as an option to skip the progress values higher than 1.0 internally.

starkdmi commented 11 months ago

@twilightDD, there is a way to get a better description for the case 2.1 related to MediaCompressor Error: Couldn't add audio to writer. By replacing this code block:

// Append audio to writer
guard writer.canAdd(audioVariables.audioInput!) else {
    callback(.failed(CompressionError.failedToWriteAudio))
    return task
}
writer.add(audioVariables.audioInput!)

from Video.swift with this:

do {
    try ObjCExceptionCatcher.catchException {
        writer.add(audioVariables.audioInput!)
    }
} catch let error {
    callback(.failed(error))
    return task
}

I wasn't used this in final library code due to Apple suggestions to use writer.canAdd(:) first, but in Swift it gives you no information.

twilightDD commented 11 months ago

@twilightDD, there is a way to get a better description for the case 2.1 related to MediaCompressor Error: Couldn't add audio to writer. By replacing this code block:

Well, the thing is - at the moment I don't use your code as package or cocoapod* and Xcode does not found ObjCExceptionCatcher at all, I had to comment these lines out -.-

But I was able to use your example project to reproduce this error.

Error: *** -[AVAssetWriter addInput:] In order to perform passthrough to file type public.mpeg-4,
please provide a format hint in the AVAssetWriterInput initializer

I caught this error a little while ago and got the idea to replace sourceFormatHint by audioDescription as I mentioned before.


*) I had some troubles with cocoapods in the past and the most annoying thing about SPM is that I can't include the packages into my repo. Maybe you have an idea to solve the latter?

starkdmi commented 11 months ago

I've tested videos and can replicate the progress issue. In Swift 134 frames detected, ffmpeg shows 133, but the callback is actually called 137 times. As it will always be close to the original value I could limit progress notifications to 1.0 max. Will investigate a bit more there.

With the second file I have different errors than you. Audio encoding is skipped (due to an internal error) when using aac with lower bitrate, but should be applied, opus works fine. The logs for aac:

[AQ] AudioQueueObject.cpp:2958  _Start: Error (-4) getting reporterIDs
[AudioMetadata] AudioQueueObject.cpp:5919  DetermineMetadataTimestamp unable to obtain valid timestamp, estimate:412148.000000

Which possibly indicate an invalid audio track properties (even knowing you used QuickTime and HandBrake).

I also tried to set sourceFormatHint to audioDescription and the audio track just fails to be added.

starkdmi commented 11 months ago

Without using ObjCExceptionCatcher the project with library code will crash at runtime.

Probably you could fork a repo and use it in CocoaPods via:

pod 'MediaToolSwift', :git => 'https://github.com/twilightDD/MediaToolSwift.git', :branch => 'main'

or in SPM using:

dependencies: [
    .package(url: "https://github.com/twilightDD/MediaToolSwift.git", branch: "main")
]
twilightDD commented 11 months ago

I've tested videos and can replicate the progress issue. In Swift 134 frames detected, ffmpeg shows 133, but the callback is actually called 137 times. As it will always be close to the original value I could limit progress notifications to 1.0 max. Will investigate a bit more there.

This are bad ugly effects I hate the most - "its almost correct this way but actually not". I feel you.


I also tried to set sourceFormatHint to audioDescription and the audio track just fails to be added.

Well, I just re-tried it with following settings:

with sourceFormatHint the error occurs, with audioDescription it encodes

sourceFormatHint: is nil

audioDescription:

audioDescription: Optional(<CMAudioFormatDescription 0x600003305040 [0x7ff86002ecc0]> {
    mediaType:'soun' 
    mediaSubType:'aac ' 
    mediaSpecific: {
        ASBD: {
            mSampleRate: 44100.000000 
            mFormatID: 'aac ' 
            mFormatFlags: 0x0 
            mBytesPerPacket: 0 
            mFramesPerPacket: 1024 
            mBytesPerFrame: 0 
            mChannelsPerFrame: 2 
            mBitsPerChannel: 0  } 
        cookie: {<CFData 0x60000292ea70 [0x7ff86002ecc0]>{length = 39, capacity = 39, bytes = 0x03808080220002000480808014401500 ... 1210068080800102}} 
        ACL: {Stereo (L R)}
        FormatList Array: {
            Index: 0 
            ChannelLayoutTag: 0x650002 
            ASBD: {
            mSampleRate: 44100.000000 
            mFormatID: 'aac ' 
            mFormatFlags: 0x0 
            mBytesPerPacket: 0 
            mFramesPerPacket: 1024 
            mBytesPerFrame: 0 
            mChannelsPerFrame: 2 
            mBitsPerChannel: 0  }} 
    } 
    extensions: {{
    VerbatimISOSampleEntry = {length = 87, bytes = 0x00000057 6d703461 00000000 00000001 ... 12100680 80800102 };
}}
})
twilightDD commented 11 months ago

Without using ObjCExceptionCatcher the project with library code will crash at runtime.

Probably you could fork a repo and use it in CocoaPods via:

pod 'MediaToolSwift', :git => 'https://github.com/twilightDD/MediaToolSwift.git', :branch => 'main'

or in SPM using:

dependencies: [
    .package(url: "https://github.com/twilightDD/MediaToolSwift.git", branch: "main")
]

Yeah, thats the way. But does not help me with the fact that I can't include the SPM-package to my repo (what I really prefer. Sometimes I'm out of nowhere and do not have any internet connectivity. Just Germany problems one may say.)

starkdmi commented 11 months ago

@twilightDD, clone Github repo, then in Xcode menu File->Add Package Dependencies, and button Add Local at the bottom. It will recognize Package.swift and you will be able to modify the code.

starkdmi commented 11 months ago

@twilightDD, setting sourceFormatHint = audioDescription for the audio without re-encoding make sense, fixed on main branch.