pro100andrey / lame

Swift Lame Framework
MIT License

Facing issue converting an m4a to mp3 #12

Closed tryWabbit closed 1 year ago

tryWabbit commented 1 year ago

Thanks for creating this library!

I'm trying to convert an m4a file to mp3, but I'm getting corrupt audio. I'm using the example project and just replaced the input file with my m4a file.

In the debugger I'm getting

void * _Nullable NSMapGet(NSMapTable * _Nonnull, const void * _Nullable): map table argument is NULL

The file is exported, but it is corrupt. Any suggestions or feedback on what I'm doing wrong?

pro100andrey commented 1 year ago

Hi @tryWabbit, you are not decoding the M4A audio into raw (PCM) audio data before sending it to LAME.

"PCM audio" means no compression.

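To make "no compression" concrete: raw PCM stores every sample at full width, so its size follows directly from the sample rate, channel count, and bit depth. A minimal sketch of that arithmetic; the `PCMFormat` type is hypothetical and introduced only for illustration, it is not part of LAME or Core Audio:

```swift
import Foundation

// Hypothetical helper type, used only to illustrate the arithmetic.
struct PCMFormat {
    let sampleRate: Int      // frames per second, e.g. 44100
    let channels: Int        // 1 = mono, 2 = stereo
    let bitsPerSample: Int   // 16 for the Int16 samples used in this thread

    // One frame holds one sample for every channel.
    var bytesPerFrame: Int { channels * bitsPerSample / 8 }

    // Uncompressed size of `seconds` of audio in this format.
    func byteCount(seconds: Int) -> Int {
        sampleRate * bytesPerFrame * seconds
    }
}

let mono = PCMFormat(sampleRate: 44100, channels: 1, bitsPerSample: 16)
let stereo = PCMFormat(sampleRate: 44100, channels: 2, bitsPerSample: 16)
print(mono.byteCount(seconds: 1))    // 88200
print(stereo.byteCount(seconds: 1))  // 176400
```

An M4A file holds compressed data instead (typically AAC), which is why it has to be decoded into this raw form before LAME can encode it.
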
tryWabbit commented 1 year ago

Thanks for the quick reply @pro100andrey! I have limited experience with audio conversions. Can you explain in detail what changes I need to make to get an mp3 file out of an m4a file? I really appreciate your help!

pro100andrey commented 1 year ago

Certainly, @tryWabbit! Converting an M4A file to MP3 involves a few steps. I'll explain in detail:

Thanks!

pro100andrey commented 1 year ago

@tryWabbit

If you're recording audio directly from a microphone and want to save it in MP3 format, you can use the LAME library to convert PCM audio to MP3 on the fly. LAME provides the capability to encode audio into MP3 format while recording or streaming.

For example, you could encode inside an AURenderCallback.

tryWabbit commented 1 year ago

Thank you so much for the detailed explanation @pro100andrey Really appreciate your help! I will try and update you

tryWabbit commented 1 year ago

Hey @pro100andrey, I tried your solution and it worked! The only problem I have right now is that the converted audio is half the length and plays as if it were in fast forward. I'm sure there is a difference between the configuration I'm converting with and the configuration your conversion expects. Can you verify?

This is the method I'm using to convert m4a to wav. The output is fine and matches the audio.

    func convertAudio(_ url: URL, outputURL: URL) {
        var error : OSStatus = noErr
        var destinationFile : ExtAudioFileRef? = nil
        var sourceFile : ExtAudioFileRef? = nil

        var srcFormat : AudioStreamBasicDescription = AudioStreamBasicDescription()
        var dstFormat : AudioStreamBasicDescription = AudioStreamBasicDescription()

        ExtAudioFileOpenURL(url as CFURL, &sourceFile)

        var thePropertySize: UInt32 = UInt32(MemoryLayout.stride(ofValue: srcFormat))

        ExtAudioFileGetProperty(sourceFile!,
            kExtAudioFileProperty_FileDataFormat,
            &thePropertySize, &srcFormat)

        dstFormat.mSampleRate = 44100  //Set sample rate
        dstFormat.mFormatID = kAudioFormatLinearPCM
        dstFormat.mChannelsPerFrame = 1
        dstFormat.mBitsPerChannel = 16
        dstFormat.mBytesPerPacket = 2 * dstFormat.mChannelsPerFrame
        dstFormat.mBytesPerFrame = 2 * dstFormat.mChannelsPerFrame
        dstFormat.mFramesPerPacket = 1
        dstFormat.mFormatFlags = kLinearPCMFormatFlagIsPacked |
        kAudioFormatFlagIsSignedInteger

        // Create destination file
        error = ExtAudioFileCreateWithURL(
            outputURL as CFURL,
            kAudioFileWAVEType,
            &dstFormat,
            nil,
            AudioFileFlags.eraseFile.rawValue,
            &destinationFile)
        reportError(error: error)

        error = ExtAudioFileSetProperty(sourceFile!,
                kExtAudioFileProperty_ClientDataFormat,
                thePropertySize,
                &dstFormat)
        reportError(error: error)

        error = ExtAudioFileSetProperty(destinationFile!,
                                         kExtAudioFileProperty_ClientDataFormat,
                                        thePropertySize,
                                        &dstFormat)
        reportError(error: error)

        let bufferByteSize : UInt32 = 32768
        var srcBuffer = [UInt8](repeating: 0, count: 32768)
        var sourceFrameOffset : ULONG = 0

        while(true){
            var fillBufList = AudioBufferList(
                mNumberBuffers: 1,
                mBuffers: AudioBuffer(
                    mNumberChannels: 2,
                    mDataByteSize: UInt32(srcBuffer.count),
                    mData: &srcBuffer
                )
            )
            var numFrames : UInt32 = 0

            if(dstFormat.mBytesPerFrame > 0){
                numFrames = bufferByteSize / dstFormat.mBytesPerFrame
            }

            error = ExtAudioFileRead(sourceFile!, &numFrames, &fillBufList)
            reportError(error: error)

            if(numFrames == 0){
                error = noErr;
                break;
            }

            sourceFrameOffset += numFrames
            error = ExtAudioFileWrite(destinationFile!, numFrames, &fillBufList)
            reportError(error: error)
        }

        error = ExtAudioFileDispose(destinationFile!)
        reportError(error: error)
        error = ExtAudioFileDispose(sourceFile!)
        reportError(error: error)
    }

Your method for converting wav to mp3:

class func encodeToMp3(
        inPcmPath: String,
        outMp3Path: String,
        onProgress: @escaping (Float) -> (Void),
        onComplete: @escaping () -> (Void)
    ) {

        encoderQueue.async {

            let lame = lame_init()
            lame_set_in_samplerate(lame, 44100)
            lame_set_out_samplerate(lame, 0)
            lame_set_brate(lame, 0)
            lame_set_quality(lame, 4)
            lame_set_VBR(lame, vbr_off)
            lame_init_params(lame)

            let pcmFile: UnsafeMutablePointer<FILE> = fopen(inPcmPath, "rb")
            fseek(pcmFile, 0 , SEEK_END)

            let fileSize = ftell(pcmFile)
            // Skip file header.
            let pcmHeaderSize = 48 * 8
            fseek(pcmFile, pcmHeaderSize, SEEK_SET)

            let mp3File: UnsafeMutablePointer<FILE> = fopen(outMp3Path, "wb")

            let pcmSize = 1024 * 8
            let pcmbuffer = UnsafeMutablePointer<Int16>.allocate(capacity: Int(pcmSize * 2))

            let mp3Size: Int32 = 1024 * 8
            let mp3buffer = UnsafeMutablePointer<UInt8>.allocate(capacity: Int(mp3Size))

            var write: Int32 = 0
            var read = 0

            repeat {

                let size = MemoryLayout<Int16>.size * 2
                read = fread(pcmbuffer, size, pcmSize, pcmFile)
                // Progress
                if read != 0 {
                    let progress = Float(ftell(pcmFile)) / Float(fileSize)
                    DispatchQueue.main.sync { onProgress(progress) }
                }

                if read == 0 {
                    write = lame_encode_flush_nogap(lame, mp3buffer, mp3Size)
                } else {
                    write = lame_encode_buffer_interleaved(lame, pcmbuffer, Int32(read), mp3buffer, mp3Size)
                }

                fwrite(mp3buffer, Int(write), 1, mp3File)

            } while read != 0

            // Clean up
            lame_close(lame)
            fclose(mp3File)
            fclose(pcmFile)

            pcmbuffer.deallocate()
            mp3buffer.deallocate()

            DispatchQueue.main.sync { onComplete() }
        }
    }

Can you identify the difference between the two conversions that is causing the audio to play fast, at half of the actual duration?

Thanks

pro100andrey commented 1 year ago

Hi @tryWabbit, if your file contains 1 channel, try setting lame_set_num_channels to 1.

/* number of channels in input stream. default=2 */
int CDECL lame_set_num_channels(lame_global_flags *, int);
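
Why the channel count matters here: if the encoder assumes two interleaved channels but the data is mono, each pair of consecutive mono samples is consumed as one stereo frame, so the frame count (and with it the duration) is halved. A rough sketch of that arithmetic, using a hypothetical helper name:

```swift
import Foundation

// Hypothetical helper: how long the encoder believes the audio is, given
// how many channels it assumes are interleaved in the sample stream.
func perceivedDuration(sampleCount: Int, sampleRate: Int, assumedChannels: Int) -> Double {
    let frames = sampleCount / assumedChannels   // one frame = one sample per channel
    return Double(frames) / Double(sampleRate)
}

// Ten seconds of 44.1 kHz mono audio is 441_000 samples.
let samples = 441_000
let asMono = perceivedDuration(sampleCount: samples, sampleRate: 44100, assumedChannels: 1)
let asStereo = perceivedDuration(sampleCount: samples, sampleRate: 44100, assumedChannels: 2)
print(asMono)    // 10.0
print(asStereo)  // 5.0
```

This matches the symptom reported above: a mono WAV fed to an encoder left at its default of 2 channels comes out at half the duration.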

pro100andrey commented 1 year ago

@tryWabbit I noticed that you can improve your code. You can combine the two functions func convertAudio(_ url:, outputURL:) and func encodeToMp3(inPcmPath:, outMp3Path:, onProgress:, onComplete:) into one. After ExtAudioFileRead(sourceFile!, &numFrames, &fillBufList), you can encode fillBufList directly to MP3 without first saving an intermediate WAV file.

Think about it :).
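
One piece of such a combined version is handing the bytes that ExtAudioFileRead produced to LAME as 16-bit samples. A minimal sketch of that reinterpretation, assuming packed, signed, little-endian 16-bit PCM as in the dstFormat above. The helper name `int16Samples` is made up for illustration, and the LAME call appears only as a comment:

```swift
import Foundation

// Hypothetical helper: reinterpret raw PCM bytes as signed 16-bit samples.
// Assumes little-endian, packed 16-bit PCM.
func int16Samples(from bytes: [UInt8], byteCount: Int) -> [Int16] {
    let usable = min(byteCount, bytes.count) & ~1   // whole samples only
    var samples = [Int16]()
    samples.reserveCapacity(usable / 2)
    var i = 0
    while i < usable {
        let low = UInt16(bytes[i])
        let high = UInt16(bytes[i + 1])
        samples.append(Int16(bitPattern: (high << 8) | low))
        i += 2
    }
    return samples
}

let raw: [UInt8] = [0x01, 0x00, 0xFF, 0xFF, 0x00, 0x80]
let pcm = int16Samples(from: raw, byteCount: raw.count)
print(pcm)  // [1, -1, -32768]

// In the read loop, the samples could then go straight to the encoder,
// for mono input roughly like this (right channel unused):
// lame_encode_buffer(lame, pcm, pcm, Int32(pcm.count), mp3buffer, mp3Size)
```

In the real loop, `byteCount` would be the number of frames returned by ExtAudioFileRead multiplied by dstFormat.mBytesPerFrame, since the last read usually fills only part of the buffer.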

tryWabbit commented 1 year ago

> Hi @tryWabbit, if your file contains 1 channel, try setting lame_set_num_channels to 1.
>
> /* number of channels in input stream. default=2 */
> int CDECL lame_set_num_channels(lame_global_flags *, int);

I set the number of channels to 1 with lame_set_num_channels(lame, 1), but it still generates audio that is half the duration and plays fast.

I'm sorry if I'm not making sense. I don't have any experience with LAME and C APIs, and I have limited experience with audio APIs for iOS.

tryWabbit commented 1 year ago

> @tryWabbit I noticed that you can improve your code. You can combine the two functions func convertAudio(_ url:, outputURL:) and func encodeToMp3(inPcmPath:, outMp3Path:, onProgress:, onComplete:) into one. After ExtAudioFileRead(sourceFile!, &numFrames, &fillBufList), you can encode fillBufList directly to MP3 without first saving an intermediate WAV file.
>
> Think about it :).

Thank you so much for suggesting that. I will definitely focus on this once I get it working.

tryWabbit commented 1 year ago

I uploaded the test project I'm experimenting with, in case you want to see the issue: https://github.com/tryWabbit/Audio-Conversion

pro100andrey commented 1 year ago

@tryWabbit here is the result without the issue (I replaced lame_encode_buffer_interleaved with lame_encode_buffer, passing an empty right channel):

//
//  AudioConverter.swift
//  Example
//
//  Created by Andrey on 20.11.2020.
//

import Foundation
import lame

class AudioConverter {

    private static let encoderQueue = DispatchQueue(label: "com.audio.encoder.queue")

    class func encodeToMp3(
        inPcmPath: String,
        outMp3Path: String,

        onProgress: @escaping (Float) -> (Void),
        onComplete: @escaping () -> (Void)
    ) {

        encoderQueue.async {

            let numOfChannels: Int32 = 1

            let lame = lame_init()
            lame_set_in_samplerate(lame, 44100)
            lame_set_out_samplerate(lame, 0)
            lame_set_brate(lame, 0)
            lame_set_quality(lame, 4)
            lame_set_VBR(lame, vbr_off)
            lame_set_num_channels(lame, numOfChannels)
            lame_init_params(lame)

            let pcmFile: UnsafeMutablePointer<FILE> = fopen(inPcmPath, "rb")
            fseek(pcmFile, 0 , SEEK_END)

            let fileSize = ftell(pcmFile)
            // Skip file header.
            let pcmHeaderSize = 48 * 8
            fseek(pcmFile, pcmHeaderSize, SEEK_SET)

            let mp3File: UnsafeMutablePointer<FILE> = fopen(outMp3Path, "wb")

            let pcmSize = 1024 * 8
            let pcmbuffer = UnsafeMutablePointer<Int16>.allocate(capacity: Int(pcmSize * 2))

            let mp3Size: Int32 = 1024 * 8
            let mp3buffer = UnsafeMutablePointer<UInt8>.allocate(capacity: Int(mp3Size))

            var write: Int32 = 0
            var read = 0

            repeat {

                let size = MemoryLayout<Int16>.size * Int(numOfChannels)
                read = fread(pcmbuffer, size, pcmSize, pcmFile)
                // Progress
                if read != 0 {
                    let progress = Float(ftell(pcmFile)) / Float(fileSize)
                    DispatchQueue.main.sync { onProgress(progress) }
                }

                if read == 0 {
                    write = lame_encode_flush_nogap(lame, mp3buffer, mp3Size)
                } else {
                    write = lame_encode_buffer(lame, pcmbuffer, [], Int32(read), mp3buffer, mp3Size)
                }

                fwrite(mp3buffer, Int(write), 1, mp3File)

            } while read != 0

            // Clean up
            lame_close(lame)
            fclose(mp3File)
            fclose(pcmFile)

            pcmbuffer.deallocate()
            mp3buffer.deallocate()

            DispatchQueue.main.sync { onComplete() }
        }
    }
}

tryWabbit commented 1 year ago

Thank you so much @pro100andrey! The mistake I made was putting lame_set_num_channels(lame, 1) after lame_init_params(lame).

Really appreciate your help : )
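
For reference, the ordering that resolved this issue can be sketched as a short configuration fragment. It assumes the `lame` module from this repository is linked and uses only calls that already appear in the thread; the specific values are the ones from the example above, not recommendations:

```swift
import lame

// Assumes the `lame` module from this repository is available.
let lame = lame_init()

// All lame_set_* calls go first...
lame_set_in_samplerate(lame, 44100)
lame_set_num_channels(lame, 1)   // mono input, as in this thread
lame_set_quality(lame, 4)
lame_set_VBR(lame, vbr_off)

// ...because lame_init_params() is the point where the settings take effect.
// Per the resolution above, a setter called after it did not change the encoding.
let status = lame_init_params(lame)
if status < 0 {
    print("lame_init_params failed: \(status)")
}

// ... encode with lame_encode_buffer(...) here ...

lame_close(lame)
```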