livekit / client-sdk-android

LiveKit SDK for Android
https://docs.livekit.io
Apache License 2.0

Issues with capturePostProcessor's processAudio Functionality #398

Closed balazsbanto closed 5 months ago

balazsbanto commented 6 months ago

Describe the bug
When I apply custom post-processing to the microphone by overriding capturePostProcessor through LiveKitOverrides, the audio output is corrupted. Using a simple gain change as an example, I encounter one of two problems:

- If I do not adjust the endianness of the buffer, I only hear bursts of noise.
- If I set the endianness to ByteOrder.LITTLE_ENDIAN, the voice becomes somewhat audible but is still very noisy.

To Reproduce
Steps to reproduce the behavior: modify the sample app by overriding capturePostProcessor:

CallViewModel(
            url = args.url,
            token = args.token,
            e2ee = args.e2eeOn,
            e2eeKey = args.e2eeKey,
            stressTest = args.stressTest,
            application = application,
            audioProcessorOptions = AudioProcessorOptions(
                capturePostProcessor = object : AudioProcessorInterface {
                    override fun isEnabled(): Boolean {
                        return true
                    }

                    override fun getName(): String {
                        return ""
                    }

                    // It's called with sampleRateHz = 48000, numChannels = 1
                    override fun initializeAudioProcessing(sampleRateHz: Int, numChannels: Int) {

                    }

                    override fun resetAudioProcessing(newRate: Int) {

                    }

                    // It's called with numBands = 3, numFrames = 480
                    override fun processAudio(numBands: Int, numFrames: Int, buffer: ByteBuffer) {
                        val numSamples = numFrames // numChannels is 1

                        // If I don't set the byte order to LITTLE_ENDIAN, I only hear bursts of noise when I speak.
                        // With LITTLE_ENDIAN set, my voice is audible but still very noisy.
                        buffer.order(ByteOrder.LITTLE_ENDIAN)

                        for (i in 0 until numSamples) {
                            val currentPos = buffer.position()
                            // Assuming ENCODING_PCM_16BIT
                            val sample = buffer.getShort()

                            val processedSample = (sample * 0.01f ) // tried with other multipliers too

                            buffer.position(currentPos)
                            buffer.putShort(processedSample.toInt().toShort())
                        }
                    }

                },
                capturePostBypass = false,
            ),

            )

Expected behavior
The voice volume should be noticeably lower when the capturePostProcessor is enabled, without introducing noise or otherwise altering the quality of the audio.

Device Info:

Additional context
Adjusting the buffer's endianness was an attempt to mitigate the noise issue, but it only partially improves the audio quality, leaving it still very noisy.

davidliu commented 5 months ago

Hey, I think these may actually be floats. That would explain why the buffer capacity is 4 * the number of frames (4 bytes per 32-bit float sample). Not sure why they're floats, since it should be PCM_16bit by default, but I guess it gets converted to floats around this step of the process.

Changing the getShort/putShort calls to getFloat/putFloat works.
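
For anyone landing here later, a minimal sketch of that change, written as a standalone function so it can be tried without the SDK. The name applyGain is hypothetical and not part of LiveKit. It assumes the buffer holds numFrames mono 32-bit float samples, as suggested above, and that they are in the platform's native byte order; the byte order is a guess on my part and is not confirmed anywhere in this thread.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper, not part of the LiveKit SDK.
// Assumption: the buffer holds numFrames mono 32-bit float samples
// in native byte order.
fun applyGain(buffer: ByteBuffer, numFrames: Int, gain: Float) {
    // A ByteBuffer defaults to BIG_ENDIAN, so set the order explicitly.
    buffer.order(ByteOrder.nativeOrder())
    for (i in 0 until numFrames) {
        val offset = i * Float.SIZE_BYTES
        // Absolute get/put leave the buffer's position untouched,
        // so there is no need to save and restore it.
        val sample = buffer.getFloat(offset)
        buffer.putFloat(offset, sample * gain)
    }
}

fun main() {
    val numFrames = 480
    val buffer = ByteBuffer.allocateDirect(numFrames * Float.SIZE_BYTES)
        .order(ByteOrder.nativeOrder())
    for (i in 0 until numFrames) {
        buffer.putFloat(i * Float.SIZE_BYTES, 1000f)
    }
    applyGain(buffer, numFrames, 0.5f)
    println(buffer.getFloat(0)) // prints 500.0
}
```

Inside the processAudio override from the report, the loop could then be replaced with a call such as applyGain(buffer, numFrames, 0.01f).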

balazsbanto commented 5 months ago

Hey, indeed it's strange that the format is float. Anyway, thanks for the update! Please let me know if you figure out where the format comes from.