hendriks73 / ffsampledsp

FFmpeg based service provider for javax.sound.sampled.
GNU Lesser General Public License v2.1

Issue converting mp3 to 8kHz MULAW format #19

Closed by ovistoica 1 year ago

ovistoica commented 1 year ago

Hello & thank you for this great package!

I'm writing a phone playback service that requires me to send the audio to be played as a Base64-encoded message in MULAW format at an 8000 Hz sample rate.

From the Twilio docs:

{
  "mediaFormat": {
    "encoding": "audio/x-mulaw",
    "sampleRate": 8000,
    "channels": 1
  }
}
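
For illustration, here is a minimal sketch of how such a payload could be produced once a mu-law stream exists. The class and method names (`MulawPayload`, `toBase64`, `silentMulaw`) are hypothetical, the input is synthetic silence rather than real speech, and the sketch assumes the service wants the raw mu-law samples without any file header:

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;

class MulawPayload {
    // Hypothetical helper: drains the stream and Base64-encodes its raw bytes.
    static String toBase64(AudioInputStream stream) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = stream.read(buffer, 0, buffer.length)) > 0) {
            out.write(buffer, 0, n);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    // Hypothetical helper: builds 16-bit, 8 kHz, mono PCM silence and
    // converts it to mu-law (one byte per sample).
    static AudioInputStream silentMulaw(int frames) {
        AudioFormat pcm = new AudioFormat(8000f, 16, 1, true, false);
        byte[] silence = new byte[frames * 2]; // two bytes per 16-bit frame
        AudioInputStream pcmStream =
                new AudioInputStream(new ByteArrayInputStream(silence), pcm, frames);
        return AudioSystem.getAudioInputStream(AudioFormat.Encoding.ULAW, pcmStream);
    }

    public static void main(String[] args) throws IOException {
        String payload = toBase64(silentMulaw(8000));
        System.out.println(Base64.getDecoder().decode(payload).length + " mu-law bytes encoded");
    }
}
```

In a real service the stream passed to `toBase64` would be the converted speech rather than silence.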

Issue

When I try to convert, I always get an unsupported format error. The initial MP3 format is "MPEG-1, Layer 3 24000.0 Hz, unknown bits per sample, mono, unknown frame size, 41.666668 frames/second".

Things I tried

Converting using the format directly does not work

If I convert directly from MP3 to 8 kHz MULAW, I get an unsupported conversion error.

Converting to PCM_SIGNED, then to ULAW 24000 Hz, then to ULAW 8000 Hz

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

class Converter {
    private static final AudioFormat phoneFormat = new AudioFormat(AudioFormat.Encoding.ULAW, 8000, 8, 1, 1, -1, false);

    static public AudioInputStream convertToMulaw(AudioInputStream stream) {
        AudioInputStream signed = AudioSystem.getAudioInputStream(AudioFormat.Encoding.PCM_SIGNED, stream);
        AudioInputStream ulaw24khz = AudioSystem.getAudioInputStream(AudioFormat.Encoding.ULAW, signed);
        return AudioSystem.getAudioInputStream(phoneFormat, ulaw24khz);
    }

    public static void main(String[] args) {

        try {
            File mp3 = new File("resources/speech.mp3");
            //  format `"MPEG-1, Layer 3 24000.0 Hz, unknown bits per sample, mono, unknown frame size, 41.666668 frames/second"`
            AudioInputStream mp3Stream = AudioSystem.getAudioInputStream(mp3);
            AudioInputStream ulawStream = Converter.convertToMulaw(mp3Stream);
            AudioSystem.write(ulawStream, AudioSystem.getAudioFileFormat(mp3).getType(), new File("resources/speech.wav"));
        } catch (IOException | UnsupportedAudioFileException e) {
            return;
        }

    }
}

Output:

Nov 10, 2023 3:46:13 PM com.tagtraum.ffsampledsp.FFNativeLibraryLoader arch
INFO: Using arch=aarch64
[mp3 @ 0x1430e4000] Estimating duration from bitrate, this may be inaccurate
[mp3 @ 0x141808200] Estimating duration from bitrate, this may be inaccurate
Exception in thread "main" java.lang.IllegalArgumentException: Unsupported conversion: ULAW 8000.0 Hz, 8 bit, mono, 1 bytes/frame, unknown frame rate from ULAW 24000.0 Hz, 8 bit, mono, 1 bytes/frame
    at java.desktop/javax.sound.sampled.AudioSystem.getAudioInputStream(AudioSystem.java:894)
    at Converter.convertToMulaw(scratch_7.java:14)
    at Converter.main(scratch_7.java:23)
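
A conversion step can be checked before it is attempted, which avoids discovering the problem through an exception. The sketch below uses `AudioSystem.isConversionSupported` and `AudioSystem.getTargetFormats` from `javax.sound.sampled`; the class name `ConversionProbe` and its constants are hypothetical, and the results depend on which service providers are installed:

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;

class ConversionProbe {
    // 16-bit signed little-endian PCM, 8 kHz, mono.
    static final AudioFormat PCM_8K = new AudioFormat(8000f, 16, 1, true, false);
    // 8-bit mu-law, mono, at 8 kHz and at 24 kHz.
    static final AudioFormat ULAW_8K =
            new AudioFormat(AudioFormat.Encoding.ULAW, 8000f, 8, 1, 1, 8000f, false);
    static final AudioFormat ULAW_24K =
            new AudioFormat(AudioFormat.Encoding.ULAW, 24000f, 8, 1, 1, 24000f, false);

    // Asks the installed providers whether source can be converted to target.
    static boolean supported(AudioFormat target, AudioFormat source) {
        return AudioSystem.isConversionSupported(target, source);
    }

    public static void main(String[] args) {
        // The step that failed in the trace above: resampling while staying in ULAW.
        System.out.println("ULAW 24 kHz -> ULAW 8 kHz: " + supported(ULAW_8K, ULAW_24K));
        // Changing only the encoding at a fixed sample rate.
        System.out.println("PCM 8 kHz -> ULAW 8 kHz: " + supported(ULAW_8K, PCM_8K));
        // Lists every ULAW format reachable from 8 kHz PCM.
        for (AudioFormat f : AudioSystem.getTargetFormats(AudioFormat.Encoding.ULAW, PCM_8K)) {
            System.out.println("  reachable: " + f);
        }
    }
}
```

Probing each hop this way shows which single step in a chain is the unsupported one.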

Question

Do you have any idea how I should go about it? I know that FFmpeg should support this, but it may be that the library does not support this sample rate.

Can you point me in the right direction? I have been stuck on this issue for the last 4 days.

hendriks73 commented 1 year ago

Have you tried downsampling to 8 kHz before switching the encoding to ULAW?

ovistoica commented 1 year ago

Apparently this does not work either:

java.lang.IllegalArgumentException: Unsupported conversion: MPEG-1, Layer 3 8000.0 Hz, unknown bits per sample, mono, unknown frame size, 41.666668 frames/second
 from MPEG-1, Layer 3 24000.0 Hz, unknown bits per sample, mono, unknown frame size, 41.666668 frames/second

hendriks73 commented 1 year ago

This looks like you tried to downsample the MP3 before converting to PCM. I meant: first convert to PCM, then downsample, then switch the encoding to ULAW.

hendriks73 commented 1 year ago

I did some digging. This should work:

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

import static javax.sound.sampled.AudioFormat.Encoding.PCM_SIGNED;

public class Ulaw {
    public static void main(String[] args) throws UnsupportedAudioFileException, IOException {
        final AudioInputStream audioInputStream0 = AudioSystem.getAudioInputStream(new File("audio.mp3"));

        final AudioInputStream audioInputStream1 = toSignedPCM(audioInputStream0);
        final AudioInputStream audioInputStream2 = toMono(audioInputStream1);
        final AudioInputStream audioInputStream3 = to8kHz(audioInputStream2);
        final AudioInputStream audioInputStream4 = toULAW(audioInputStream3);
        System.out.println(audioInputStream4.getFormat());
    }

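    // Decodes to signed PCM at the source sample rate. Falls back to 16 bit when the
    // sample size is unknown (as with MP3) and keeps at most two channels.
    private static AudioInputStream toSignedPCM(final AudioInputStream audioInputStream) {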
        final AudioFormat format = audioInputStream.getFormat();

        final int sampleSizeInBits = format.getSampleSizeInBits() > 0 ? format.getSampleSizeInBits() : 16;
        final int channels = Math.min(format.getChannels(), 2);
        final int frameSize = format.getFrameSize() > 0 ? format.getFrameSize() : sampleSizeInBits * channels / 8;

        final AudioFormat format1 = new AudioFormat(
                PCM_SIGNED,
                format.getSampleRate(),
                sampleSizeInBits,
                channels,
                frameSize,
                format.getSampleRate(),
                format.isBigEndian()
        );
        return AudioSystem.getAudioInputStream(format1, audioInputStream);
    }

    // Downmixes to a single channel; the frame size becomes the size of one sample.
    private static AudioInputStream toMono(final AudioInputStream audioInputStream) {
        final AudioFormat format = audioInputStream.getFormat();

        final int frameSize = format.getSampleSizeInBits() / 8;
        final AudioFormat format1 = new AudioFormat(
                PCM_SIGNED,
                format.getSampleRate(),
                format.getSampleSizeInBits(),
                1,
                frameSize,
                format.getSampleRate(),
                format.isBigEndian()
        );
        return AudioSystem.getAudioInputStream(format1, audioInputStream);
    }

    // Resamples to 8 kHz, keeping sample size, channel count, and endianness.
    private static AudioInputStream to8kHz(final AudioInputStream audioInputStream) {
        final AudioFormat format = audioInputStream.getFormat();
        float sampleRate = 8000;

        final AudioFormat format1 = new AudioFormat(
                PCM_SIGNED,
                sampleRate,
                format.getSampleSizeInBits(),
                format.getChannels(),
                format.getFrameSize(),
                sampleRate,
                format.isBigEndian()
        );
        return AudioSystem.getAudioInputStream(format1, audioInputStream);
    }

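    // Switches the encoding to 8-bit ULAW. Each sample is one byte, so the
    // frame size in bytes equals the channel count.
    private static AudioInputStream toULAW(final AudioInputStream audioInputStream) {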
        final AudioFormat format = audioInputStream.getFormat();

        final AudioFormat format1 = new AudioFormat(
                AudioFormat.Encoding.ULAW,
                format.getSampleRate(),
                8,
                format.getChannels(),
                format.getChannels(),
                format.getFrameRate(),
                format.isBigEndian()
        );
        return AudioSystem.getAudioInputStream(format1, audioInputStream);
    }

}

ovistoica commented 1 year ago

You are a life saver, sir! This is exactly what I needed! Thank you so much!
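
The pipeline in the answer above only prints the final format. To check the result by ear, the ULAW stream could be written to a WAV file, roughly as sketched below. The first attempt in this thread passed the MP3's own file type to `AudioSystem.write`; this sketch uses `AudioFileFormat.Type.WAVE` instead. The class `UlawWavWriter`, its methods, and the synthetic 440 Hz tone are hypothetical stand-ins for the decoded and resampled speech:

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;

class UlawWavWriter {
    // Hypothetical stand-in for the resampled speech: a 440 Hz tone as
    // 16-bit signed little-endian PCM, 8 kHz, mono.
    static AudioInputStream tonePcm8k(int frames) {
        byte[] data = new byte[frames * 2];
        for (int i = 0; i < frames; i++) {
            short s = (short) (Math.sin(2 * Math.PI * 440 * i / 8000.0) * 12000);
            data[2 * i] = (byte) s;            // low byte first (little-endian)
            data[2 * i + 1] = (byte) (s >> 8); // high byte
        }
        AudioFormat pcm = new AudioFormat(8000f, 16, 1, true, false);
        return new AudioInputStream(new ByteArrayInputStream(data), pcm, frames);
    }

    // Converts 8 kHz PCM to ULAW and writes it into a WAV container.
    static File writeUlawWav(AudioInputStream pcm8k, File target) throws IOException {
        AudioInputStream ulaw =
                AudioSystem.getAudioInputStream(AudioFormat.Encoding.ULAW, pcm8k);
        AudioSystem.write(ulaw, AudioFileFormat.Type.WAVE, target);
        return target;
    }

    public static void main(String[] args) throws IOException, UnsupportedAudioFileException {
        File wav = File.createTempFile("speech", ".wav");
        writeUlawWav(tonePcm8k(8000), wav);
        // Reads the header back to confirm what was written.
        System.out.println(AudioSystem.getAudioFileFormat(wav).getFormat());
    }
}
```

With the answer's pipeline, the stream returned by `to8kHz` would take the place of `tonePcm8k`.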