anarchuser / mic_stream

Flutter plugin to get an infinite audio stream from the microphone
https://pub.dev/packages/mic_stream
GNU General Public License v3.0

MacOS example making incorrect waveform / bug in transform code #94

Open larsmob opened 5 months ago

larsmob commented 5 months ago

I was playing around with the macOS example. First, I noticed that the waveform seemed heavily amplified. Then I noticed the data is actually wrapping around. One example: -27649.0, -28673.0, -29953.0, -31489.0, 32767.0, 31999.0, 31743.0, 32511.0, -31489.0, -29441.0. Something is wrong.

I started by looking at your Swift code. I verified the waveform looks good at this point: let clamped = min(max(-2.0, val), 2.0). But you could consider clamping at +/-4 instead. (I'm not sure what the maximum is supposed to be, but I can easily produce values even higher than 4.)

Then I removed the .transform(MicStream.toSampleStream) and looked at the data - the bytes are swapped (LSB comes first). So then I wrote this to handle byte swapping and the two's complement representation:

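  int UInt16Max = 1 << 16; // 65536, avoids needing dart:math for pow

  void _raw(samples) {
    List<int> result = [];
    for (var i = 0; i < samples.length ~/ 2; i++) {
      int a = samples[2 * i + 1]; // high byte (MSB)
      int b = samples[2 * i]; // low byte (LSB comes first)
      int c = 256 * a + b; // unsigned 16-bit value
      // Values of 32768 and above are negative in two's complement
      // (>= rather than >, so that 32768 maps to -32768).
      if (2 * c >= UInt16Max) {
        c = -UInt16Max + c;
      }
      result.add(c);
    }
    // result now has correct data
    ...
  }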

  stream = MicStream.microphone(
      audioSource: AudioSource.DEFAULT,
      sampleRate: AppDefaults.fs,
      channelConfig: ChannelConfig.CHANNEL_IN_MONO,
      audioFormat: AudioFormat.ENCODING_PCM_16BIT);
  listener = stream!.listen(_raw);

Data looks good now. I haven't looked carefully at your code, but something is quite wrong with the transform.
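
For comparison, the same decoding can probably be done with `dart:typed_data` instead of manual byte arithmetic. This is only a sketch: `decodePcm16` is a hypothetical helper, not part of mic_stream, and it assumes the stream delivers raw 16-bit PCM bytes as a `Uint8List` with the LSB first, as observed above.

```dart
import 'dart:typed_data';

/// Hypothetical helper (not part of mic_stream): decodes raw bytes as
/// signed 16-bit PCM samples. Assumes LSB-first (little-endian) input
/// unless another byte order is passed in.
List<int> decodePcm16(Uint8List bytes, {Endian endian = Endian.little}) {
  final data = ByteData.sublistView(bytes);
  final sampleCount = bytes.length ~/ 2; // a trailing odd byte is ignored
  // getInt16 handles both the byte order and the two's complement sign.
  return List<int>.generate(
    sampleCount,
    (i) => data.getInt16(2 * i, endian),
  );
}

void main() {
  // 0xFF 0x7F -> 32767, 0x00 0x80 -> -32768, 0xFF 0xFF -> -1
  final bytes = Uint8List.fromList([0xFF, 0x7F, 0x00, 0x80, 0xFF, 0xFF]);
  print(decodePcm16(bytes)); // [32767, -32768, -1]
}
```

Letting `getInt16` do the sign handling avoids the off-by-one risk at the 32768 boundary that a hand-written comparison has.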

Thanks for a very nice piece of software.

anarchuser commented 5 months ago

Thanks for the heads-up. This is quite likely: I can only really provide Android support and know nothing about the iOS and macOS parts. That is the reason why I kept the API as crude as it is. The transform was my first attempt at a QoL feature, but apparently it's not so simple.

That said, I need to have a look at the docs. Ideally, the macOS/iOS backends ensure the data we get is little-endian, especially if it can vary between devices (as is the case for Android). If the platforms guarantee a specific endianness, all the better.
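
If a backend could not guarantee the byte order itself, the normalization might instead happen once on the Dart side. The following is a rough sketch under that assumption: `toLittleEndian16` is a hypothetical helper, and it presumes the platform can report the byte order of its 16-bit samples as an `Endian` value, which mic_stream is not known to do.

```dart
import 'dart:typed_data';

/// Hypothetical helper (not part of mic_stream): returns 16-bit sample
/// bytes in little-endian order. `source` is the byte order the platform
/// is assumed to report for its raw data.
Uint8List toLittleEndian16(Uint8List bytes, Endian source) {
  if (source == Endian.little) return bytes; // already in the target order
  final out = Uint8List(bytes.length);
  final pairs = bytes.length ~/ 2;
  for (var i = 0; i < pairs; i++) {
    // Swap the two bytes of each 16-bit sample.
    out[2 * i] = bytes[2 * i + 1];
    out[2 * i + 1] = bytes[2 * i];
  }
  // A trailing odd byte has no partner and is copied unchanged.
  if (bytes.length.isOdd) out[bytes.length - 1] = bytes[bytes.length - 1];
  return out;
}

void main() {
  final bigEndian = Uint8List.fromList([0x12, 0x34, 0xAB, 0xCD]);
  print(toLittleEndian16(bigEndian, Endian.big)); // [52, 18, 205, 171]
}
```

With something like this in place, everything downstream could treat the stream as little-endian regardless of platform.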

Any help from your side in that regard is greatly appreciated!

larsmob commented 5 months ago

I don't think you should worry about keeping the API crude. Audio hardly consumes any CPU these days. I'm astonished that data is showing up big-endian on very little-endian platforms, not to mention varying between Android devices. Maybe it would make sense to always transfer the data out of native code as int8 on all platforms and accept a bit of overhead in Dart code to get what you want.

anarchuser commented 5 months ago

You misunderstand. The issue is not processing power; the issue is my lack of knowledge of Apple's operating systems and my inability to test on them. Besides, the whole idea of this setup is that the native code abstracts the complexity of the platform while the Dart part provides a uniform interface. This would ideally be achieved by having each platform return audio in one specified way.

Regarding Big Endian on Android: The developers of the JVM were on crack, I believe. Point is, Java uses BE internally.

larsmob commented 5 months ago

SPARC is big-endian; not sure how all that played out internally at Sun.