microsoft / MixedRealityToolkit-Unity

This repository is for the legacy Mixed Reality Toolkit (MRTK) v2. For the latest version of the MRTK please visit https://github.com/MixedRealityToolkit/MixedRealityToolkit-Unity
https://aka.ms/mrtkdocs
MIT License
6k stars, 2.12k forks

MicStreamSelector behaving inconsistently #9717

Closed bruhnth17 closed 8 months ago

bruhnth17 commented 3 years ago

Describe the bug

I am working on a collaborative MR environment in Unity, where people with different headsets can work in the same Unity scene. One requirement is audio communication, which we want to handle over UDP in C# scripts.

To get the microphone data I am using the MicStreamSelector DLL from the Mixed Reality Toolkit Microphone Stream Selector package, but it behaves inconsistently. Sometimes MicGetFrame works and sometimes it does not. "Not working" means that MicStreamSelector provides an empty frame instead of the real audio data (I have no error messages for you 😢). I have also noticed a pattern in the randomness: if it works, it is much more likely to work the next few times; if it does not work, it will probably not work the next few times either.
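To make the "works in streaks, fails in streaks" observation measurable, a small counter can be fed each frame that MicGetFrame fills. This is only a sketch: `SilentFrameTracker` is a hypothetical helper, not part of MRTK, and it assumes that an "empty" frame means every sample is exactly zero.

```csharp
using System;

// Hypothetical helper (not part of MRTK): tracks runs of all-zero audio frames.
public sealed class SilentFrameTracker
{
    // Number of silent frames seen in a row, reset by the first non-silent frame.
    public int ConsecutiveSilentFrames { get; private set; }

    // Longest run of silent frames observed so far.
    public int LongestSilentRun { get; private set; }

    // Returns true when no sample magnitude exceeds the threshold.
    // Assumption: an "empty" frame is one whose samples are all zero.
    public static bool IsSilent(float[] frame, float threshold = 0f)
    {
        foreach (float sample in frame)
        {
            if (Math.Abs(sample) > threshold)
            {
                return false;
            }
        }
        return true;
    }

    // Call once per frame delivered by the microphone; returns whether it was silent.
    public bool Observe(float[] frame)
    {
        bool silent = IsSilent(frame);
        ConsecutiveSilentFrames = silent ? ConsecutiveSilentFrames + 1 : 0;
        LongestSilentRun = Math.Max(LongestSilentRun, ConsecutiveSilentFrames);
        return silent;
    }
}
```

Checking the whole frame rather than only the first sample avoids counting a frame as empty just because it happens to start at a zero crossing.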

To reproduce

MicStream.cs

using System.Runtime.InteropServices;
using System.Text;
using UnityEngine;

public class MicStream
{
    // This class replaces Unity's Microphone object.
    // This class is made for HoloLens mic stream selection but should work well on all Windows 10 devices.
    // Choose from one of three possible microphone modes on HoloLens.
    // There is an example of how to use this script in HoloToolkit-Tests\Input\Scripts\MicStreamDemo.cs.

    // Streams: LOW_QUALITY_VOICE is optimized for speech analysis.
    //          HIGH_QUALITY_VOICE (communications) is higher quality voice and is probably preferred.
    //          ROOM_CAPTURE tries to get the sounds of the room more than the voice of the user.
    // This can only be set on initialization.
    public enum StreamCategory { LOW_QUALITY_VOICE, HIGH_QUALITY_VOICE, ROOM_CAPTURE }

    public enum ErrorCodes { ALREADY_RUNNING = -10, NO_AUDIO_DEVICE, NO_INPUT_DEVICE, ALREADY_RECORDING, GRAPH_NOT_EXIST, CHANNEL_COUNT_MISMATCH, FILE_CREATION_PERMISSION_ERROR, NOT_ENOUGH_DATA, NEED_ENABLED_MIC_CAPABILITY };

    const int MAX_PATH = 260; // 260 is maximum path length in windows, to be returned when we MicStopRecording

    [UnmanagedFunctionPointer(CallingConvention.StdCall)] // If included in MicStartStream, this callback will be triggered when audio data is ready. This is not the preferred method for Game Engines and can probably be ignored.
    public delegate void LiveMicCallback();

    /// <summary>
    /// Called before calling MicStartStream or MicStartRecording to initialize the microphone
    /// </summary>
    /// <param name="category">One of the entries in the StreamCategory enumeration</param>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicInitializeDefault(int category);

    /// <summary>
    /// Called before calling MicStartStream or MicStartRecording to initialize the microphone
    /// </summary>
    /// <param name="category">One of the entries in the StreamCategory enumeration</param>
    /// <param name="samplerate">Desired number of samples per second</param>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicInitializeCustomRate(int category, int samplerate);

    /// <summary>
    /// Call this to start receiving data from a microphone. Then, each frame, call MicGetFrame.
    /// </summary>
    /// <param name="keepData">If true, all data will stay in the queue, if the client code is running behind. This can lead to significant audio lag, so is not appropriate for low-latency situations like real-time voice chat.</param>
    /// <param name="previewOnDevice">If true, the audio from the microphone will be played through your speakers.</param>
    /// <param name="micsignal">Optional (can be null): This callback will be called when data is ready for MicGetFrame</param>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicStartStream(bool keepData, bool previewOnDevice, LiveMicCallback micsignal);

    /// <summary>
    /// Call this to start receiving data from a microphone. Then, each frame, call MicGetFrame.
    /// </summary>
    /// <param name="keepData">If true, all data will stay in the queue, if the client code is running behind. This can lead to significant audio lag, so is not appropriate for low-latency situations like real-time voice chat.</param>
    /// <param name="previewOnDevice">If true, the audio from the microphone will be played through your speakers.</param>
    /// <returns>error code or 0</returns>
    public static int MicStartStream(bool keepData, bool previewOnDevice)
    {
        return MicStartStream(keepData, previewOnDevice, null);
    }

    /// <summary>
    /// Shuts down the connection to the microphone. Data will no longer be received from the microphone.
    /// </summary>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicStopStream();

    /// <summary>
    /// Begins recording microphone data to the specified file.
    /// </summary>
    /// <param name="filename">The file will be saved to this name. Specify only the wav file's name with extensions, aka "myfile.wav", not full path</param>
    /// <param name="previewOnDevice">If true, will play mic stream in speakers</param>
    /// <returns></returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicStartRecording(string filename, bool previewOnDevice);

    /// <summary>
    /// Finishes writing the file recording started with MicStartRecording.
    /// </summary>
    /// <param name="sb">returns the full path to the recorded audio file</param>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern void MicStopRecording(StringBuilder sb);

    /// <summary>
    /// Finishes writing the file recording started with MicStartRecording.
    /// </summary>
    /// <returns>the full path to the recorded audio file</returns>
    public static string MicStopRecording()
    {
        StringBuilder builder = new StringBuilder(MAX_PATH);
        MicStopRecording(builder);
        return builder.ToString();
    }

    /// <summary>
    /// Cleans up data associated with microphone recording. Counterpart to MicInitialize*
    /// </summary>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicDestroy();

    /// <summary>
    /// Pauses streaming of microphone data to MicGetFrame (and/or file specified with MicStartRecording)
    /// </summary>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicPause();

    /// <summary>
    /// Unpauses streaming of microphone data to MicGetFrame (and/or file specified with MicStartRecording)
    /// </summary>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicResume();

    /// <summary>
    /// Sets amplification factor for microphone samples returned by MicGetFrame (and/or file specified with MicStartRecording)
    /// </summary>
    /// <param name="g">gain factor</param>
    /// <returns>error code or 0</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicSetGain(float g);

    /// <summary>
    /// Queries the default microphone audio frame sample size. Useful if doing default initializations with callbacks to know how much data it wants to hand you.
    /// </summary>
    /// <returns>the number of samples in the default audio buffer</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    private static extern int MicGetDefaultBufferSize();

    /// <summary>
    /// Queries the number of channels supported by the microphone.  Useful if doing default initializations with callbacks to know how much data it wants to hand you.
    /// </summary>
    /// <returns>the number of channels</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    private static extern int MicGetDefaultNumChannels();

    /// <summary>
    /// Read from the microphone buffer. Usually called once per frame.
    /// </summary>
    /// <param name="buffer">the buffer into which to store the microphone audio samples</param>
    /// <param name="length">the length of the buffer</param>
    /// <param name="numchannels">the number of audio channels to store in the buffer</param>
    /// <returns>error code (or 0 if no error)</returns>
    [DllImport("MicStreamSelector", ExactSpelling = true)]
    public static extern int MicGetFrame(float[] buffer, int length, int numchannels);

    /// <summary>
    /// Prints useful error/warning messages based on error codes returned from the functions in this class
    /// </summary>
    /// <param name="returnCode">An error code returned by another function in this class</param>
    /// <returns>True if no error or warning message was printed, false if a message was printed</returns>
    public static bool CheckForErrorOnCall(int returnCode)
    {
        switch (returnCode)
        {
            case (int)ErrorCodes.ALREADY_RECORDING:
                Debug.LogWarning("WARNING: Tried to start recording when you were already doing so. You need to stop your previous recording before you can start again.");
                return false;
            case (int)ErrorCodes.ALREADY_RUNNING:
                Debug.LogWarning("WARNING: Tried to initialize microphone more than once");
                return false;
            case (int)ErrorCodes.GRAPH_NOT_EXIST:
                Debug.LogError("ERROR: Tried to do microphone things without a properly initialized microphone. \n Do you have a mic plugged into a functional audio system and did you call MicInitialize() before anything else ??");
                return false;
            case (int)ErrorCodes.NO_AUDIO_DEVICE:
                Debug.LogError("ERROR: Tried to start microphone, but you don't appear to have a functional audio device. check your OS audio settings.");
                return false;
            case (int)ErrorCodes.NO_INPUT_DEVICE:
                Debug.LogError("ERROR: Tried to start microphone, but you don't have one plugged in, do you?");
                return false;
            case (int)ErrorCodes.CHANNEL_COUNT_MISMATCH:
                Debug.LogError("ERROR: Microphone had a channel count mismatch internally on device. Try setting different mono/stereo options in OS mic settings.");
                return false;
            case (int)ErrorCodes.FILE_CREATION_PERMISSION_ERROR:
                Debug.LogError("ERROR: Didn't have access to create file in Music library. Make sure permissions to write to Music library are granted.");
                return false;
            case (int)ErrorCodes.NOT_ENOUGH_DATA:
                // usually not an error, means the device hasn't produced enough data yet because it just started running
                Debug.LogWarning("WARNING: The device hasn't produced enough data yet because it just started running.");
                return false;
            case (int)ErrorCodes.NEED_ENABLED_MIC_CAPABILITY:
                Debug.LogError("ERROR: Seems like you forgot to enable the microphone capabilities in your Unity permissions");
                return false;
        }

        if (returnCode != 0)
            Debug.Log($"returnCode: {returnCode}");

        return true;
    }
}

MicrophoneTransmitter.cs

// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License. See LICENSE in the project root for license information.

using System;
using System.Globalization;
using System.Threading;
using UnityEngine;
using static MicStream;

/// <summary>
/// Transmits data from your microphone to other clients connected to a SessionServer. Requires any receiving client to be running the MicrophoneReceiver script.
/// </summary>
[RequireComponent(typeof(AudioSource))]
public class MicrophoneTransmitter : MonoBehaviour
{
    private AudioSource audioSource;

    /// <summary>
    /// Which type of microphone/quality to access
    /// </summary>
    public StreamCategory Streamtype = StreamCategory.HIGH_QUALITY_VOICE;

    /// <summary>
    /// You can boost volume here as desired. 1 is default but probably too quiet. You can change during operation. 
    /// </summary>
    public float InputGain = 2;

    /// <summary>
    /// Whether or not to send the microphone data across the network
    /// </summary>
    public bool ShouldTransmitAudio = true;

    /// <summary>
    /// Whether other users should be able to hear the transmitted audio
    /// </summary>
    public bool Mute;

    public Transform GlobalAnchorTransform;

    public bool ShowInterPacketTime;
    private bool micStarted;

    public const int AudioPacketSize = 960;
    private CircularBuffer micBuffer = new CircularBuffer(AudioPacketSize * 10 * 2 * 4, true);
    private byte[] packetSamples = new byte[AudioPacketSize * 4];
    public int zeroes;
    private readonly Mutex audioDataMutex = new Mutex();

    private void Awake()
    {
        audioSource = GetComponent<AudioSource>();
        zeroes = 0;
        initAudio();
    }

    private void initAudio()
    {
        int errorCode = MicInitializeCustomRate((int)Streamtype, AudioSettings.outputSampleRate);
        if (errorCode == 0 || errorCode == (int)MicStream.ErrorCodes.ALREADY_RUNNING)
        {
            if (CheckForErrorOnCall(MicSetGain(InputGain)))
            {
                micStarted = CheckForErrorOnCall(MicStartStream(false, false));
            }
        }
        else
        {
            // Log any other initialization failure instead of silently leaving micStarted false.
            CheckForErrorOnCall(errorCode);
        }
    }

    private void OnAudioFilterRead(float[] buffer, int numChannels)
    {
        try
        {
            audioDataMutex.WaitOne();

            if (micStarted)
            {
                if (CheckForErrorOnCall(MicGetFrame(buffer, buffer.Length, numChannels)))
                {
                    int dataSize = buffer.Length * 4;

                    zeroes = buffer[0] == 0 ? zeroes + 1 : 0; 

                    if (micBuffer.Write(buffer, 0, dataSize) != dataSize)
                    {
                        Debug.LogError("Send buffer filled up. Some audio will be lost.");
                    }
                }
            }
        }
        catch (Exception e)
        {
            Debug.LogError(e.Message);
        }
        finally
        {
            audioDataMutex.ReleaseMutex();
        }
    }

    private void Update()
    {
        CheckForErrorOnCall(MicStream.MicSetGain(InputGain));

        try
        {
            audioDataMutex.WaitOne();

            while (micBuffer.UsedCapacity >= 4 * AudioPacketSize)
            {
                TransmitAudio();
            }
        }
        catch (Exception e)
        {
            Debug.LogError(e.Message);
        }
        finally
        {
            audioDataMutex.ReleaseMutex();
        }
    }

    private void TransmitAudio()
    {
        micBuffer.Read(packetSamples, 0, 4 * AudioPacketSize);

        // isMicConnected(packetSamples);
        ClientSend.PlayerAudio(packetSamples, packetSamples.Length);
    }

    private bool CheckForErrorOnCall(int returnCode)
    {
        return MicStream.CheckForErrorOnCall(returnCode);
    }

#if DOTNET_FX
    // on device, deal with all the ways that we could suspend our program in as few lines as possible
    private void OnApplicationPause(bool pause)
    {
        if (pause)
        {
            CheckForErrorOnCall(MicStream.MicPause());
        }
        else
        {
            CheckForErrorOnCall(MicStream.MicResume());
        }
    }

    private void OnApplicationFocus(bool focused)
    {
        OnApplicationPause(!focused);
    }

    private void OnDisable()
    {
        OnApplicationPause(true);
    }

    private void OnEnable()
    {
        OnApplicationPause(false);
    }
#endif
}

/// <summary>
/// Helper class for transmitting data over network.
/// </summary>
public class CircularBuffer
{
    public CircularBuffer(int size, bool allowOverwrite = false, int padding = 4)
    {
        data = new byte[size];
        readWritePadding = padding;
        this.allowOverwrite = allowOverwrite;
    }

    public int TotalCapacity
    {
        get
        {
            return data.Length - readWritePadding;
        }
    }

    public int UsedCapacity
    {
        get
        {
            if (writeOffset >= readOffset)
            {
                return writeOffset - readOffset;
            }
            int firstChunk = data.Length - readOffset;
            int secondChunk = writeOffset;
            return firstChunk + secondChunk;
        }
    }

    public void Reset()
    {
        readOffset = 0;
        writeOffset = 0;
    }

    public int Write(Array src, int srcReadPosBytes, int byteCount)
    {
        int maxWritePos;
        bool wrappedAround = writeOffset < readOffset;
        if (!wrappedAround)
        {
            maxWritePos = (readOffset != 0 || allowOverwrite) ? data.Length : data.Length - readWritePadding;
        }
        else
        {
            maxWritePos = allowOverwrite ? data.Length : readOffset - readWritePadding;
        }

        int chunkSize = Math.Min(byteCount, maxWritePos - writeOffset);
        int writeEnd = writeOffset + chunkSize;
        bool needToMoveReadOffset = wrappedAround ? writeEnd >= readOffset : (writeEnd == data.Length && readOffset == 0);
        if (needToMoveReadOffset)
        {
            if (!allowOverwrite)
            {
                throw new Exception("Circular buffer logic error. Overwriting data.");
            }
            readOffset = (writeEnd + readWritePadding) % data.Length;
        }

        Buffer.BlockCopy(src, srcReadPosBytes, data, writeOffset, chunkSize);
        writeOffset = (writeOffset + chunkSize) % data.Length;

        int bytesWritten = chunkSize;
        int remaining = byteCount - bytesWritten;
        if (bytesWritten > 0 && remaining > 0)
        {
            bytesWritten += Write(src, srcReadPosBytes + chunkSize, remaining);
        }

        return bytesWritten;
    }

    public int Read(Array dst, int dstWritePosBytes, int byteCount)
    {
        if (readOffset == writeOffset)
        {
            return 0;
        }

        int maxReadPos;
        if (readOffset > writeOffset)
        {
            maxReadPos = data.Length;
        }
        else
        {
            maxReadPos = writeOffset;
        }

        int chunkSize = Math.Min(byteCount, maxReadPos - readOffset);

        Buffer.BlockCopy(data, readOffset, dst, dstWritePosBytes, chunkSize);
        readOffset = (readOffset + chunkSize) % data.Length;

        int bytesRead = chunkSize;
        int remaining = byteCount - bytesRead;
        if (bytesRead > 0 && remaining > 0)
        {
            bytesRead += Read(dst, dstWritePosBytes + bytesRead, remaining);
        }

        return bytesRead;
    }

    private byte[] data;
    private int writeOffset;
    private int readOffset;

    private readonly int readWritePadding;
    private readonly bool allowOverwrite;
}
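The byte arithmetic in the listing above is easy to misread, so here is a minimal sketch of it. `PacketMath` is a hypothetical helper; the only values taken from the listing are `AudioPacketSize = 960` and the 4-byte sample size. Each packet is 960 × 4 = 3840 bytes, and the circular buffer of 960 × 10 × 2 × 4 = 76800 bytes holds 20 such packets.

```csharp
using System;

// Hypothetical helper illustrating the float/byte conversions used by the transmitter.
public static class PacketMath
{
    public const int AudioPacketSize = 960;            // samples per packet, from the listing
    public const int BytesPerSample = sizeof(float);   // 4 bytes per 32-bit float sample
    public const int PacketBytes = AudioPacketSize * BytesPerSample;          // 3840
    public const int BufferBytes = AudioPacketSize * 10 * 2 * BytesPerSample; // 76800

    // Reinterprets float samples as raw bytes (native byte order), as Buffer.BlockCopy does
    // inside CircularBuffer.Write.
    public static byte[] ToBytes(float[] samples)
    {
        var bytes = new byte[samples.Length * BytesPerSample];
        Buffer.BlockCopy(samples, 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Inverse conversion, which a receiver would need before playing the samples.
    public static float[] ToFloats(byte[] bytes)
    {
        var samples = new float[bytes.Length / BytesPerSample];
        Buffer.BlockCopy(bytes, 0, samples, 0, samples.Length * BytesPerSample);
        return samples;
    }
}
```

Because `Buffer.BlockCopy` copies raw memory, sender and receiver are assumed here to share the same byte order.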

Expected behavior

The audio stream is consistently available, or an error message explains why the frames are empty.
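For reference, the `ErrorCodes` enum in the MicStream.cs listing starts at -10 and counts up, so its values run from -10 to -2. The sketch below turns a raw return code into its name so that unexpected values show up in logs. `MicReturnCodes` and `Describe` are hypothetical names, and the enum is simply copied from the listing. Note that this cannot explain a zero-filled frame that is returned together with code 0, which appears to be the case reported here.

```csharp
using System;

// Hypothetical helper: maps MicStreamSelector return codes to readable names.
public static class MicReturnCodes
{
    // Copied from the MicStream.cs listing; implicit values run from -10 up to -2.
    public enum ErrorCodes
    {
        ALREADY_RUNNING = -10,
        NO_AUDIO_DEVICE,
        NO_INPUT_DEVICE,
        ALREADY_RECORDING,
        GRAPH_NOT_EXIST,
        CHANNEL_COUNT_MISMATCH,
        FILE_CREATION_PERMISSION_ERROR,
        NOT_ENOUGH_DATA,
        NEED_ENABLED_MIC_CAPABILITY
    }

    public static string Describe(int returnCode)
    {
        if (returnCode == 0)
        {
            return "OK";
        }
        return Enum.IsDefined(typeof(ErrorCodes), returnCode)
            ? ((ErrorCodes)returnCode).ToString()
            : $"UNKNOWN({returnCode})";
    }
}
```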

david-c-kline commented 3 years ago

Thanks for the report bruhnth17! We will take a look.

By the way, does the example scene have similar problems?

bruhnth17 commented 3 years ago

Great, thank you! Could you point me in the direction of the example scene for microphone streaming? I checked out the SpeechInputExample from the Input demos, and AudioLoFiEffectExamples and AudioOcclusionExamples from the Audio demos. They all work fine, but I did not see the WindowsMicrophoneStream in them.

The code comes from this older voice chat example that I found: https://github.com/microsoft/MixedReality213/blob/b507f095f2bb8f238961cd223c0cde2be391f36e/Assets/HoloToolkit/Sharing/Scripts/VoiceChat/MicrophoneTransmitter.cs

david-c-kline commented 3 years ago

@bruhnth17, our example does not do streaming of the microphone data. I am interested in seeing if the demo (spatial mesh responding to your voice) has any of the same errors you are seeing in your application.

The demo is in MRTK\Examples\Demos\Audio\Scenes. Please use WindowsMicrophoneStreamDemo.unity

bruhnth17 commented 3 years ago

Alright thank you. I'll try out the WindowsMicrophoneStreamDemo this weekend and report back to you

bruhnth17 commented 3 years ago

WindowsMicrophoneStreamDemo.unity is not contained in the MRTK examples that I have in the project (2.6.1). When downloading the Audio examples through the package manager in Unity, I only get the two you see in the image below.

[Images: intalledMrtk, no-stream-demo]

private void OnAudioFilterRead(float[] buffer, int numChannels)
{
    if (micStream == null) { return; }

    // Read the microphone stream data.
    WindowsMicrophoneStreamErrorCode result = micStream.ReadAudioFrame(buffer, numChannels);
    Debug.Log($"Read Audio Frame. {string.Join(", ", buffer)}");
    // [...]

[Image: audioDebug]

Tomorrow I'll deploy the original version of the example to the HL2 and see if that works

blingopen commented 3 years ago

I tried the demo scene, but it didn't work as expected. I spoke several times, but the mesh didn't change colour. BTW, I checked the script WindowsMicrophoneStream.cs and the code is greyed out (my OS is Windows 10). Do you know how to solve that?

bruhnth17 commented 3 years ago

@blingopen do you see a #if MICSTREAM_PRESENT at the top of the file? It is a custom #define; you can find more information about them here. If you remove the conditions at the top and the bottom of the file, the code will be highlighted as usual.

Because I have Mixed Reality Toolkit Microphone Stream Selector 1.0.0 installed, I thought I could remove it safely, but that might be a misconception (as I still can't get it to work).
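As a reminder of how such a guard behaves, here is a minimal sketch of C# conditional compilation. `MicStreamAvailability` is a hypothetical class; only the symbol name `MICSTREAM_PRESENT` comes from the discussion. When the symbol is not defined, the compiler skips the guarded branch entirely, which is why editors grey it out.

```csharp
using System;

// Hypothetical class showing how a scripting define selects code at compile time.
public static class MicStreamAvailability
{
    public static string Describe()
    {
#if MICSTREAM_PRESENT
        // Only compiled when MICSTREAM_PRESENT is defined for the project.
        return "MicStream code compiled in";
#else
        // Compiled when the symbol is missing; the branch above is ignored.
        return "MicStream code compiled out";
#endif
    }
}
```

Deleting the guards forces the code to compile, but it presumably does not by itself guarantee that the native plugin is present at run time.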

blingopen commented 3 years ago

Thanks, @bruhnth17. I've tried it, but there is another problem, which I reported as issue #9779: my Microphone Stream Selector didn't work. Is your Microphone Stream Selector working normally? If so, is there any setting I need to change?

bruhnth17 commented 3 years ago

I left a comment in the issue. If I understand it right, we have two different issues: you can't work with the WindowsMicrophoneStream at all, while I can, but the stream is empty.

blingopen commented 3 years ago

yeah, I found you imported this package successfully, so wanna ask if there are other steps after importing this package via official Mixed Reality Feature Tool

david-c-kline commented 3 years ago

Question for folks. Are you using the Unity XR SDK pipeline or the built-in "legacy xr" one?

I have been able to repro what looks like this, in the following scenario:

I am using the prerelease of MRTK 2.7

david-c-kline commented 3 years ago

wanna ask if there are other steps after importing this package via official Mixed Reality Feature Tool

There are no other required steps. Just import the package into the project and use the API

david-c-kline commented 3 years ago

Marking external as the mic stream is managed in an external repository. Will continue investigating as a flaw in that component/script set.

bruhnth17 commented 3 years ago

I am using the XR-Plugin Management and MRTK 2.6.1

@davidkline-ms

I have been able to repro what looks like this, in the following scenario: [...]

Does that mean, you were able to reproduce the problem? Any workaround you can recommend at this time?

stale[bot] commented 2 years ago

This issue has been marked as stale by an automated process because it has not had any recent activity. It will be automatically closed in 30 days if no further activity occurs. If this is still an issue please add a new comment with more recent details and repro steps.

mattycorbett commented 2 years ago

I have the exact same problem. Was a solution ever found? Using MRTK 2.8.2.0 and Unity 2020.3.16f1. Is it related to this problem (at the end of the page)? @MaxWang-MS @davidkline-ms

mattycorbett commented 2 years ago

Anything you can offer about how to work around this? Since the first comment I've tried 2 (newer) versions of Unity. No change. I built a project from scratch just for the demo, still no change. Sometimes it never returns a value. Sometimes it begins correctly and then stops returning values and never starts again.

blingopen commented 1 year ago

Sorry, I gave up on this solution at the time and stopped development related to HoloLens, so I can't offer any help.

mattycorbett commented 1 year ago

For what it's worth, I finally fixed it. The MicStream DLL's MediaCapture instance was conflicting with one I had already instantiated for photo captures. In short, you can't use MicStream with another MediaCapture instance. I tried to set SharingMode on the first MediaCapture (in my script for capturing photos), but this didn't work. I had to completely stop using the MicStream DLL and consolidate the audio capture under one MediaCapture instantiated with StreamingCaptureMode.AudioAndVideo. This fixed the problem.
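The workaround described above might look roughly like the following sketch, which is not the commenter's actual code. `SharedMediaCapture` is a hypothetical wrapper. `MediaCapture`, `MediaCaptureInitializationSettings` and `StreamingCaptureMode.AudioAndVideo` are from the UWP `Windows.Media.Capture` namespace, and `ENABLE_WINMD_SUPPORT` is the Unity define that guards WinRT API usage, so the block only compiles in UWP builds.

```csharp
#if ENABLE_WINMD_SUPPORT
using System;                 // needed to await WinRT async operations
using System.Threading.Tasks;
using Windows.Media.Capture;

// Hypothetical wrapper: one MediaCapture instance shared by photo and audio code.
public static class SharedMediaCapture
{
    private static Task<MediaCapture> initTask;

    // Returns the single shared instance, initializing it on first use.
    // Assumption: called from one thread, so no locking is done here.
    public static Task<MediaCapture> GetAsync()
    {
        if (initTask == null)
        {
            initTask = CreateAsync();
        }
        return initTask;
    }

    private static async Task<MediaCapture> CreateAsync()
    {
        var settings = new MediaCaptureInitializationSettings
        {
            // Request both streams up front so no second instance is needed for audio.
            StreamingCaptureMode = StreamingCaptureMode.AudioAndVideo
        };

        var capture = new MediaCapture();
        await capture.InitializeAsync(settings);
        return capture;
    }
}
#endif
```

Both the photo capture script and the audio capture code would then call `SharedMediaCapture.GetAsync()` instead of constructing their own `MediaCapture`.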

IssueSyncBot commented 8 months ago

We appreciate your feedback and thank you for reporting this issue.

Microsoft Mixed Reality Toolkit version 2 (MRTK2) is currently in limited support. This means that Microsoft is only fixing high priority security issues. Unfortunately, this issue does not meet the necessary priority and will be closed. If you strongly feel that this issue deserves more attention, please open a new issue and explain why it is important.

Microsoft recommends that all new HoloLens 2 Unity applications use MRTK3 instead of MRTK2.

Please note that MRTK3 was released in August 2023. It features an all-new architecture for developing rich mixed reality experiences and has a minimum requirement of Unity 2021.3 LTS. For more information about MRTK3, please visit https://www.mixedrealitytoolkit.org.

Thank you for your continued support of the Mixed Reality Toolkit!