Placeholder-Software / Dissonance

Unity Voice Chat Asset

Could provide sample codes using BaseMicrophoneSubscriber for recording mic and play it. #273

Closed jackyetz closed 1 year ago

jackyetz commented 1 year ago

I simply want to record the mic and play it back with AEC applied. I set up a GameObject with OfflineCommsNetwork, DissonanceComms, AudioSource, and MyScript. DissonanceComms is configured following https://placeholder-software.co.uk/dissonance/docs/Tutorials/Acoustic-Echo-Cancellation.html , and MyScript is shown below. Could you help me finish the code so that it returns the recorded AudioClip and plays it through the AudioSource? I have read https://placeholder-software.github.io/Dissonance/Tutorials/UsingIMicrophoneSubscriber.html , but it is too brief.

public class RecordDissonance : BaseMicrophoneSubscriber
{
    public int samplerate = 16000;
    public float frequency = 440;
    public int position = 0;

    DissonanceComms DC;
    AudioSource audiosrc;
    AudioClip audioclip;
    Dissonance.Audio.Capture.IMicrophoneSubscriber IMS;
    ArraySegment<float> buffer;
    List<float> bufferlist = new List<float>();
    protected override void ProcessAudio(ArraySegment<float> data)
    {
        bufferlist.AddRange(data.Array);
    }

    protected override void ResetAudioStream(WaveFormat waveFormat)
    {
        AudioClip.Destroy(audioclip);
    }

    // Start is called before the first frame update
    void Start()
    {
        DC = GetComponent<DissonanceComms>();
        audiosrc = GetComponent<AudioSource>();
        audioclip = AudioClip.Create("MySinusoid", samplerate * 2, 1, samplerate, true, OnAudioRead, OnAudioSetPosition);
        audiosrc.clip = audioclip;
        audiosrc.Play();
    }
    void OnAudioRead(float[] data)
    {
        data = bufferlist.ToArray();
        bufferlist.Clear();
    }

    void OnAudioSetPosition(int newPosition)
    {
        position = newPosition;
    }
}

martindevans commented 1 year ago

Dissonance doesn't have an AudioClip containing the processed audio. If you want that, you'll have to build it yourself out of the audio stream (which this is giving you access to).

You can use the AudioClip.Create overload that has a "PCMReaderCallback" callback - this is a function that Unity will call every time it wants more audio.

In ResetAudioStream you should cancel playback, destroy the audioclip and create a new one.

In ProcessAudio you should add the data to a buffer, and then supply that data later to Unity in the callback.

jackyetz commented 1 year ago

> Dissonance doesn't have an AudioClip containing the processed audio. If you want that, you'll have to build it yourself out of the audio stream (which this is giving you access to).

> You can use the AudioClip.Create overload that has a "PCMReaderCallback" callback - this is a function that Unity will call every time it wants more audio.

> In ResetAudioStream you should cancel playback, destroy the audioclip and create a new one.

> In ProcessAudio you should add the data to a buffer, and then supply that data later to Unity in the callback.

Thanks, Martin. Your fast reply helps a lot. I updated the code above, but it still doesn't work. Could you help further?

martindevans commented 1 year ago

As stated in the docs:

After this method has finished executing you must not hold any references to the data argument. Any data that you want to store for processing later must be copied out of the data.

So you can't put data into your bufferlist directly.

jackyetz commented 1 year ago

Modified as follows? It still doesn't work. While debugging, I found that the "ProcessAudio" function is never executed.

float[] tmpdata;
protected override void ProcessAudio(ArraySegment<float> data)
    {
        data.CopyTo(tmpdata);
        bufferlist.AddRange(tmpdata);
    }

martindevans commented 1 year ago

Ah, I apologise, I misread your code initially. Using AddRange is fine (it copies the data internally). However, you should not add the entire Array: the point of ArraySegment is that it represents a segment of an array with some data in it. Instead you should copy in only the data delimited by Offset and Count.
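A minimal sketch of copying only the window delimited by Offset and Count. `SegmentCopy` and `AppendSegment` are hypothetical helper names used for illustration, not part of Dissonance.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentCopy
{
    // Appends only the valid window of the segment (Offset .. Offset + Count - 1).
    // The rest of the backing array may hold stale or unrelated samples.
    public static void AppendSegment(List<float> destination, ArraySegment<float> segment)
    {
        if (segment.Array == null || segment.Count == 0)
            return;

        for (int i = 0; i < segment.Count; i++)
            destination.Add(segment.Array[segment.Offset + i]);
    }
}
```

Inside ProcessAudio this would be called as `SegmentCopy.AppendSegment(bufferlist, data);` instead of `bufferlist.AddRange(data.Array);`.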

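> I debugged yet found the "ProcessAudio" function never executed.

Was ResetAudio ever called?

If neither is being called that implies either:

  • Your class isn't registered with DissonanceComms.SubscribeToRecordedAudio.
  • Dissonance isn't recording any audio, because it's not in a network session.
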
OnAudioRead

void OnAudioRead(float[] data)
{
    data = bufferlist.ToArray();
    bufferlist.Clear();
}

This is not correct. You need to modify the data array so that it contains data copied from your buffer, and then remove that amount of data from your buffer. Assigning to data does nothing here.
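To illustrate why assigning to the parameter has no effect, here is a small sketch contrasting the two approaches. `ReadCallbackDemo` and its method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ReadCallbackDemo
{
    // No effect for the caller: only the local variable `data` is rebound,
    // the array that the caller passed in is left untouched.
    public static void ReassignParameter(float[] data, List<float> buffer)
    {
        data = buffer.ToArray();
    }

    // Writes into the caller's array, then removes exactly the consumed samples.
    // Returns how many samples were copied.
    public static int FillInPlace(float[] data, List<float> buffer)
    {
        int count = Math.Min(data.Length, buffer.Count);
        buffer.CopyTo(0, data, 0, count);
        buffer.RemoveRange(0, count);
        return count;
    }
}
```

Unity only ever sees the array it handed to the callback, so the samples have to be written into that array element by element (or with a copy call), as `FillInPlace` does.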

jackyetz commented 1 year ago

> Was ResetAudio ever called?

> If neither is being called that implies either:

>   • Your class isn't registered with DissonanceComms.SubscribeToRecordedAudio.
>   • Dissonance isn't recording any audio, because it's not in a network session.

Yes, my class isn't registered with DissonanceComms.SubscribeToRecordedAudio. Could you show the code in detail? I searched the whole site for the keyword "SubscribeToRecordedAudio" and found nothing except the BaseMicrophoneSubscriber page, which is too brief and leaves me confused.

martindevans commented 1 year ago

Once you have your class you just need to add it to Dissonance like this:

DissonanceComms comms; // Get the comms object somehow
comms.SubscribeToRecordedAudio(new RecordDissonance());

jackyetz commented 1 year ago

> Once you have your class you just need to add it to Dissonance like this:

> DissonanceComms comms; // Get the comms object somehow
> comms.SubscribeToRecordedAudio(new RecordDissonance());

I modified the code as follows. ProcessAudio and ResetAudio are still not executed.

public class RecordDissonance : MonoBehaviour
{
    public int samplerate = 16000;

    DissonanceComms DC;
    AudioSource audiosrc;
    AudioClip audioclip;
    MyMicrophoneSubscriber MS = new MyMicrophoneSubscriber();
    ArraySegment<float> buffer;
    List<float> bufferlist = new List<float>();

    // Start is called before the first frame update
    void Start()
    {
        DC = GetComponent<DissonanceComms>();
        audiosrc = GetComponent<AudioSource>();

        DC.SubscribeToRecordedAudio(MS);
        audioclip = AudioClip.Create("MySinusoid", samplerate * 2, 1, samplerate, true, MS.OnAudioRead, MS.OnAudioSetPosition);
        audiosrc.clip = audioclip;
        audiosrc.Play();
    }
}

public class MyMicrophoneSubscriber : BaseMicrophoneSubscriber
{
    public int position = 0;
    List<float> bufferlist = new List<float>();
    protected override void ProcessAudio(ArraySegment<float> data)
    {
        bufferlist.AddRange(data.ToArray());
    }

    protected override void ResetAudioStream(WaveFormat waveFormat)
    {
        bufferlist.Clear();
    }
    public void OnAudioRead(float[] data)
    {
        bufferlist.ToArray().CopyTo(data, 0);
        bufferlist.Clear();
    }

    public void OnAudioSetPosition(int newPosition)
    {
        position = newPosition;
    }
}

martindevans commented 1 year ago

OnAudioRead

public void OnAudioRead(float[] data)
{
    bufferlist.ToArray().CopyTo(data, 0);
    bufferlist.Clear();
}

This still isn't correct. After you have copied some amount of data from the bufferlist into the data array you need to remove exactly that much data from the start of the buffer.

public void OnAudioRead(float[] data)
{
    bufferlist.CopyTo(0, data, 0, data.Length);
    bufferlist.RemoveRange(0, data.Length);
}

This should do it.

> ProcessAudio and ResetAudio are still not executed.

In that case Dissonance must not be in an audio session (so it is not recording any audio). Check the DissonanceComms component inspector to see what it shows.
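One caveat about the corrected OnAudioRead: `List<float>.CopyTo` and `RemoveRange` throw if the buffer holds fewer than `data.Length` samples, which can happen whenever Unity asks for audio faster than the microphone delivers it. The sketch below is one possible way to handle that, assuming a fixed-capacity buffer is acceptable. `SampleRingBuffer` is a hypothetical class, not part of Dissonance or Unity. The lock is a cautious default in case the read callback and ProcessAudio run on different threads, which I have not verified for every Unity version.

```csharp
using System;

// Fixed-capacity ring buffer of audio samples, guarded by a lock.
public sealed class SampleRingBuffer
{
    private readonly float[] _samples;
    private readonly object _gate = new object();
    private int _readIndex; // position of the oldest buffered sample
    private int _count;     // number of buffered samples

    public SampleRingBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _samples = new float[capacity];
    }

    public int Count { get { lock (_gate) return _count; } }

    // Copies the segment's valid window in. When full, the oldest samples are overwritten,
    // so memory use stays bounded.
    public void Write(ArraySegment<float> segment)
    {
        if (segment.Array == null) return;
        lock (_gate)
        {
            for (int i = 0; i < segment.Count; i++)
            {
                int writeIndex = (_readIndex + _count) % _samples.Length;
                _samples[writeIndex] = segment.Array[segment.Offset + i];
                if (_count < _samples.Length)
                    _count++;
                else
                    _readIndex = (_readIndex + 1) % _samples.Length; // drop the oldest sample
            }
        }
    }

    // Fills `data` completely: buffered samples first, then silence if the buffer runs dry.
    // Returns the number of real (non-silent padding) samples copied.
    public int Read(float[] data)
    {
        lock (_gate)
        {
            int n = Math.Min(data.Length, _count);
            for (int i = 0; i < n; i++)
                data[i] = _samples[(_readIndex + i) % _samples.Length];
            _readIndex = (_readIndex + n) % _samples.Length;
            _count -= n;
            Array.Clear(data, n, data.Length - n); // zero the tail so stale values are not played
            return n;
        }
    }

    public void Clear()
    {
        lock (_gate) { _readIndex = 0; _count = 0; }
    }
}
```

In the subscriber, ProcessAudio would call `Write(data)`, ResetAudioStream would call `Clear()`, and OnAudioRead would call `Read(data)`.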

jackyetz commented 1 year ago

When running, the inspectors for the GameObject (with DissonanceComms), AEC, and my prefab look as shown in the following 4 pictures. A "Basic Microphone Capture" component also appears in the inspector of the GameObject (with DissonanceComms), and the value in its dBFS bar changes while I speak. Screenshots: DissonanceComms 1, DissonanceComms 2, AEC, my prefab.

jackyetz commented 1 year ago

I tried the Offline Demo Scene (in offline/demo/) and made a small change by dragging myprefab (which I created myself) onto the Comms. With that, AEC works on my phone (Android). This raises the following three questions:

  1. The demo app never asks for recording permission when it starts (on an Android phone). I have to change the permission in the settings panel. That is inconvenient, since other apps always ask for permissions by themselves.
  2. I cannot tell the difference between my scene and the demo, except for an additional "Loopback Audio" GameObject in the demo. Moreover, I still need help with my issue because I need to capture the recording for further processing.
  3. There is still some echo. How can I adjust it?

jackyetz commented 1 year ago

Using the following code I got the recording to play. Sadly, the voice is severely distorted; I can hardly make out single words. Moreover, memory consumption keeps rising.

public class RecordDissonance : MonoBehaviour
{
    public int samplerate = 48000;

    DissonanceComms DC;
    AudioSource audiosrc;
    AudioClip audioclip;
    MyMicrophoneSubscriber MS = new MyMicrophoneSubscriber();

    // Start is called before the first frame update
    void Start()
    {
        DC = GameObject.Find("DissonanceComms").GetComponent<DissonanceComms>();
        audiosrc = GetComponent<AudioSource>();

        DC.SubscribeToRecordedAudio(MS);
        StartCoroutine(WaitRemoteClip());
    }
    IEnumerator WaitRemoteClip()
    {
        while (true)
        {
            yield return audioclip = AudioClip.Create("MySinusoid", samplerate * 2, 1, samplerate, false, MS.OnAudioRead, MS.OnAudioSetPosition);
            audiosrc.clip = audioclip;
            audiosrc.PlayOneShot(audioclip);
        }
    }
    private void Update()
    {
        MS.Update();
    }
}

public class MyMicrophoneSubscriber : BaseMicrophoneSubscriber
{
    public int position = 0;
    List<float> bufferlist = new List<float>();
    protected override void ProcessAudio(ArraySegment<float> data)
    {
        bufferlist.AddRange(data.ToArray());
    }

    protected override void ResetAudioStream(WaveFormat waveFormat)
    {
        bufferlist.Clear();
    }
    public override void Update()
    {
        base.Update();
    }
    //After you have copied some amount of data from the bufferlist into the data array,
    // you need to remove exactly that much data from the start of the buffer.
    public void OnAudioRead(float[] data)
    {
        if (bufferlist.Count < data.Length) return;
        bufferlist.CopyTo(0, data, 0, data.Length);
        bufferlist.RemoveRange(0, data.Length);
    }

    public void OnAudioSetPosition(int newPosition)
    {
        position = newPosition;
    }
}

martindevans commented 1 year ago

  1. The demo app never asks for Recording Permission when starting running it

You need to use Permission.RequestUserPermission in a proper app (the demo doesn't do this).
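A minimal sketch of such a request, assuming the script is attached to an object in the first scene. `MicrophonePermissionRequester` is a hypothetical component name; `Permission` is Unity's `UnityEngine.Android.Permission` class.

```csharp
using UnityEngine;
#if UNITY_ANDROID
using UnityEngine.Android;
#endif

// Hypothetical helper component: asks for microphone access once at startup on Android.
public class MicrophonePermissionRequester : MonoBehaviour
{
    void Start()
    {
#if UNITY_ANDROID
        // Only show the system dialog if the user has not already granted access.
        if (!Permission.HasUserAuthorizedPermission(Permission.Microphone))
            Permission.RequestUserPermission(Permission.Microphone);
#endif
    }
}
```

The request is asynchronous from the user's point of view, so voice capture may need to wait until the permission has actually been granted.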

  2. I cannot tell the difference between mine and the demo except for an additional "Loopback Audio" gameobject in the demo. Moreover, I still need help on my issues because I need capture the recording for further processing.

I don't quite understand the question, sorry.

  3. There is still a few echo. How to adjust it.

As you can see in your screenshot the echo cancellation system is still "initialising", it usually requires about 5-15 seconds to start itself. It's very important that there is some sound for it to process! It cannot initialise if there is dead silence, or just occasional sounds. Consider adding some background music, or sound effects (e.g. a "joined call" jingle) to give the AEC something to work with.

...code...

while (true) { /* create clip */ }

This loop creates a new clip every single frame and never destroys it! That's why your memory consumption is constantly increasing!

jackyetz commented 1 year ago

With the code in my previous reply, the captured AudioClip plays back a distorted voice. I can tell it is my voice, but the distortion is too severe to make out single words. What causes this distortion?

martindevans commented 1 year ago

I think constantly making and playing a new clip will distort the voice (since each clip will not perfectly connect to the next clip).

jackyetz commented 1 year ago

> I think constantly making and playing a new clip will distort the voice (since each clip will not perfectly connect to the next clip).

Is there another way to get an AEC recording? There is no AEC asset other than yours. Could you extract the AEC function into a standalone asset that can be applied to user-created recordings? I would be the first to buy it.

martindevans commented 1 year ago

Constantly creating a new clip is not necessary.

Instead you should set the stream parameter to true when you call AudioClip.Create, this means Unity will constantly keep calling your OnAudioRead method to fetch more audio when it needs it. Then you can just have one single clip.

jackyetz commented 1 year ago

> Constantly creating a new clip is not necessary.

> Instead you should set the stream parameter to true when you call AudioClip.Create, this means Unity will constantly keep calling your OnAudioRead method to fetch more audio when it needs it. Then you can just have one single clip.

I rewrote the code as follows, but the playback is delayed and the speech is much too slow. I suspect this is caused by a wrong sample rate. Any suggestions? In my code, samplerate = 48000.

void Start()
    {
        comms = GameObject.Find("DissonanceComms").GetComponent<DissonanceComms>();
        audiosrc = GetComponent<AudioSource>();

        comms.SubscribeToRecordedAudio(MicS);
        StartCoroutine(WaitRemoteClip());
    }
    IEnumerator WaitRemoteClip()
    {
        yield return audioclip = AudioClip.Create("MySinusoid", 4096, 1, samplerate, true, MicS.OnAudioRead, MicS.OnAudioSetPosition);
        audiosrc.clip = audioclip;
        audiosrc.loop = true;
        audiosrc.Play();
    }

martindevans commented 1 year ago

I've just spent quite a lot of time debugging this, and unfortunately I think it might be hitting a bug in Unity.

The entire script is here. A few important changes:

However this does not work. I spent a long time trying to work out why. Eventually I tried the sample code from the Unity documentation here to play a simple sine wave... this doesn't work! The pitch is irregular and the audio stream is constantly reset for some reason. Unfortunately I don't think I can fix or workaround that, AudioClip.Create seems to be broken. It's lucky that we don't use that in Dissonance itself!

This means getting an AudioClip with the data is going to be difficult/impossible. What exactly are you trying to achieve? Maybe I can think of another way to do it.

jackyetz commented 1 year ago

> This means getting an AudioClip with the data is going to be difficult/impossible. What exactly are you trying to achieve? Maybe I can think of another way to do it.

Thanks very much, Martin. I really appreciate your guidance and help.

Actually, I need an AEC recording; either an AudioClip or a float[] is fine. I want to play the recorded data on the local device and send it to a remote server for Automatic Speech Recognition. Any suggestions? If local playback is not possible, just sending it to the remote server is an acceptable alternative.

martindevans commented 1 year ago

If you just need the raw audio as float[] you can probably use the data given to you in ProcessAudio(ArraySegment<float> data).

At the moment you're building the raw data up into the bufferlist. Instead of packing this data into an AudioClip, can you send it to your server directly instead?
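As a rough sketch of what sending the data directly could involve: many speech services accept 16-bit PCM, so one option is to convert each ProcessAudio segment to bytes before transmitting it. The 16-bit little-endian format is an assumption here (the thread does not say what the server expects), and `PcmEncoder` is a hypothetical helper, not part of Dissonance.

```csharp
using System;

public static class PcmEncoder
{
    // Converts float samples (nominally in [-1, 1]) to 16-bit signed little-endian PCM.
    // Only the segment's Offset/Count window is read; out-of-range samples are clamped.
    public static byte[] ToPcm16(ArraySegment<float> samples)
    {
        var bytes = new byte[samples.Count * 2];
        if (samples.Array == null) return bytes;

        for (int i = 0; i < samples.Count; i++)
        {
            float clamped = Math.Max(-1f, Math.Min(1f, samples.Array[samples.Offset + i]));
            short value = (short)Math.Round(clamped * short.MaxValue);
            bytes[2 * i] = (byte)(value & 0xFF);            // low byte first
            bytes[2 * i + 1] = (byte)((value >> 8) & 0xFF); // then high byte
        }
        return bytes;
    }
}
```

The returned array is a fresh copy, so it can be queued for a network send after ProcessAudio returns without holding a reference to the original data. The server also needs to know the sample rate reported by ResetAudioStream.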

jackyetz commented 1 year ago

Hahaha. Playback is normal now and the AEC works. It seems I changed nothing more, yet it works. There is one final issue: on the Android phone there is occasional background noise. Sometimes it is infrequent and sometimes very frequent. Any suggestions for reducing it?

The following is the final code, in the hope that it helps people in need.

using System.Collections;
using System.Collections.Generic;
using System;
using UnityEngine;
using Dissonance;
using NAudio.Wave;
using System.Linq;

public class RecordDissonance : MonoBehaviour
{
    public const int samplerate = 48000;

    DissonanceComms comms;
    AudioSource audiosrc;
    AudioClip audioclip;
    MyMicrophoneSubscriber MicS = new MyMicrophoneSubscriber();

    // Start is called before the first frame update
    void Start()
    {
        comms = GameObject.Find("DissonanceComms").GetComponent<DissonanceComms>();
        audiosrc = GetComponent<AudioSource>();

        comms.SubscribeToRecordedAudio(MicS);
        StartCoroutine(WaitRemoteClip());
    }
    IEnumerator WaitRemoteClip()
    {
        audioclip = AudioClip.Create("MySinusoid", 4096, 1, samplerate, true, MicS.OnAudioRead, MicS.OnAudioSetPosition);
        audiosrc.clip = audioclip;
        audiosrc.loop = true;
        audiosrc.Play();

        yield return null;
    }
    private void Update()
    {
        MicS.Update();
    }
}

public class MyMicrophoneSubscriber : BaseMicrophoneSubscriber
{
    List<float> bufferlist = new List<float>();
    int minlength;

    protected override void ProcessAudio(ArraySegment<float> data)
    {
        Debug.Log("ProcessAudio: " + data.Count.ToString());
        bufferlist.AddRange(data.ToArray());
    }

    protected override void ResetAudioStream(WaveFormat waveFormat)
    {
        if (waveFormat.SampleRate != RecordDissonance.samplerate)
            throw new NotImplementedException("Wrong sample rate");
        Debug.Log("ResetAudioStream");
        bufferlist.Clear();
    }
    public override void Update()
    {
        base.Update();
    }
    //After you have copied some amount of data from the bufferlist into the data array,
    // you need to remove exactly that much data from the start of the buffer.
    public void OnAudioRead(float[] data)
    {
        Debug.Log("OnAudioRead: " + bufferlist.Count.ToString() + ", " + data.Length.ToString());
        //Debug.Log("OnAudioRead: ");
        minlength = (bufferlist.Count < data.Length) ? bufferlist.Count : data.Length;
        if (minlength > 0)
        {
            bufferlist.CopyTo(0, data, 0, minlength);
            bufferlist.RemoveRange(0, minlength);
        }
        else
            Array.Clear(data, 0, data.Length);
    }

    public void OnAudioSetPosition(int newPosition)
    {
        Debug.Log("OnAudioSetPosition: "+ newPosition.ToString());
    }
}

martindevans commented 1 year ago

> There is a final issue, that in the android cellphone, having background noise occasionally.

It depends on exactly what kind of noise.