filoe / cscore

An advanced audio library, written in C#. Provides tons of features. From playing/recording audio to decoding/encoding audio streams/files to processing audio data in realtime (e.g. applying custom effects during playback, create visualizations,...). The possibilities are nearly unlimited.

How to record the mic to left channel and speakers to right channel. #391

Closed BerBevans closed 4 years ago

BerBevans commented 5 years ago

Just in case anyone else is looking to do this, I have found how to do it and the solution is posted below.

I will start this question by saying that I am very new (started today) with audio manipulation so I apologise in advance if this is an obvious misunderstanding from my side.

I need to write a utility that records the microphone and the speakers to the same wav file, but with the two sources on different channels, e.g. mic on the left channel, speakers on the right. The reason is that we are recording phone calls and analysing the content through speech recognition. With the staff member on the left channel and the customer on the right, we can establish who is talking.

Using two of your sample projects and some code posted in https://github.com/filoe/cscore/issues/335, I have come up with the following example. It mixes both wave sources and records to a file, but the same audio is recorded to each channel, effectively making it a 2-channel mono file. Can you please point me in the right direction to correctly record the sources to separate channels?

Thank you in advance.

Ben

`
    // Most of this taken from sample code
    private void StartCapture(string fileName)
    {
        if (SelectedDevice == null)
            return;

        // configure speaker capture
        _speakersIn = new WasapiLoopbackCapture();
        _speakersIn.Device = SelectedDevice;
        _speakersIn.Initialize();

        // configure mic capture (system default)
        _micIn = new WasapiCapture(true, AudioClientShareMode.Shared, 30);
        _micIn.Initialize();

        // assign sound inputs to sources
        _speakersInSource = new SoundInSource(_speakersIn);
        _micInSource = new SoundInSource(_micIn);

        // Set a constant sampling rate so all samples are the same.
        const int mixerSampleRate = 8000; // 8 kHz

        _mixer = new SimpleMixer(2, mixerSampleRate) // output: stereo at mixerSampleRate
        {
            FillWithZeros = false,
            DivideResult = false //you may play around with this
        };

        // Used later as output variables
        VolumeSource volumeSource1, volumeSource2;

        var monoToLeftOnlyChannelMatrix = new ChannelMatrix(ChannelMask.SpeakerFrontCenter,
          ChannelMask.SpeakerFrontLeft | ChannelMask.SpeakerFrontRight);
        var monoToRightOnlyChannelMatrix = new ChannelMatrix(ChannelMask.SpeakerFrontCenter,
            ChannelMask.SpeakerFrontLeft | ChannelMask.SpeakerFrontRight);

        // used to set the volume on left channel to 100%, right to 0%
        monoToLeftOnlyChannelMatrix.SetMatrix(
           new[,]
           {
                {1.0f, 0.0f}
           });

        // used to set the volume on left channel to 0%, right to 100%
        monoToRightOnlyChannelMatrix.SetMatrix(
            new[,]
            {
                {0.0f, 1.0f}
            });

        // Add speakers to mixer
        _mixer.AddSource(
                _speakersInSource
                .ChangeSampleRate(mixerSampleRate)  // standardise sample rate
                .ToMono()  // Convert to mono ready for re-sampling
                .AppendSource(x => new DmoChannelResampler(x, monoToLeftOnlyChannelMatrix, mixerSampleRate)) // Assign to left channel
                .AppendSource(x => new VolumeSource(x.ToSampleSource()), out volumeSource1));  // assign volume source so volume can be altered.

        // add mic to mixer
        _mixer.AddSource(
                _micInSource
                .ChangeSampleRate(mixerSampleRate)
                .ToMono()
                .AppendSource(x => new DmoChannelResampler(x, monoToRightOnlyChannelMatrix, mixerSampleRate))
                .AppendSource(x => new VolumeSource(x.ToSampleSource()), out volumeSource2));

        // adjust the volume of the input signals(default value is 100 %):
        volumeSource1.Volume = 0.5f; //set the volume to 50%
        volumeSource2.Volume = 0.7f; //set the volume to 70%

        // prepare the wave file to be written
        var singleBlockNotificationStream = new SingleBlockNotificationStream(_mixer);
        _finalSource = singleBlockNotificationStream.ToWaveSource();
        _writer = new WaveWriter(fileName, _finalSource.WaveFormat);

        byte[] buffer = new byte[_finalSource.WaveFormat.BytesPerSecond / 2];
        _speakersInSource.DataAvailable += (s, e) =>
        {
            int read;
            while ((read = _finalSource.Read(buffer, 0, buffer.Length)) > 0)
                _writer.Write(buffer, 0, read);
        };

        singleBlockNotificationStream.SingleBlockRead += SingleBlockNotificationStreamOnSingleBlockRead;

        // start listening to the speakers and mic
        _speakersIn.Start();
        _micIn.Start();
    }

`
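
For reference, the goal described above can be stated independently of CSCore: a stereo WAV file stores its samples interleaved (left, right, left, right, ...), so "mic on the left, speakers on the right" means the mic samples sit at even indices and the speaker samples at odd indices. The following is a minimal sketch of that layout only, assuming both inputs are mono float buffers at the same sample rate; ChannelInterleaver and Interleave are hypothetical names introduced for illustration and are not part of CSCore.

```csharp
using System;

public static class ChannelInterleaver
{
    // Builds interleaved stereo frames: left[i] goes to index 2*i, right[i] to 2*i + 1.
    // If one input is shorter, its missing samples are written as silence (0).
    public static float[] Interleave(float[] left, float[] right)
    {
        if (left == null) throw new ArgumentNullException(nameof(left));
        if (right == null) throw new ArgumentNullException(nameof(right));

        int frames = Math.Max(left.Length, right.Length);
        var stereo = new float[frames * 2];
        for (int i = 0; i < frames; i++)
        {
            stereo[2 * i] = i < left.Length ? left[i] : 0f;
            stereo[2 * i + 1] = i < right.Length ? right[i] : 0f;
        }
        return stereo;
    }
}
```

Calling Interleave(micSamples, speakerSamples) yields a buffer in which neither source leaks into the other channel, which is the property the speech-recognition step relies on.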

BerBevans commented 5 years ago

I have found a way to do this. I mix two stereo sources, one from the microphone and one from the speakers. Before mixing, I pan each source fully to one side, and to opposite sides, so each source carries audio on only one channel. When the two are mixed, the channel with audio from one source is combined with the silent channel of the other.
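
The idea can be sketched on plain interleaved buffers. This is an illustration of the principle only, assuming interleaved stereo float samples; PanMixSketch, HardPan and Mix are hypothetical names, and the hard pan shown here is a simplification, not the implementation of CSCore's PanSource.

```csharp
using System;

public static class PanMixSketch
{
    // Zeroes one channel of an interleaved stereo buffer (L, R, L, R, ...), in place.
    public static void HardPan(float[] stereo, bool keepLeft)
    {
        // Index of the channel to silence within each frame: 1 = right, 0 = left.
        int silenced = keepLeft ? 1 : 0;
        for (int i = silenced; i < stereo.Length; i += 2)
            stereo[i] = 0f;
    }

    // Sums two interleaved stereo buffers sample by sample.
    public static float[] Mix(float[] a, float[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        var mixed = new float[n];
        for (int i = 0; i < n; i++)
            mixed[i] = a[i] + b[i];
        return mixed;
    }
}
```

Because each panned buffer is silent on the channel where the other one has audio, the sum never adds two non-zero samples together, so the sources stay separated after mixing.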

Note: there are two issues that I haven't ironed out, but I suspect that they are related.

  1. The recorded audio is a bit choppy. This seems to come from the speaker capture. I am not sure whether this is because all of the re-sampling is done in real time and written straight to the one pre-mixed file, rather than recording each source to its own file and mixing the two files after the recording has finished.
  2. Writing to the file is triggered by the speaker source detecting audio. If no sound is playing from the speakers, nothing is written to disk. This could cause issues with the microphone filling buffers. Having two IWaveSources linked to the same WasapiLoopbackCapture object may be causing the choppy audio mentioned in point 1.

Without further ado, here is my code. To get this working, I modified the 'Recorder' sample project.

// SimpleMixer.cs (copied from the 'SimpleMixerSample' project, with the line that removes sources commented out)


using System;
using System.Collections.Generic;
using CSCore;

namespace Recorder
{
    public class SimpleMixer : ISampleSource
    {
        private readonly WaveFormat _waveFormat;
        private readonly List<ISampleSource> _sampleSources = new List<ISampleSource>();
        private readonly object _lockObj = new object();
        private float[] _mixerBuffer;

        public bool FillWithZeros { get; set; }

        public bool DivideResult { get; set; }

        public SimpleMixer(int channelCount, int sampleRate)
        {
            if(channelCount < 1)
                throw new ArgumentOutOfRangeException("channelCount");
            if(sampleRate < 1)
                throw new ArgumentOutOfRangeException("sampleRate");

            _waveFormat = new WaveFormat(sampleRate, 32, channelCount, AudioEncoding.IeeeFloat);
            FillWithZeros = false;
        }

        public void AddSource(ISampleSource source)
        {
            if (source == null)
                throw new ArgumentNullException("source");

            if(source.WaveFormat.Channels != WaveFormat.Channels ||
               source.WaveFormat.SampleRate != WaveFormat.SampleRate)
                throw new ArgumentException("Invalid format.", "source");

            lock (_lockObj)
            {
                if (!Contains(source))
                    _sampleSources.Add(source);
            }
        }

        public void RemoveSource(ISampleSource source)
        {
            //don't throw null ex here
            lock (_lockObj)
            {
                if (Contains(source))
                    _sampleSources.Remove(source);
            }
        }

        public bool Contains(ISampleSource source)
        {
            if (source == null)
                return false;
            return _sampleSources.Contains(source);
        }

        public int Read(float[] buffer, int offset, int count)
        {
            int numberOfStoredSamples = 0;

            if (count > 0 && _sampleSources.Count > 0)
            {
                lock (_lockObj)
                {
                    _mixerBuffer = _mixerBuffer.CheckBuffer(count);
                    List<int> numberOfReadSamples = new List<int>();
                    for (int m = _sampleSources.Count -1; m >= 0; m--)
                    {
                        var sampleSource = _sampleSources[m];
                        int read = sampleSource.Read(_mixerBuffer, 0, count);
                        for (int i = offset, n = 0; n < read; i++, n++)
                        {
                            if (numberOfStoredSamples <= i)
                                buffer[i] = _mixerBuffer[n];
                            else
                                buffer[i] += _mixerBuffer[n];
                        }
                        if (read > numberOfStoredSamples)
                            numberOfStoredSamples = read;

                        if (read > 0)
                            numberOfReadSamples.Add(read);
                        else
                        {
                            //raise event here
                            // RemoveSource(sampleSource); //remove the input to make sure that the event gets only raised once.
                            // Commented out to stop source being removed
                        }
                    }

                    if (DivideResult)
                    {
                        numberOfReadSamples.Sort();
                        int currentOffset = offset;
                        int remainingSources = numberOfReadSamples.Count;

                        foreach (var readSamples in numberOfReadSamples)
                        {
                            if (remainingSources == 0)
                                break;

                            while (currentOffset < offset + readSamples)
                            {
                                buffer[currentOffset] /= remainingSources;
                                buffer[currentOffset] = Math.Max(-1, Math.Min(1, buffer[currentOffset]));
                                currentOffset++;
                            }
                            remainingSources--;
                        }
                    }
                }
            }

            if (FillWithZeros && numberOfStoredSamples != count)
            {
                // Zero the unfilled tail, which starts right after the last stored sample.
                Array.Clear(
                    buffer,
                    offset + numberOfStoredSamples,
                    count - numberOfStoredSamples);

                return count;
            }

            return numberOfStoredSamples;
        }

        public bool CanSeek { get { return false; } }

        public WaveFormat WaveFormat
        {
            get { return _waveFormat; }
        }

        public long Position
        {
            get { return 0; }
            set
            {
                throw new NotSupportedException();
            }
        }

        public long Length
        {
            get { return 0; }
        }

        public void Dispose()
        {
            lock (_lockObj)
            {
                foreach (var sampleSource in _sampleSources.ToArray())
                {
                    sampleSource.Dispose();
                    _sampleSources.Remove(sampleSource);
                }
            }
        }
    }
}

// MainWindow.cs
// Created a new enum (bottom of the file)
public enum PanToChannel
{
    Left = 1,
    Right = -1,
    Equal = 0
}

// Changed the global variables


        //Change this to CaptureMode.Capture to capture a microphone,...
        private const CaptureMode CaptureMode = Recorder.CaptureMode.LoopbackCapture;

        private MMDevice _selectedDevice;
        private WasapiCapture _speakersIn;
        private WasapiCapture _micIn;        
        private IWaveSource _speakersInSource;
        private IWaveSource _micInSource;        
        private PanSource _speakerPan;
        private PanSource _micPan;
        private SimpleMixer _mixer;              
        private IWaveSource _finalSource;
        private IWriteable _wavWriter;
        private readonly GraphVisualization _graphVisualization = new GraphVisualization();

// Updated 'StopCapture()' to handle new and renamed global variables.

        private void StopCapture()
        {
            if (_speakersIn != null)
            {
                // clean up all global properties
                _speakersIn.Stop();
                _speakersIn.Dispose();
                _speakersIn = null;

                if (_micIn != null)
                {
                    _micIn.Stop();
                    _micIn.Dispose();
                    _micIn = null;
                }

                if (_speakersInSource != null)
                    _speakersInSource.Dispose();

                if (_micInSource != null)
                    _micInSource.Dispose();

                if (_speakerPan != null)
                    _speakerPan.Dispose();

                if (_micPan != null)
                    _micPan.Dispose();

                if (_mixer != null)
                    _mixer.Dispose();

                if (_finalSource != null)
                    _finalSource.Dispose();                    

                if (_wavWriter is IDisposable)
                    ((IDisposable)_wavWriter).Dispose();

                btnStop.Enabled = false;
                btnStart.Enabled = true;
            }
        }

// Updated 'StartCapture(string fileName)' and created some extra functions


private void StartCapture(string fileName)
        {
            if (SelectedDevice == null)
                return;

            SetupInputDevices();

            #region set Pan sources
            // All sources need to be the same sample rate so take the speaker sample rate which is generally higher.
            int speakerSampleRate = _speakersInSource.WaveFormat.SampleRate;

            // Just in case speakers are recording a stereo sound, merge left and right channels so all sound is heard when the source is panned to one side.
            _speakerPan = ConfigurePanSource(_speakersInSource, speakerSampleRate, true, PanToChannel.Left);
            _micPan = ConfigurePanSource(_micInSource, speakerSampleRate, false, PanToChannel.Right);
            #endregion

            #region mix wave sources together
            // mix both sources into one
            MixMultipleSources(new List<ISampleSource>() { _speakerPan }, speakerSampleRate);
            MixMultipleSources(new List<ISampleSource>() { _speakerPan, _micPan }, speakerSampleRate);
            #endregion

            #region read from wave streams and write to file
            // need to create this sound in source so the 'DataAvailable' event will fire and start writing to file.
            var speakerInSource = new SoundInSource(_speakersIn);
            var singleBlockNotificationStream = new SingleBlockNotificationStream(_mixer);
            _finalSource = singleBlockNotificationStream.ToWaveSource();
            _wavWriter = new WaveWriter(fileName, _finalSource.WaveFormat);

            byte[] buffer = new byte[_finalSource.WaveFormat.BytesPerSecond / 2];
            speakerInSource.DataAvailable += (s, e) =>
            {
                int read;
                while ((read = _finalSource.Read(buffer, 0, buffer.Length)) > 0)
                    _wavWriter.Write(buffer, 0, read);
            };
            #endregion

            // populate visualisation tool
            singleBlockNotificationStream.SingleBlockRead += SingleBlockNotificationStreamOnSingleBlockRead;

            // start the sound input sources
            _speakersIn.Start();
            _micIn.Start();
        }

        private void SetupInputDevices()
        {
            _speakersIn = new WasapiLoopbackCapture();
            _speakersIn.Device = SelectedDevice;
            _speakersIn.Initialize();

            // configure mic capture (system default)
            _micIn = new WasapiCapture(true, AudioClientShareMode.Shared, 30);
            _micIn.Initialize();

            // Assign wave sources
            _speakersInSource = new SoundInSource(_speakersIn) { FillWithZeros = false };
            _micInSource = new SoundInSource(_micIn) { FillWithZeros = false };
        }

        private PanSource ConfigurePanSource(IWaveSource inputSource, int sampleRate, bool makeLeftAndRightChannelsEqualBeforePan, PanToChannel panDirection)
        {
            if (sampleRate != inputSource.WaveFormat.SampleRate)
                inputSource = inputSource.ChangeSampleRate(sampleRate);

            if (makeLeftAndRightChannelsEqualBeforePan)
                inputSource = inputSource.ToMono().ToStereo();

            var panSource = new PanSource(inputSource.ToSampleSource());

            panSource.Pan = (int)panDirection;

            return panSource;
        }

        private void MixMultipleSources(List<ISampleSource> waveSources, int sampleRate)
        {
            _mixer = new SimpleMixer(2, sampleRate) // output: stereo at the given sample rate
            {
                FillWithZeros = false,
                DivideResult = true // you may play around with this
            };

            foreach(var source in waveSources)
            {
                _mixer.AddSource(source);
            }
        }

oo-dev17 commented 3 years ago

Hi @BerBevans , thanks for your nice code, I hope I can learn from it. Do you think the SimpleMixer, like you modified it, is suitable for mixing 2 already existing wav files (that I recorded in advance). I wonder about the part where you start the mixer (with the comment "need to create this sound in source so the 'DataAvailable' event will fire...."). Thank you in advance!

BerBevans commented 3 years ago

You will need to forgive me. It feels like a lifetime since I was playing with this code, but I don't see why not. If I remember correctly the existing wav file is just an IWaveSource so you would be using the code to record 1 wav file to the left speaker and the other file to the right speaker.

filoe commented 3 years ago

The principle is always the same. When recording, the data gets pushed into the SoundInSource. When you already have existing files and are not recording, just pull the audio data through the chain. That is basically what the line while ((read = _finalSource.Read(buffer, 0, buffer.Length)) > 0) does. Just read from _finalSource as long as Read returns something greater than zero, and write the data in the buffer to a file.
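
The pull pattern can be sketched without any audio hardware. The following is a minimal illustration, assuming a source whose Read returns the number of samples copied and 0 once it is exhausted; ArraySource and PullLoop are hypothetical stand-ins introduced here, not CSCore types, and a List<float> takes the place of the file writer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a finite source: Read returns the number of samples
// copied, and 0 once all data has been consumed.
public sealed class ArraySource
{
    private readonly float[] _data;
    private int _position;

    public ArraySource(float[] data) { _data = data; }

    public int Read(float[] buffer, int offset, int count)
    {
        int n = Math.Min(count, _data.Length - _position);
        Array.Copy(_data, _position, buffer, offset, n);
        _position += n;
        return n;
    }
}

public static class PullLoop
{
    // Pulls everything out of the source in chunks, mirroring the while loop quoted above.
    public static List<float> Drain(ArraySource source, int chunkSize)
    {
        var output = new List<float>();
        var buffer = new float[chunkSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Only the first 'read' entries of the buffer are valid in this pass.
            for (int i = 0; i < read; i++)
                output.Add(buffer[i]);
        }
        return output;
    }
}
```

With files as input, no DataAvailable event is needed to drive the loop: the loop itself requests data until the chain reports that nothing is left.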

oo-dev17 commented 3 years ago

I thank you both a lot for the quick response! Florian, that sounds promising, but I have an issue before I reach there:

First I want to record mic and speaker separately and later mix them. So I need 2 files that have the same start time. Edit: So I enabled FillWithZeros on the soundInSource in your "Recorder app" from 'Samples', but this gives me huge files (some GB within seconds 😮)

(Removed my code)

If I don't enable FillWithZeros, it works as expected, but it only writes to the WAV file while a sound is being played (which is not suitable for me). Any idea?

oo-dev17 commented 3 years ago

I also tried BerBevans' original code, but in the mix the "speaker" capture stutters, while a pure speaker capture sounds flawless (but starts too late, not until an app plays a sound). @filoe, since I cannot enable "FillWithZeros" in the event-driven capturing, I am wondering what the loop-callback capturing is that you mention in the 'features' of CSCore? (I don't mean 'loopback' capturing)

filoe commented 3 years ago

Loopback means it captures the played sound of any output device. It won't fit your needs. In general, the FillWithZeros property has the following effect: if anyone requests data by calling the Read method, the source will try to return the requested amount of data. If there is not enough data to return, it fills up the rest of the requested data with zeros. Now if you keep calling Read in a loop, there won't be enough data, and it will keep returning the full requested amount, filled with zeros. The result is a huge file containing only zeros. The reason FillWithZeros exists is that if you want to play back the source and the requested amount of data is not available, the playback will stop. The FillWithZeros property fills up the missing part of the data with silence and prevents the playback from stopping.
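
The effect described above can be reproduced with a small model. This is a sketch under stated assumptions, not CSCore's SoundInSource: LiveSourceModel and FillWithZerosDemo are hypothetical names, the "captured" data is a fixed array, and the read loop is capped at maxReads so the demonstration terminates.

```csharp
using System;

// Hypothetical model of a source with a FillWithZeros switch.
public sealed class LiveSourceModel
{
    private readonly float[] _captured;
    private int _position;

    public bool FillWithZeros { get; set; }

    public LiveSourceModel(float[] captured) { _captured = captured; }

    public int Read(float[] buffer, int offset, int count)
    {
        int n = Math.Min(count, _captured.Length - _position);
        Array.Copy(_captured, _position, buffer, offset, n);
        _position += n;

        if (FillWithZeros && n < count)
        {
            // Pad the rest of the request with silence and report a full read.
            Array.Clear(buffer, offset + n, count - n);
            return count;
        }
        return n;
    }
}

public static class FillWithZerosDemo
{
    // Counts the samples a "read until 0" loop would write, giving up after maxReads calls.
    public static long CountWrittenSamples(LiveSourceModel source, int chunkSize, int maxReads)
    {
        var buffer = new float[chunkSize];
        long written = 0;
        for (int i = 0; i < maxReads; i++)
        {
            int read = source.Read(buffer, 0, chunkSize);
            if (read <= 0)
                break;
            written += read;
        }
        return written;
    }
}
```

With FillWithZeros off, the loop stops as soon as the captured data runs out. With it on, Read never returns 0, so the amount written is limited only by how often the loop runs, which matches the multi-gigabyte files reported above.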

Please tell me exactly what you are trying to achieve. What kind of sources do you want to mix and write them to a file? Are you trying to record a microphone or an output device?

EDIT: Just read that you were asking about loop-callback, not loopback capture. When playing or recording audio, you have to interact with the soundcard. In the case of a playback, you have to serve audio data to the soundcard at a certain interval so that the playback won't stop. There are two ways to do this with the WASAPI interfaces: 1) event callback: block the thread and wait for the EventHandle to be signalled, or 2) loop callback: loop at a certain interval (the latency), check whether new data needs to be served, and serve it.
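
The two waiting strategies can be sketched generically. This is plain .NET threading code, not WASAPI or CSCore code; CallbackStyles, WaitForEvent and Poll are hypothetical names, and the AutoResetEvent and the hasData delegate stand in for whatever the audio API would provide.

```csharp
using System;
using System.Threading;

public static class CallbackStyles
{
    // Event callback: block until the producer signals, or until the timeout elapses.
    public static bool WaitForEvent(AutoResetEvent dataReady, int timeoutMs)
    {
        return dataReady.WaitOne(timeoutMs);
    }

    // Loop callback: wake up every latencyMs and ask whether data is pending.
    public static bool Poll(Func<bool> hasData, int latencyMs, int maxPolls)
    {
        for (int i = 0; i < maxPolls; i++)
        {
            if (hasData())
                return true;
            Thread.Sleep(latencyMs);
        }
        return false;
    }
}
```

The event variant reacts as soon as it is signalled but depends on the signal arriving at all, while the polling variant wakes up on a fixed schedule whether or not anything happened.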

oo-dev17 commented 3 years ago

I don't mean "Loopback recording" but "Loop callback instead of event callback", as you mentioned in the "Features" section.

but here is my problem...

The title of this thread describes it quite well: in the end I need a WAV with the mixed (L/R panned) content of 2 specified audio sources (1 capture device and 1 render device -> mic and speaker -> aka 1 WasapiCapture and 1 WasapiLoopbackCapture, right?). I would be happy with either solution: direct mixing of live data OR first recording separately and mixing the 2 files later.

But the "direct mix" (let's say approach 'A', according to the code above with the 'SimpleMixer') gives me good sound on the mic device but stuttering sound on the output device (the useful bytes in the resulting wav are periodically interrupted by a run of zero values). This happens no matter whether I use the sample rate of the mic or the sample rate of the speaker as the common rate for the mixer. The frequencies of the audio are okay, not pitched.

Approach 'B': capturing like your "Recorder" sample x 2, to mix them later, gives good sound, but the speaker recording file does not start until an application plays a sound, so I cannot mix them later in sync.

Approach 'C': Enabling 'FillWithZeros' in approach 'B' to avoid the start-gap, produces the huge files.

Thanks a lot for your effort!

(just tell me if you need the code lines or a dummy project/solution)

EDIT: @filoe ... Now I see your 'edit' about loop-callback 😊 ... if useful to try it out: is there any sample code about this method of capturing?

oo-dev17 commented 3 years ago

I see a workaround for approach 'B': I continuously play a very quiet tone with your SineGenerator on the speaker/capture device, so that the DataAvailable event hopefully fires continuously, and I end up with 2 files that I can mix in sync afterwards ... -> Seems to work 😎
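
The signal used in such a workaround can be sketched as follows. This is a standalone illustration of generating a low-amplitude sine, not CSCore's SineGenerator; QuietTone and Generate are hypothetical names, and the amplitude in the usage note is an assumed example value, not one taken from the thread.

```csharp
using System;

public static class QuietTone
{
    // Generates mono sine samples in the range [-amplitude, amplitude].
    public static float[] Generate(double frequencyHz, double amplitude, int sampleRate, int sampleCount)
    {
        var samples = new float[sampleCount];
        for (int i = 0; i < sampleCount; i++)
            samples[i] = (float)(amplitude * Math.Sin(2.0 * Math.PI * frequencyHz * i / sampleRate));
        return samples;
    }
}
```

For example, Generate(1000, 0.0001, 44100, 441) produces 10 ms of a 1 kHz tone at an amplitude of 0.0001, which is 80 dB below full scale, so the samples are not all zero but should be practically inaudible.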