Closed achimmihca closed 4 months ago
Question: Should the amplification and noise suppression be per-channel or per-device?
I think per-channel would be better because it allows for more fine-grained control. For example, one could have one very good mic (which does not need noise suppression) and another pretty bad mic (which does need noise suppression) connected to the same sound card.
I'd vote for per-device, not per-channel, to reduce configuration overhead. In most (if not all) cases, the same type of microphone will be used on one sound card, so this assumption should be fine. Separate amplification/noise-suppression values per device are mostly required for use cases where microphones of different types are connected (for example SingStar microphones, which are very directional, and Nintendo Wii microphones, which are almost not directional at all and thus pick up much more noise or nearby players). Also, on Windows and macOS, you can only configure device sensitivity per input device, not per input channel.
Choose whatever you prefer. We'll probably have to fine-tune the settings later on anyways, when there is enough user-feedback available to analyze.
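To illustrate what the per-device option could look like, here is a minimal sketch. All names (MicSettings, MicProcessing, ForDevice, Apply) are hypothetical and not taken from the UltraStar Play code base, and the "noise suppression" shown is only a simple threshold gate chosen for illustration, not a proposal for the real algorithm.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical settings model: one instance per recording device.
public class MicSettings
{
    public float Amplification = 1f;      // linear gain factor, 1 = unchanged
    public float NoiseGateThreshold = 0f; // amplified samples below this absolute value are zeroed
}

public static class MicProcessing
{
    // Looks up the settings for a device name; falls back to defaults if none are stored.
    public static MicSettings ForDevice(Dictionary<string, MicSettings> byDevice, string deviceName)
    {
        return byDevice.TryGetValue(deviceName, out MicSettings settings) ? settings : new MicSettings();
    }

    // Applies gain and the (assumed) threshold gate in place to samples in the range [-1, 1].
    public static void Apply(float[] samples, MicSettings settings)
    {
        for (int i = 0; i < samples.Length; i++)
        {
            float s = samples[i] * settings.Amplification;
            if (Math.Abs(s) < settings.NoiseGateThreshold)
            {
                s = 0f;
            }
            samples[i] = Math.Max(-1f, Math.Min(1f, s)); // clamp to the valid sample range
        }
    }
}
```

A per-channel variant would only change the dictionary key, for example to a combination of device name and channel index, which is where the extra configuration overhead comes from.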
By the way, most implementations that I know of don't write the samples to the buffer alternating by channel, but instead write sampleCount samples of channel 1 to the buffer, then sampleCount samples of channel 2, and so on. I guess this choice makes sense because buffer/array copying is cheaper and has less overhead.
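To make the two layouts concrete, here is a minimal sketch of how a sample's position in the buffer would be computed in each case. The class and method names are hypothetical and not part of Unity or any audio library; which layout a given API actually uses has to be checked against its documentation.

```csharp
using System;

// Hypothetical helper, not part of Unity or any audio library.
public static class BufferLayout
{
    // Interleaved: samples alternate by channel, e.g. [c0, c1, c0, c1, ...].
    public static int InterleavedIndex(int frame, int channel, int channelCount)
    {
        return frame * channelCount + channel;
    }

    // Planar: all samples of channel 0 first, then all samples of channel 1, and so on.
    public static int PlanarIndex(int frame, int channel, int framesPerChannel)
    {
        return channel * framesPerChannel + frame;
    }
}
```

With 2 channels and 4 frames per channel, frame 1 of channel 1 sits at index 3 in the interleaved layout and at index 5 in the planar layout.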
After a few hours of trying and failing, it turns out there is no way to get more than the first audio channel from the documented API that is part of UnityEngine. I did not find any cross-platform way to access microphone input devices that would work for all our target platforms. CSCore and NAudio exist for accessing audio input devices and per-channel audio data on Windows. For iOS we would have to use OpenSL ES. For Android... well, apparently there is no way to get audio input from more than one device at a time, and for that device you can also only get data for the first channel. The only workaround would be to fetch the raw USB interface data directly and then try to write our own ASIO driver in C#/Mono managed code, which would probably be too slow. For Linux/macOS, there exist various other wrappers for PortAudio, ALSA, and others, in various incompatible versions.
The AudioClip created using UnityEngine.Microphone.Start(MicProfile.Name, true, 1, SampleRateHz) always has a channel count of exactly 1.
There exist a bunch of libraries to work around the problem, but they all either do not have a well-fitting license, bring lots of overhead, have bad latency, do not support recording from different sound cards at the same time, are abandoned, or require lots of work to integrate.
The FMOD API seems rather promising: it should be easy to use, supports all our target platforms, and the license would not really be a problem (freeware, not open source, for non-profit projects; a paid license is required for commercial projects). But it can do much more than what we need and thus might be too heavyweight.
PulseAudio: only macOS/Windows/Linux, and wrappers for C# are mostly out of date or abandoned; a minimal wrapper just for audio input devices with multichannel support should be small enough to be maintainable. We could use the Unity mono microphone code for platforms not supported by PulseAudio.
Maybe Native Audio Plugin SDK from Unity could help making it easier: https://docs.unity3d.com/Manual/AudioMixerNativeAudioPlugin.html
Do mobile devices (Android, iOS) have mics with multiple channels? I don't think so (at least not until an external mic is connected). (In general, I must admit I am unsure how you imagine playing UltraStar on a mobile device. It does not seem appealing to me to play the game on such a device.)
Similarly, I don't think there are mics with multiple channels for game consoles. I would expect the only mics for these devices to be for SingStar-like games, not music creation.
Thus, I would expect the Unity API to be sufficient for mobile and game consoles. Windows/Linux/Mac are the only platforms where I would expect mics with multiple channels.
Android: only one channel and one device up to Android 7. Android 8+ can support multiple channels and multiple devices.
iOS: supports multiple channels.
Nintendo Switch: supports some USB microphones like SingStar.
PlayStation: supports multiple channels for multiple devices.
Xbox: supports multiple channels for multiple devices, usually already has the Kinect multichannel microphone.
Playing UltraStar on Android is probably going to be a huge chunk of the game's use. Don't forget that by far most devices currently sold to consumers run Android: Android TVs, tablets, Android Auto, smartphones (we might want to offer an increased font size and font area for smaller screens).
Anyways, yes, for the first couple of versions of UltraStar Play, only supporting multiple audio channels on Windows/Linux/Mac should be ok. Later on, more developers might join the project, so there is something they could work on 😝
I have a prototype that uses PortAudio in Unity to record and play back mic samples.
PortAudio could make it possible to use specific mic channels on desktop platforms. This includes SingStar Wireless mics.
Furthermore, the latency of PortAudio could be lower compared to the Unity API.
Plus, PortAudio can also play back the recorded audio with minimum latency. This could make it possible to amplify a player's voice over the speakers, as is done in traditional karaoke.
Good news! Looking forward to testing the PortAudio implementation!
I implemented this successfully for the upcoming Steam release. For now, I did the implementation in a private repository.
Will this feature be also implemented in the Open Source version?
Yes, that's the idea.
However, I would first like to get compensation for my work. Therefore, I am currently targeting a Steam release because it seems most promising to me. Alternatives could be crowdfunding or a bounty approach.
> Plus, PortAudio can also play back the recorded audio with minimum latency. This could make it possible to amplify a player's voice over the speakers, as is done in traditional karaoke.
I tested this and the latency is still too high. I guess there is a reason why there is not a bunch of apps for this (on Windows). Near-zero latency requires a solution on hardware or at least driver level.
Thus, if you want to hear your singing from the speakers, I recommend using an audio interface that can route the input directly to the output.
Any update on when the steam version will be shipped? I'm still waiting to play with two microphones...
It was actually released (as early access) a few days ago.
Thanks for the quick reply. But I need a version for Mac which exists in this repository.
I expect a macOS release of Melody Mania to take much more time. When I have all planned features ready (including online multiplayer and modding support), I will focus on other platforms. I just cannot maintain multiple platforms at the same time while developing new stuff.
Actual behaviour
At the moment, the recording options do not allow selecting which channel of a device should be used for recording.
Expected behaviour
The recording options should allow selecting a specific channel of a recording device.
Implementation notes
I think AudioClip.channels should contain the number of channels for the recording device. The AudioClip is created using the Microphone API of Unity. In the MicrophonePitchTracker, the call to micAudioClip.GetData(MicData, currentSamplePosition) fills a float array with the samples of the recording. I expect that the samples of all channels alternate in this float array. For example, when there are three channels, I would expect the float array to contain the data for the channels 0, 1, and 2 as follows: [0,1,2,0,1,2,0,1,2,...]
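If the buffer really were interleaved as expected here, picking out one channel could look roughly like the following sketch. ChannelExtractor and ExtractChannel are hypothetical names, not part of Unity; the code only assumes a float array with alternating channel samples and a known channel count.

```csharp
using System;

// Hypothetical helper; assumes the buffer holds alternating channel samples.
public static class ChannelExtractor
{
    // Copies every channelCount-th sample, starting at channelIndex, into a new array.
    public static float[] ExtractChannel(float[] interleaved, int channelCount, int channelIndex)
    {
        if (channelCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(channelCount));
        }
        if (channelIndex < 0 || channelIndex >= channelCount)
        {
            throw new ArgumentOutOfRangeException(nameof(channelIndex));
        }

        int frameCount = interleaved.Length / channelCount; // a trailing partial frame is ignored
        float[] result = new float[frameCount];
        for (int frame = 0; frame < frameCount; frame++)
        {
            result[frame] = interleaved[frame * channelCount + channelIndex];
        }
        return result;
    }
}
```

For the example array [0,1,2,0,1,2,0,1,2] with three channels, extracting channel 1 would yield [1,1,1]. The pitch tracker could then analyze only the extracted array for the selected channel.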