bmc0 / dsp

An audio processing program with an interactive mode.
ISC License

Feature request - Ambiophonics effect #54

Open aqxa1 opened 4 years ago

aqxa1 commented 4 years ago

Ambiophonics is a method of sound reproduction that uses crosstalk cancellation between 2 or 4 speakers (the 4 speaker variant is called Panambiophonics) to produce more realistic imaging, including a wider soundstage and a stronger central image (no phantom image). Unlike other methods, it is intended to be used with existing stereo and multichannel material. The effect can be achieved passively by positioning the two speakers less than 30 degrees apart and separating them with an acoustic barrier. That is rather impractical for most people, so the effect can also be achieved through DSP combined with the appropriate level of speaker separation (< 30 degrees).

Consequently, Recursive Ambiophonic Crosstalk Elimination (RACE) was invented. Here's a block diagram that can likely explain it better than I can: [image: Block diagram of RACE]

And an extract from this paper (Glasgal, 2007):

The basic crosstalk cancelling technique the Ambiophonic team has developed (and are making available free to the audio community) is Recursive Ambiophonic Crosstalk Elimination or RACE. Recursive is the operating word. When a signal from the left speaker undesirably reaches the right ear, it must be cancelled acoustically at that ear by an inverted, slightly delayed, slightly lower level replica from the right speaker. But this cancellation signal will also then journey on to the left ear and so it must also be cancelled (2nd order cancellation) by a properly conditioned signal from the left speaker, which signal then also reaches the right ear requiring another round (3rd order) of cancellation and so on. For a greater tolerance for non-ideal speakers, to avoid frequency response errors, and to enlarge the ideal listening area, this recursive “ping-pong” correction needs to be carried out to inaudibility. It has been demonstrated that some five people can hear the same wide stage, even from two small loud speakers in front of them, using this method.

The same article also includes equations for implementing the recursive portion. I suspect it would be fairly straightforward for someone with an in-depth knowledge of C and audio DSP, but I could be wrong.
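
To make the "ping-pong" described in the extract concrete, here is a minimal sketch of a recursive crosstalk canceller. It is not taken from the paper's equations: the function name `race`, the whole-sample delay, the default delay and attenuation values, and the absence of any band-limiting filter are all simplifying assumptions made for illustration.

```python
def race(left, right, delay_samples=3, attenuation_db=2.5):
    """Sketch of recursive crosstalk cancellation.

    Assumptions (not from the thread or paper): delay is a whole number of
    samples (>= 1), the defaults are placeholder values, and no filtering
    is applied to the cancellation signal.
    """
    gain = 10.0 ** (-attenuation_db / 20.0)  # "slightly lower level"
    out_l, out_r = [], []
    for n in range(len(left)):
        # Feed back the *opposite output* (not the opposite input), delayed.
        # Using outputs is what makes every cancellation signal get
        # cancelled in turn, i.e. the recursion.
        fb_r = out_r[n - delay_samples] if n >= delay_samples else 0.0
        fb_l = out_l[n - delay_samples] if n >= delay_samples else 0.0
        out_l.append(left[n] - gain * fb_r)   # inverted, delayed, attenuated
        out_r.append(right[n] - gain * fb_l)
    return out_l, out_r
```

Feeding a single impulse into the left input yields alternating terms on the two outputs (`-gain` on the right after one delay, `+gain**2` on the left after two delays, and so on), which is the 1st, 2nd, 3rd order cancellation sequence the extract describes.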

In any case, it would be nice to see a FOSS implementation of Ambiophonics, since, to my knowledge, there is no such implementation at this time. There's not even anything directly supported by Linux (there was a Java transcoder that has since disappeared and never worked that well). You can use Windows VST plugins such as Filmaker Ambiophonics DSP (paid) or Ambio-One (free as in beer, but abandoned), but that's a little inconvenient. There's also a hardware implementation with a MiniDSP 2x4, which I'm currently using, but that starts getting expensive if you want to do 4 channel panambiophonics (requiring two miniDSP 2x4). That aspect should be doable with two instances of dsp and, with the source material routed correctly, it can work with stereo material as well by sending the same signal to two RACE processors.

Key features:

Recommended features:

Nice to have:

Further reading:

Thank you for reading this, and I hope I don't come across as too demanding here given it's an unpaid project in your free time. I just really enjoy using this effect in my system. I would have a crack at it myself, but I suspect it's a bit beyond my knowledge level at this point.

bmc0 commented 4 years ago

The basic algorithm looks fairly simple and easy to implement. The only part that's a bit complicated is the delay block. I assume that subsample delay is desirable; otherwise the delay could only be adjusted in 1 sample increments (~22.7µs at 44.1kHz).
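
As a rough illustration of the subsample delay point, the sketch below computes the sample period and delays a signal by a non-integer number of samples using linear interpolation. Linear interpolation is chosen only because it is the simplest option; it attenuates high frequencies somewhat, so a real implementation might prefer an allpass or windowed-sinc design. The function names are made up for this example.

```python
def sample_period_us(sample_rate_hz):
    """Duration of one sample in microseconds."""
    return 1e6 / sample_rate_hz


def fractional_delay(signal, delay_samples):
    """Delay `signal` by a non-negative, possibly fractional, sample count.

    Linear interpolation between the two neighbouring integer delays:
    y[n] = (1 - f) * x[n - D] + f * x[n - D - 1], with D + f = delay_samples.
    """
    whole = int(delay_samples)
    frac = delay_samples - whole
    out = []
    for n in range(len(signal)):
        a = signal[n - whole] if n - whole >= 0 else 0.0
        b = signal[n - whole - 1] if n - whole - 1 >= 0 else 0.0
        out.append((1.0 - frac) * a + frac * b)
    return out
```

For example, `sample_period_us(44100)` gives roughly 22.7, matching the figure above, and a delay specified in seconds can be converted with `delay_seconds * sample_rate_hz` before being passed to `fractional_delay`.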

Something you could do in the meantime is create a "true stereo" (4 channel) impulse of an existing implementation and apply it with the zita_convolver or fir_p effect:

remix 0 0 1 1 zita_convolver ambio.wav remix 0,2 1,3
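
The routing in that command can be spelled out in a few lines. This is a sketch of my reading of the effect chain, not code from dsp: the helper names are invented, the convolution is the naive direct form, and all four impulse responses are assumed to have the same length.

```python
def convolve(x, h):
    """Naive direct-form convolution (FFT-based convolvers are much faster)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def true_stereo(left, right, ll, lr, rl, rr):
    """Apply a four-channel ("true stereo") impulse response to a stereo signal."""
    # "remix 0 0 1 1": duplicate the inputs into four channels [L, L, R, R].
    # Convolver: filter those channels with the IR channels [LL, LR, RL, RR].
    c0, c1 = convolve(left, ll), convolve(left, lr)
    c2, c3 = convolve(right, rl), convolve(right, rr)
    # "remix 0,2 1,3": mix channels 0+2 into left out, 1+3 into right out.
    out_l = [a + b for a, b in zip(c0, c2)]
    out_r = [a + b for a, b in zip(c1, c3)]
    return out_l, out_r
```

The cross terms (`lr` and `rl`) are what let a signal on one input appear on the opposite output.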

aqxa1 commented 4 years ago

Hi, thanks for the response.

> I assume that subsample delay is desirable; otherwise the delay could only be adjusted in 1 sample increments (~22.7µs at 44.1kHz).

Yeah, those steps are probably too large to account for all possible configurations; 5-10 µs or less would be ideal.

> create a "true stereo" (4 channel) impulse of an existing implementation

From what I understand, using convolution wouldn't work with this algorithm. There is a discussion on Hydrogen Audio from someone that tried to do this. Convolution can be used for applying "concert hall ambience" through some additional surround speakers, but that's not part of RACE.

bmc0 commented 4 years ago

I suspect that he didn't do it correctly. Based on the block diagram that you posted, convolution will work just fine provided that 1) The impulse is long enough, and 2) The impulse is four channels, not just two. This is what allows the required "ping-pong" between channels. A two channel impulse would only work correctly for monophonic source material.

The correct procedure for creating an FIR filter for this purpose is to send an impulse through the left input only and record both left and right outputs, then send an impulse through the right input and record both left and right outputs again. These two stereo outputs should then be merged to produce a four channel filter with the following channel mapping: LL, LR, RL, RR. (LL is left input / left output; LR is left input / right output; RL is right input / left output; RR is right input / right output).
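
The measurement procedure above can be sketched as follows. `process` stands for whatever stereo effect is being measured, and `toy_processor` is a hypothetical stand-in so the example runs on its own. The approach only captures the effect faithfully if it is linear and time-invariant, and if `length` is long enough for its response to die away.

```python
def capture_true_stereo_ir(process, length):
    """Return (LL, LR, RL, RR) for a stereo effect `process(left, right)`."""
    impulse = [1.0] + [0.0] * (length - 1)
    silence = [0.0] * length
    ll, lr = process(impulse, silence)  # impulse through the left input only
    rl, rr = process(silence, impulse)  # impulse through the right input only
    return ll, lr, rl, rr


def toy_processor(left, right):
    """Hypothetical effect: inverted half-level crossfeed, one sample late."""
    out_l = [l - 0.5 * (right[n - 1] if n else 0.0) for n, l in enumerate(left)]
    out_r = [r - 0.5 * (left[n - 1] if n else 0.0) for n, r in enumerate(right)]
    return out_l, out_r


ll, lr, rl, rr = capture_true_stereo_ir(toy_processor, 3)
```

The four lists would then be written out as the channels of a single WAV file in the LL, LR, RL, RR order.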

aqxa1 commented 4 years ago

Okay, I'll try that out and see how it works.

aqxa1 commented 2 years ago

Hey, I finally got around to recording a true stereo IR, and indeed it does work, thanks! For others that want to try it, I roughly followed the HeSuVi IR recording guide except I used a pure impulse tone, rather than a 7.1 channel test file, and recorded L -> L/R and R -> L/R of ambio.one VST as separate two channel recordings, which are then exported to a 4 channel IR wav (after matching the two stereo IRs exactly at a sample level).

I'm currently using a 90 ms IR, which appears to be long enough to correctly reproduce the algorithm; my initial test of a 1 second IR seems to be too CPU intensive when run on my phone with JamesDSP, but there may be a sweet spot in between those two, if there is an actual need for a longer IR.
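
For a sense of scale, the arithmetic below compares the two IR lengths. The 44.1 kHz sample rate is an assumption (the rate of the recorded IR isn't stated), and the multiply-accumulate count is for naive direct-form convolution only; FFT-based partitioned convolvers such as zita-convolver scale much more gently with IR length, so treat these numbers as an upper bound rather than a prediction.

```python
def ir_taps(duration_s, sample_rate_hz):
    """Number of samples (taps) per channel in an IR of the given duration."""
    return round(duration_s * sample_rate_hz)


def direct_macs_per_second(duration_s, sample_rate_hz, paths=4):
    """Multiply-accumulates per second for direct-form convolution.

    `paths=4` corresponds to the LL, LR, RL and RR filters of a true stereo IR.
    """
    return ir_taps(duration_s, sample_rate_hz) * paths * sample_rate_hz


short_ir = direct_macs_per_second(0.09, 44100)  # 90 ms IR
long_ir = direct_macs_per_second(1.0, 44100)    # 1 s IR
print(ir_taps(0.09, 44100), short_ir, long_ir, long_ir / short_ir)
```

Under these assumptions the 1 second IR needs roughly 11 times the work of the 90 ms one.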

bmc0 commented 2 years ago

Thanks for the update. Always nice to hear that I was right :P

I may still try to implement this algorithm at some point since subsample delay could be useful for other things too. No promises on when that might happen though.

aqxa1 commented 2 years ago

No problem. yabridge + Windows VST(s) or using convolution works well as a workaround in any case. One use case could be on slower and non-x86 systems where either of the above might not be viable, but it's not an immediate need for me.