google / spatial-media

Specifications and tools for 360º video and spatial audio.

Spatial Audio RFC lacks a canonical decoding algorithm #139

Open modest opened 8 years ago

modest commented 8 years ago

The Spatial Audio RFC proposes a very useful standard for encoding. But the spec lacks any recommendations about decoding Spatial Audio in a consistent way (i.e. to discrete PCM channels). Without a proposed recommendation here, there is no "correct" way to play Spatial Audio on a device or convert it to channel-mapped audio. This seems like a significant gap for an RFC.

I understand that clients may experiment with their own "secret sauce" for decoding/downmixing with proprietary psychoacoustics. This doesn't replace the need for the spec to have an opinion on the canonical decoding & downmixing algorithm, though.

Decoding to mono seems unambiguous: simply amplify the W component and drop the remaining X, Y, and Z components.
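To make the mono case concrete, here is a minimal sketch of that idea. It assumes each frame is a first-order ambisonic sample in ACN channel order (W, Y, Z, X), the ordering the RFC uses; the function name `decode_mono` and the `gain` parameter are hypothetical, and the spec does not define what the gain should be.

```python
from typing import Iterable, List, Tuple

# One first-order ambisonic frame, assumed to be in ACN order: (W, Y, Z, X).
FoaFrame = Tuple[float, float, float, float]


def decode_mono(frames: Iterable[FoaFrame], gain: float = 1.0) -> List[float]:
    """Keep only the omnidirectional W component, scaled by `gain`.

    The directional components (Y, Z, X) are dropped. The gain value is an
    assumption; the spec does not prescribe one.
    """
    return [gain * frame[0] for frame in frames]


if __name__ == "__main__":
    frames = [(0.5, 0.1, 0.2, 0.3), (-0.25, 0.0, 0.0, 0.0)]
    print(decode_mono(frames, gain=2.0))
```
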

The other interesting cases for decoding are:

If nothing else, the first on this list (a canonical stereo decoding/downmixing algorithm without panning) would provide tremendous value. For example, it would enable non-spatial-audio clients to decode the same Spatial Audio track without needing to provide a second stereo audio track.
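As an illustration of what such a fixed (non-panned) stereo downmix could look like, here is a sketch of one common approach: a pair of virtual cardioid microphones facing left and right. This is not from the spec and is not proposed as the canonical algorithm. It assumes ACN channel order (W, Y, Z, X) with SN3D normalization and the Y axis pointing left; the name `decode_stereo` is hypothetical.

```python
from typing import Iterable, List, Tuple

# One first-order ambisonic frame, assumed ACN order and SN3D: (W, Y, Z, X).
FoaFrame = Tuple[float, float, float, float]


def decode_stereo(frames: Iterable[FoaFrame]) -> List[Tuple[float, float]]:
    """Downmix first-order ambisonics to stereo with two virtual cardioids.

    Assumption: Y points to the listener's left, so a cardioid facing left
    is 0.5 * (W + Y) and one facing right is 0.5 * (W - Y). The Z (height)
    and X (front/back) components are ignored in this fixed orientation.
    """
    out = []
    for w, y, _z, _x in frames:
        left = 0.5 * (w + y)
        right = 0.5 * (w - y)
        out.append((left, right))
    return out


if __name__ == "__main__":
    hard_left = (1.0, 1.0, 0.0, 0.0)  # source at 90 degrees to the left
    front = (1.0, 0.0, 0.0, 1.0)      # source straight ahead
    print(decode_stereo([hard_left, front]))
```

Under these assumptions a source hard left lands entirely in the left channel, and a source straight ahead is split equally between both. Other choices (different virtual microphone patterns or angles, or HRTF-based binaural rendering) would give different results, which is exactly why a recommendation in the spec would help.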

modest commented 7 years ago

It looks like the open source web implementation contains some hard-coded constants and logic for decoding / mixing. Are these canonical? If so, can these be documented in the spec?

https://github.com/GoogleChrome/omnitone/blob/master/src/foa-rotator.js
https://github.com/GoogleChrome/omnitone/blob/master/src/foa-phase-matched-filter.js
https://github.com/GoogleChrome/omnitone/blob/master/src/foa-speaker-data.js