immersive-web / webxr

Repository for the WebXR Device API Specification.
https://immersive-web.github.io/webxr/

issues/feature request - audio #291

Closed ghost closed 6 years ago

ghost commented 6 years ago

From my point of view, it would be interesting for you to:


Furthermore:

jsantell commented 6 years ago

Songbird and Omnitone handle this with web audio

cwilso commented 6 years ago

Yep. And doing 3D audio is fairly straightforward in web audio with HRTF-mode PannerNodes.
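
A minimal sketch of that approach (the oscillator source and the position values are illustrative only):

```ts
// Minimal Web Audio HRTF panning sketch (illustrative values).
const ctx = new AudioContext();
const source = ctx.createOscillator();            // any source node would do

const panner = new PannerNode(ctx, {
  panningModel: 'HRTF',                           // head-related transfer function panning
  distanceModel: 'inverse',
  positionX: 1,                                   // 1 m to the listener's right
  positionY: 0,
  positionZ: -2,                                  // 2 m in front (default listener faces -Z)
});

source.connect(panner).connect(ctx.destination);
source.start();

// Moving the source over time goes through the panner's AudioParams.
panner.positionX.linearRampToValueAtTime(-1, ctx.currentTime + 2);
```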

ghost commented 6 years ago

Two things should be considered first:


by trying to replicate wave field synthesis in stereo and implementing it through WebCL, for instance:


From my standpoint, an informed reading of what these environments really are would be pertinent, so here is some introductory web bibliography:


SuperCollider

SuperCollider is an environment and programming language originally released in 1996 by James McCartney for real-time audio synthesis and algorithmic composition.

Since then it has been evolving into a system used and further developed by both scientists and artists working with sound. It is an efficient and expressive dynamic programming language providing a framework for acoustic research, algorithmic music, interactive programming and live coding.

Released under the terms of the GPLv2 in 2002, SuperCollider is free and open-source software.


Csound

Csound is a computer programming language for sound, also known as a sound compiler or an audio programming language, or more precisely, an audio DSL. It is called Csound because it is written in C, as opposed to some of its predecessors.

It is free software, available under the LGPL.

Csound was originally written at MIT by Barry Vercoe in 1985, based on his earlier system called Music 11, which in turn followed the MUSIC-N model initiated by Max Mathews at Bell Labs. Its development continued throughout the 1990s and 2000s, led by John ffitch at the University of Bath. The first documented version 5 release is version 5.01 on March 18, 2006. Many developers have contributed to it, most notably Istvan Varga, Gabriel Maldonado, Robin Whittle, Richard Karpen, Michael Gogins, Matt Ingalls, Steven Yi, Richard Boulanger, and Victor Lazzarini.

Developed over many years, it currently has nearly 1700 unit generators. One of its greatest strengths is that it is completely modular and extensible by the user. Csound is closely related to the underlying language for the Structured Audio extensions to MPEG-4, SAOL.


Pure Data

Pure Data (Pd) is a visual programming language developed by Miller Puckette in the 1990s for creating interactive computer music and multimedia works. While Puckette is the main author of the program, Pd is an open source project with a large developer base working on new extensions. It is released under a license similar to the BSD license. It runs on GNU/Linux, Mac OS X, iOS, Android and Windows. Ports exist for FreeBSD and IRIX.

Pd is very similar in scope and design to Puckette's original Max program, developed while he was at IRCAM, and is to some degree interoperable with Max/MSP, the commercial successor to the Max language. They may be collectively discussed as members of the Patcher family of languages.

With the addition of the Graphics Environment for Multimedia (GEM) external, and externals designed to work with it (such as Pure Data Packet/PiDiP for Linux and Mac OS X, Framestein for Windows, and GridFlow for n-dimensional matrix processing on Linux, Mac OS X, and Windows), it is possible to create and manipulate video, OpenGL graphics, images, etc., in real time, with extensive possibilities for interactivity with audio, external sensors, etc.

Pd is natively designed to enable live collaboration across networks or the Internet, allowing musicians connected via LAN or even in disparate parts of the globe to create music together in real time. Pd uses FUDI as a networking protocol.


PWGL

PWGL is a program that gives the user a graphical programming interface for creating music. The interface has been designed for musicians, with many objects that allow one to see, hear, and manipulate musical materials. PWGL's interface is similar to other applications, including OpenMusic, Max/MSP, and Pd. It is most similar to OpenMusic, because both share lineage as successors to the 1980s-90s application Patchwork (the PW in PWGL refers to Patchwork).

For those familiar with Max/MSP or Pd, the biggest difference to know about PWGL is that generally all user patches are organized in the form of a tree, with many computations that happen in the "leaves" and "branches" that feed into one another and end at the bottom of the patch with one object that is the "root." The user activates the patch by evaluating this root object, which then calls all the other objects successively up the tree to the leaves, in a recursive fashion. The outermost leaves then evaluate and feed their results back down. This happens through all levels of the patch back to the root object. When the root object evaluates, it sends the final answer to the user.

Users may evaluate the patch at locations other than the "root" object. The object called for evaluation will call up its own branches and leaves and output its result to the user. Other branches of the patch will not evaluate, nor will levels of the patch below this node. To evaluate an object, select it and hit 'v' (for "eValuate"!). Instructions for how to select objects are below.
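
Not PWGL code, but a small TypeScript sketch of that demand-driven evaluation model: asking any box for its value first pulls values from everything wired into it, and branches that are not upstream of that box are never computed.

```ts
// Toy model of patch evaluation: each box computes its value from the boxes wired into it.
interface Box {
  inputs: Box[];
  compute(args: number[]): number;
}

// Evaluating a box first evaluates everything upstream of it (its "branches" and
// "leaves"), then combines the results; the rest of the patch is never touched.
function evaluate(box: Box): number {
  const upstream = box.inputs.map(evaluate);
  return box.compute(upstream);
}

// Leaves have no inputs; the "root" is just whichever box the user chooses to evaluate.
const leaf = (value: number): Box => ({ inputs: [], compute: () => value });
const add = (...inputs: Box[]): Box => ({
  inputs,
  compute: (args) => args.reduce((a, b) => a + b, 0),
});

console.log(evaluate(add(leaf(60), leaf(7))));    // 67, e.g. a transposed MIDI pitch
```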


From my perspective, including any of these environments in a WebVR framework would be highly beneficial because:


Bearing this in mind:


So, my question is:


Some content on WFS

WFS is based on the Huygens–Fresnel principle, which states that any wave front can be regarded as a superposition of elementary spherical waves. Therefore, any wave front can be synthesized from such elementary waves. In practice, a computer controls a large array of individual loudspeakers and actuates each one at exactly the time when the desired virtual wave front would pass through it.

The basic procedure was developed in 1988 by Professor A.J. Berkhout at the Delft University of Technology. Its mathematical basis is the Kirchhoff–Helmholtz integral. It states that the sound pressure is completely determined within a volume free of sources if sound pressure and velocity are determined at all points on its surface.

Therefore, any sound field can be reconstructed if sound pressure and acoustic velocity are restored at all points on the surface of its volume. This approach is the underlying principle of holophony.
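
For reference, the Kirchhoff–Helmholtz integral behind that statement can be written as follows (frequency domain, $e^{j\omega t}$ convention; $G$ is the free-field Green's function, $n$ the surface normal, and sign/normal conventions vary between texts):

```latex
P(\mathbf{x},\omega) = \oint_{\partial V}
  \left[ G(\mathbf{x}\mid\mathbf{x}_0,\omega)\,\frac{\partial P(\mathbf{x}_0,\omega)}{\partial n}
       - P(\mathbf{x}_0,\omega)\,\frac{\partial G(\mathbf{x}\mid\mathbf{x}_0,\omega)}{\partial n}
  \right] \mathrm{d}S,
\qquad
G(\mathbf{x}\mid\mathbf{x}_0,\omega) = \frac{e^{-j\frac{\omega}{c}\,\lvert\mathbf{x}-\mathbf{x}_0\rvert}}{4\pi\,\lvert\mathbf{x}-\mathbf{x}_0\rvert}
```

The velocity enters through Euler's equation, $\partial P/\partial n = -j\omega\rho_0 v_n$, which is why prescribing pressure and normal velocity on the surface suffices: the velocity term drives a monopole layer of secondary sources and the pressure term a dipole layer.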

For reproduction, the entire surface of the volume would have to be covered with closely spaced monopole and dipole loudspeakers, each individually driven with its own signal. Moreover, the listening area would have to be anechoic, in order to comply with the source-free volume assumption. In practice, this is hardly feasible.

According to Rayleigh II, the sound pressure is determined at each point of a half-space if the sound pressure at each point of its dividing plane is known. Because our acoustic perception is most exact in the horizontal plane, practical approaches generally reduce the problem to a horizontal loudspeaker line, circle, or rectangle around the listener.


Well, I have nothing against Google; they make really outstanding stuff:


From my perspective, working with PWGL, Csound, libpd, and SuperCollider within the context of VR may, in itself, be beneficial, as:


And better:


In fact, SuperCollider:


I think it could, in itself, be a good option to at least consider:

cwilso commented 6 years ago

I'm not sure what you're looking for, but it sounds like you'd be more interested in the Web Audio Working Group (https://www.w3.org/2011/audio/). I know most of us there are familiar with at least CSound and SuperCollider. Baking them into browsers seems very unlikely, but with forthcoming work on AudioWorklets, at least it should be possible to compile them into the web (possibly in WebAssembly, even), and connect them directly to audio outputs.
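
For concreteness, a rough sketch of that AudioWorklet route (the file name, processor name, and the idea of delegating to a WASM-compiled engine are illustrative assumptions, not an existing port):

```ts
// --- engine-processor.js: runs in the AudioWorkletGlobalScope ---
class EngineProcessor extends AudioWorkletProcessor {
  process(_inputs: Float32Array[][], outputs: Float32Array[][]): boolean {
    // A real port would call into a WASM-compiled Csound/SuperCollider engine here;
    // this placeholder just writes silence into the first output.
    for (const channel of outputs[0]) channel.fill(0);
    return true;                                  // keep the processor alive
  }
}
registerProcessor('engine-processor', EngineProcessor);

// --- main thread ---
const ctx = new AudioContext();
await ctx.audioWorklet.addModule('engine-processor.js');
const engine = new AudioWorkletNode(ctx, 'engine-processor');
engine.connect(ctx.destination);
```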

My understanding of wave field synthesis is that it wouldn't work in stereo - it needs many transducers to reproduce the wave fields, or the spatial aliasing makes it ineffective. If you just want to position synthesized or sampled sound sources in a 3D sound field, the HRTF panner in Web Audio is quite effective, and designed (obviously) for stereo headphones. If you want Ambisonics panning of streamed sources, Hongchan's work on Omnitone has proven that's quite doable on top of Web Audio too.
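
And a rough sketch of the Omnitone route for a first-order ambisonic (4-channel B-format) stream, following Omnitone's documented renderer API (details may differ between versions; the audio element and the identity rotation matrix are placeholders):

```ts
import Omnitone from 'omnitone';                  // assumes the npm package / global build

const ctx = new AudioContext();
const renderer = Omnitone.createFOARenderer(ctx); // first-order-ambisonics renderer
await renderer.initialize();

// A 4-channel (B-format) <audio> element as the streamed source.
const element = document.querySelector('audio')!;
const source = ctx.createMediaElementSource(element);

source.connect(renderer.input);
renderer.output.connect(ctx.destination);

// Rotate the sound field to follow head orientation (column-major 4x4 matrix).
renderer.setRotationMatrix4(new Float32Array([1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]));
```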

I would strongly advise against baking in a given sound library (other than Web Audio, which is already built into browser implementations), and focus on the interactions - like, do we need a sound panner node that is hooked up at audio rate to the camera direction, for better tracking in a headset?
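
For reference, the coupling that is possible today has to run once per rendered frame from script; a rough sketch (the XRFrame/getViewerPose names follow the later WebXR spec, and refSpace is assumed to have been requested at session start):

```ts
// Sketch: update the Web Audio listener from the headset pose once per frame.
// This runs at animation-frame rate, not audio rate -- which is the gap raised above.
declare const refSpace: XRReferenceSpace;         // from session.requestReferenceSpace(...)

const ctx = new AudioContext();
const listener = ctx.listener;

function onXRFrame(_time: number, frame: XRFrame): void {
  frame.session.requestAnimationFrame(onXRFrame);

  const pose = frame.getViewerPose(refSpace);
  if (!pose) return;

  const p = pose.transform.position;
  const t = ctx.currentTime;
  listener.positionX.setValueAtTime(p.x, t);
  listener.positionY.setValueAtTime(p.y, t);
  listener.positionZ.setValueAtTime(p.z, t);

  // listener.forwardX/Y/Z and listener.upX/Y/Z would be set similarly,
  // derived from pose.transform.orientation (a quaternion).
}
```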

ghost commented 6 years ago

Well, from my point of view:


Regarding WFS:


Regarding PWGL:

toji commented 6 years ago

I certainly don't want to discourage anyone from pursuing novel approaches to various VR-related problems. However, this repo is for development of the WebVR spec itself, while this issue appears to deal primarily with audio libraries? It's not clear to me if changes to the WebVR spec (or any other spec) are being proposed here, and if the desired effects can be achieved without spec changes it's probably best to either create a new repo to implement a library in or move the discussion to the repo of an existing library with similar goals.

Closing this issue to ensure our issues list stays a bit more focused, but if I've missed something and there is indeed a spec change being proposed please let me know!