Closed. gogins closed this issue 1 year ago.
In csound-wasm I need to:
Some preliminary thoughts:
Think about using some of these either to visualize music or to generate music:
- Flocking birds
- Isolines
- Nautilus
- Fractal flame
Inigo Quilez' license is too restrictive for me, but I can probably either understand what he is doing, or find other similar code that is not so restricted.
The iChannel variables in ShaderToy are GLSL samplers.
Apparently the only way to get data computed in a shader back into the browser's JavaScript context is to create a framebuffer or a shader storage buffer, the shader writes to the buffer, JavaScript reads from the buffer. The predefined channels in ShaderToy could be used for this. Here's some documentation. Here's some more context-free documentation.
I suspect I almost have this working, but Shadertoy seems to use the default formats from the AnalyserNode and to set that data, without change of format, into the texture for the sampler. I of course went for maximum precision and may have to change that, at least until I am sure it is working. See this. Shadertoy uses WebGL 2.0.
The AnalyserNode doesn't accept decibel ranges that are not negative.
https://shadertoyunofficial.wordpress.com/2016/07/20/special-shadertoy-features/ may be helpful but is quite incomplete.
A provided track, or a user-chosen track from SoundCloud, can be used as a texture. The texture size is bufferSize x 2 (e.g., 512 x 2). The row at y = .75 holds the music samples: the buffer is refreshed over time, but the precise behavior seems very system-dependent, and synchronization with image frames is not guaranteed, so drawing the full soundwave, or using a naive sound encoding of an image, won't be faithful (or possibly only on your own system). The row at y = .25 holds the FFT of the buffer, with x / bufferSize = f / (iSampleRate / 4.), example here. iSampleRate gives the sampling rate, but it seems incorrectly initialized on many systems: if precision is important, try 44100 or 48000 manually.
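The FFT row mapping above can be sanity-checked with a small helper. This is just a sketch of the stated relation x / bufferSize = f / (iSampleRate / 4); the default bufferSize of 512 and sample rate of 48000 are assumptions, not values read from the GPU.

```javascript
// Sketch: map a frequency in Hz to the x texel of Shadertoy's FFT row
// (y = .25), using x / bufferSize = f / (iSampleRate / 4).
// bufferSize and sampleRate defaults are assumed, not queried.
function frequencyToTexelX(f, bufferSize = 512, sampleRate = 48000) {
  return Math.round((f * bufferSize) / (sampleRate / 4));
}

function texelXToFrequency(x, bufferSize = 512, sampleRate = 48000) {
  return (x / bufferSize) * (sampleRate / 4);
}
```

For example, at 48000 Hz and bufferSize 512, a 6000 Hz partial lands at texel x = 256, halfway across the FFT row.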
To summarize what I have perhaps mistakenly gleaned:
I'm still not sure I have this right or if it is really working. But I must proceed to more interesting music.
VENDOR: WebKit
RENDERER: WebKit WebGL
GL_VERSION: WebGL 2.0 (OpenGL ES 3.0 Chromium)
SHADING_LANGUAGE_VERSION: WebGL GLSL ES 3.00 (OpenGL ES GLSL ES 3.0 Chromium)
This is quite frustrating. Strategy:
I just learned something. In Shadertoy, if one clicks on the "Compiled in x seconds" label, a window pops up containing "Translated Shader Code." It specifies a GLSL version. The code is somewhat different from what one views in the online editor. In particular, the samplers are structs. One does not access a struct as a whole, but through each field, e.g. mystruct.field1, mystruct.field2.
I failed to declare a version of GLSL matching the code (and browser). This means that features not implemented in version 1 should fail -- I think. I added the declaration and perhaps now things will go better. I think some of the fog is beginning to dissipate. This is why the Shadertoy code editor can display translated shader code -- the code in the editor has to be fiddled with and versioned before it can be run.
Some slight progress. I have proved that `#version 300 es` must be the first line of all shader code. This makes the `texelFetch` and `texture2D` errors go away. In addition, attributes are no longer supported and are replaced by `in` parameters. But the audio is still not coming in.
Perhaps there are other version incompatibilities regarding textures, samplers, or bindings.
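The version fixes above can be automated with a small source-patching function. This is a minimal, incomplete sketch (it does not handle `varying`, precision qualifiers, or `gl_FragColor`); the function name is mine, and the two renames shown are the standard GLSL ES 3.00 changes mentioned above.

```javascript
// Sketch: patch legacy GLSL source so it compiles under GLSL ES 3.00.
// "#version 300 es" must be the very first line of the shader.
function patchToGlsl300(source) {
  let s = source;
  // Renames required by GLSL ES 3.00:
  s = s.replace(/\battribute\b/g, "in");       // vertex inputs
  s = s.replace(/\btexture2D\b/g, "texture");  // unified texture lookup
  if (!/^\s*#version/.test(s)) {
    s = "#version 300 es\n" + s;
  }
  return s;
}
```

This mirrors what Shadertoy's "Translated Shader Code" window implies: the editor source is rewritten and versioned before compilation.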
https://madethisthing.com/iq/piLibs-JS is iq's library that wraps WebGL2. It is used in Shadertoy.com, and I think this code is a reliable guide on how to do things right. So far, everything I am doing seems right. But so far I see no code for loading audio into textures or samplers.
Although the Shadertoy.com code is not visible, this is a really great resource from somebody who seems to know a lot and do things well.
This offers some historical context:
Aside from his contributions to films and games, Pol [Jeremias] co-founded Beautypi LLC with Iñigo Quilez, a company dedicated to bringing computer graphics everywhere. In 2013, Beautypi released Shadertoy.com, a global social network with tens of thousands of contributors that enables graphics enthusiasts to create and share computer graphics knowledge. Today, Shadertoy is one of the biggest repositories of computer graphics experiments, ideas, and projects. In 2020, Beautypi released Memix, a software to improve video conferencing using computer graphics technology. Memix was acquired that same year by Mmhmm Inc.
It's clear Quilez is secretive about Shadertoy.com source code even though the Web site hosts open source shader toys. He probably works on proprietary libraries. At this point my options are narrowing.
`smoothstep` is one. Not so. Maybe more complete demos: https://github.com/WebGLSamples/WebGL2Samples. This is the repository that ended up solving my problem by providing complete JavaScript code for properly setting up sampling for a texture.
It is necessary to use `gl.getExtension("EXT_color_buffer_float");` in order to verify that audio has been correctly imported into a floating-point texture. This is a known issue.
This is fixed. I had to set sampler parameters rather than texture parameters, and to use nearest filtering, not linear filtering. Things are starting to happen, although mapping the texture to the display is still not correct.
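The fix above can be sketched as follows, assuming a WebGL2 context. The function name is mine; the point is that the filtering state lives on a sampler object (set with `gl.samplerParameteri`), not on the texture.

```javascript
// Sketch: sampler parameters (not texture parameters) with NEAREST
// filtering, which is what made the float audio texture readable here.
function makeAudioSampler(gl) {
  const sampler = gl.createSampler();
  gl.samplerParameteri(sampler, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.samplerParameteri(sampler, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
  gl.samplerParameteri(sampler, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.samplerParameteri(sampler, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  return sampler;
}
```

At draw time the sampler is attached to the texture unit with `gl.bindSampler(unit, sampler)`, which then overrides the texture's own parameters.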
- Implementation of a neural network.
- Inverse Lyapunov Journey; moving images can be sampled as they pass through a line, plane, or volume.
- Inside the Mandelbulb, rather beautiful.
- Fractal mosaic.
- Fractal mosaic 16.
- Bacterium (should be plural).
- Orbit trap periods.
- Mouse Julia.
- Fast deep Mandelbrot zoom.
- Mandelbrot/Julia; could move a sampling line through the picked Julia sets.
- Interactive Mandelbrot zoom, straightforward.
- Sierpinski pyramid.
Moving images/objects can be sampled as they pass through a line, plane, or volume in view coordinates, or relative to the object itself.
In a shader, there are no vertices as such, it's all about the texels. These are simply color values in the viewport, vec4(r,g,b,a). If the texture is gl.RGBA32F, which mine will be, the values will be very precise.
So to be clear, we are not sampling from objects, but from the state of texels in a region of the viewport. That region could be of any form. Alpha will normally be 1, so we are really dealing with vec3(r,g,b).
When a texel crosses the sampling point, if it exceeds a threshold of some sort, it triggers a Csound event. When the value in that texel falls below that threshold, the event is turned off.
The shape of the sampling region would be most intuitively understood if it were a vertical or horizontal line through the center, or along one of the borders, of the viewport. Then there are possibilities for sampling: one event could be one texel, or several texels.
We are not mapping time or duration because we are a dynamical system ticking through time on every view rendering. So a single texel could represent (instrument, key, velocity). Obviously, there is way too much data in the image, the image bandwidth far exceeds the Csound event bandwidth.
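The (instrument, key, velocity) reading of a texel can be sketched as a pure function. The ranges here (instruments 1..8, MIDI keys and velocities 0..127) are illustrative assumptions, not something fixed by the piece.

```javascript
// Sketch: interpret one normalized RGB texel as a Csound event triple.
// Ranges are assumed: r selects among 8 instruments, g and b map to
// MIDI-style key and velocity.
function texelToEvent([r, g, b], instrumentCount = 8) {
  return {
    instrument: 1 + Math.floor(r * (instrumentCount - 1e-6)),
    key: Math.round(g * 127),
    velocity: Math.round(b * 127),
  };
}
```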
If the events are grains, the event bandwidth can be much higher, then we are in Xenakis land.
Alternatively, the sampling line could be mapped to N instruments, then width/N texels are available for mapping to the events. For Csound instruments this is excessive, but this data could go into function tables for a variety of uses.
For Csound instruments, the line can be sampled at a lower level of detail, i.e. filtered, to reduce the bandwidth.
Either the GLSL code can directly return sampled texels that JavaScript code will then map to Csound events, or the values of the Csound events can be directly computed in the shader and returned in a significantly smaller array, this seems likely to be considerably more efficient and to give more of the computer power to the Csound performance and to the shader itself.
Another way of reducing the bandwidth is to produce only a few texels above the sampling threshold, e.g. only a few bright sprites or something.
Of course for a fixed image, the sampling line or plane can move through the image.
We need either an image that moves continuously through the viewport, or an image that is interactively controlled by the user and sampled by passing a line or plane through the image. On my desktop screen:
canvas.height: 1771
canvas.width: 3544
This is pretty high resolution and very high bandwidth.
If we map width to the piano range of 88 semitones we have 40 texels per semitone. I think it will be simplest, and fast enough, to simply average each 40 texels. Then we are comparing current sample to prior sample and threshold.
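The averaging described above can be sketched as a pure function over one row of RGBA pixels, as `gl.readPixels` would return it. The brightness formula (plain RGB mean, alpha ignored) is an assumption; a perceptual luma weighting would also work.

```javascript
// Sketch: reduce one canvas row of RGBA pixels to 88 per-key loudness
// values by averaging each key's share of texels. `row` is a
// Uint8Array of length width * 4.
function downsampleRowToKeys(row, width, keys = 88) {
  const loudness = new Float32Array(keys);
  for (let k = 0; k < keys; k++) {
    const x0 = Math.floor((k * width) / keys);
    const x1 = Math.floor(((k + 1) * width) / keys);
    let sum = 0;
    for (let x = x0; x < x1; x++) {
      // Mean brightness of the texel in [0, 1], ignoring alpha.
      sum += (row[x * 4] + row[x * 4 + 1] + row[x * 4 + 2]) / (3 * 255);
    }
    loudness[k] = sum / Math.max(1, x1 - x0);
  }
  return loudness;
}
```

At a canvas width of 3544 this averages about 40 texels per key; the current frame's 88 values can then be compared against the prior frame and the threshold.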
Considering the limitations of GLSL, and the overhead of copying data into and out of the GPU with SSBOs or textures or whatever, it is now seeming optimal to just read the pixels in the rendered image frame in the canvas along the sampling line. This is similar to what I did in "Unperformed Experiments." Probably an experiment is required to compare overheads, but I can start with just reading the image to get a piece going. However, just doing this in the obvious way hits performance with pipeline stalls, it's necessary to use a pixel buffer object (PBO).
Did a lot of Googling and this is an inherently messy and tradeoff-ridden topic. Right now, it looks like pixel buffer objects are the best solution. The GPU will write to the PBO and then the CPU can test to see if the PBO is ready. Otherwise the pipeline will still stall. There will need to be at least one fence (one for the PBO, and maybe one for the PBO to Csound processing).
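The PBO-plus-fence pattern described above might be sketched as follows, assuming a WebGL2 context; the function names are mine. The read is started right after rendering, and the fence is polled on later frames instead of blocking.

```javascript
// Sketch: non-stalling readback of one canvas row via a pixel buffer
// object and a fence, assuming a WebGL2 context `gl`.
function startRowReadback(gl, y, width) {
  const pbo = gl.createBuffer();
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, pbo);
  gl.bufferData(gl.PIXEL_PACK_BUFFER, width * 4, gl.STREAM_READ);
  // With a PIXEL_PACK_BUFFER bound, readPixels writes into the PBO
  // asynchronously instead of stalling the pipeline.
  gl.readPixels(0, y, width, 1, gl.RGBA, gl.UNSIGNED_BYTE, 0);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  const fence = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
  return { pbo, fence, width };
}

function pollRowReadback(gl, readback) {
  const status = gl.clientWaitSync(readback.fence, 0, 0);
  if (status !== gl.ALREADY_SIGNALED && status !== gl.CONDITION_SATISFIED) {
    return null; // GPU not done yet; try again next frame.
  }
  gl.deleteSync(readback.fence);
  const pixels = new Uint8Array(readback.width * 4);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, readback.pbo);
  gl.getBufferSubData(gl.PIXEL_PACK_BUFFER, 0, pixels);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  gl.deleteBuffer(readback.pbo);
  return pixels;
}
```

Calling `pollRowReadback` each frame until it returns non-null keeps the CPU from ever waiting on the GPU, at the cost of the sample arriving a frame or two late.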
(Groan...) Now there is WebGPU. Not yet ready for prime time, but might make sense anyway. Doesn't seem to be available on macOS in my versions of Chrome or Safari.
There are then two synchronization issues, within the very multithreaded fragment shader invocations, and between the GPU and the CPU.
For now it seems best to use `gl.readPixels` asynchronously to obtain one row of the current frame buffer.
The real issue for me is synchronization between the fragment shader invocations and reading the SSBO in the CPU. This summarizes the situation. Memory barriers do this.
Mapping issues:
My mapping algorithm is no good. Here's another try.
The problem is still the high bandwidth of the visual signal versus the low bandwidth of the Csound event signal. We want the relationship between the visual animation and the Csound audio to be reasonably apparent to the viewer.
I think the best approach is still to filter the bottom row, or middle row, of the canvas down to a resolution equal to the number of MIDI keys to be played. This could be done with simple downsampling, averaging before downsampling, or Gaussian downsampling.
After that, filter the loudest N events from the downsampled buffer. But this only makes sense if the canvas can be sampled at a high enough rate. I will determine that rate before I decide to implement this algorithm.
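Picking the loudest N events from the downsampled buffer is a simple filter; a sketch, with the threshold and voice count as assumed knobs:

```javascript
// Sketch: from the per-key loudness values, keep only the N loudest
// keys above a threshold. Threshold and maxVoices are tuning knobs.
function loudestEvents(loudness, threshold = 0.5, maxVoices = 8) {
  const events = [];
  loudness.forEach((value, key) => {
    if (value > threshold) events.push({ key, value });
  });
  events.sort((a, b) => b.value - a.value); // loudest first
  return events.slice(0, maxVoices);
}
```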
If the rate is not high enough, or if I implement it all just fine but it still doesn't seem right, I will drop sampling GLSL generated visuals and change to sampling WebGL objects (vertices), which could be done with Three.js or p5.js.
There may still be a problem with the async read of the canvas. Making it async may mean that the reads are piling up. I will log them also. And indeed:
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.708 parent frame: 997 current frame: 997
prototype_score_generator.html:1162 render_scene: time: 17.71109999999404 frame: 998
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.72209999999404 parent frame: 998 current frame: 998
prototype_score_generator.html:1162 render_scene: time: 17.73059999999404 frame: 999
prototype_score_generator.html:1162 render_scene: time: 17.74540000000596 frame: 1000
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.75469999998808 parent frame: 999 current frame: 1000
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.755900000005962 parent frame: 1000 current frame: 1000
prototype_score_generator.html:1162 render_scene: time: 17.7615 frame: 1001
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.77409999999404 parent frame: 1001 current frame: 1001
prototype_score_generator.html:1162 render_scene: time: 17.78140000000596 frame: 1002
prototype_score_generator.html:1162 render_scene: time: 17.7955 frame: 1003
But I'm not sure this is a real problem.
The idea is to blur the middle row of the canvas such that each of the blurred pixels is mapped to one MIDI key number, and each pixel that is bright enough is played as a Csound "note on" event. Events don't carry their own state, but are stored in queues by state: sampled_events, on_events, playing_events, and off_events.
Before rendering, all queues are cleared.
On each Nth rendering frame:
If the loudness threshold is high enough it doesn't seem necessary to set a maximum number of voices, although that should also be possible.
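The per-frame queue logic described above might be sketched as follows. The queue names follow the text; the edge-triggered on/off behavior is my assumed reading of "note on when bright enough, note off when it falls below threshold".

```javascript
// Sketch: per-frame event queue update. A key that becomes bright
// triggers a note on; one that stays bright keeps playing; one that
// goes dark is queued for a note off.
function updateQueues(sampled_events, playing_events, threshold = 0.5) {
  const on_events = [];
  const off_events = [];
  const still_playing = new Set();
  for (const { key, value } of sampled_events) {
    if (value > threshold) {
      if (playing_events.has(key)) {
        still_playing.add(key); // already sounding: no new event
      } else {
        on_events.push(key);    // rising edge: Csound "note on"
        still_playing.add(key);
      }
    }
  }
  for (const key of playing_events) {
    if (!still_playing.has(key)) off_events.push(key); // falling edge
  }
  return { on_events, off_events, playing_events: still_playing };
}
```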
This seems to be working more or less. The reverb is clattering so I will examine the level and release envelope situation.
The instruments need to be fixed to handle note on/note off pairs.
There is an issue in that the Emscripten toolchain does not permit WebAssembly code or JavaScript code to read the native filesystem. There are workarounds, but I do not like them. This prevents Csound from using `#include`. For now, I will just include all Csound orc code in the HTML file.
Things are better but ZakianFlute is still piling up instances.
Synchronizing score generation in loops/segments... the issue is that the system clock and Csound's score time may drift apart.
`csoundSetOutputChannelCallback`, as `Csound.SetOutputChannelCallback(callable)`, would do that. This is pretty much the same as the timer, except that Csound itself resets the timer. The C prototype for the callback is: `void (*channelCallback_t)(CSOUND *csound, const char *channelName, void *channelValuePtr, const void *channelType)`. In the present context probably this would do: `(channel_name, value) => {}`.
Using a JavaScript timer should be adequate for all cases I can foresee. The timer can be reset with different durations depending on when the generated segment will end. There could of course be any number of such "tracks" going on at the time, and they also can be nested.
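The drift issue above suggests computing each timeout from Csound's own score time rather than from accumulated `setTimeout` delays. A minimal sketch; how the current score time is obtained is left as an assumption:

```javascript
// Sketch: compute the next segment's timer duration from the current
// score time, so system-clock and score-time drift is absorbed at
// every reschedule rather than accumulating.
function nextSegmentDelayMs(scoreTimeSeconds, segmentEndSeconds) {
  return Math.max(0, (segmentEndSeconds - scoreTimeSeconds) * 1000);
}

// Usage (assumed API for reading score time):
//   setTimeout(generateNextSegment,
//              nextSegmentDelayMs(await csound.getScoreTime(), segmentEnd));
```

Each "track" of segments would carry its own `segmentEnd` and reschedule itself this way, which also permits nesting.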
There is a bug in the Csound `infoff` function: it truncates p1 to an int. I will probably have to write some test cases and/or try to fix the bug.
I think I can use this if it is public or I can make it public:
/**
 * Kills off one or more running instances of an instrument identified
 * by instr (number) or instrName (name). If instrName is NULL, the
 * instrument number is used.
 * Mode is a sum of the following values:
 * 0, 1, 2: kill all instances (0), oldest only (1), or newest (2).
 * 4: only turnoff notes with exactly matching (fractional) instr number.
 * 8: only turnoff notes with indefinite duration (p3 < 0 or MIDI).
 * allow_release, if non-zero, the killed instances are allowed to release.
 */
PUBLIC int csoundKillInstance(CSOUND *csound, MYFLT instr,
                              char *instrName, int mode, int allow_release);
Updated `playpen.py` for additional tests: a note off turns off all note ons with the same tag, and a note off turns off any note on with the same tag even if the pitches differ. Seems to be working just fine.

`kill_instance` calls `xturnoff` for the `turnoff` opcode, just `xturnoff2` for the `turnoff2` opcode, and `delete_selected_rt_events` for the `turnoff3` opcode. The `insert_event`, `insert_midi`, `infoff`, and `kill_instance` functions, and the `csoundKillInstance` API function, call `xturnoff` or `xturnoff_now`. There is a gap in my understanding between `xturnoff2` and `delete_selected_rt_events`; it is conceivable this is my problem, as the cloud-music pieces rely on `ReadScore`. As for the `turnoff2` and `turnoff3` opcodes, the latter just clears events from the real-time queue. Expose `csoundKillInstance` in the WASM API and use that. Is `maxalloc` stopping note offs from getting through? Not clear, but not relevant with `csoundKillInstance`.

To finish the prototype score generator:
Try:
There were problems on Android. At least for the tablet, I fixed the problems by using highp instead of medium and not using the pixel density. The tablet doesn't really have enough oomph and there are dropouts and glitches, but the visuals are just as good and everything is basically working.
For Cloud Music No. 2, use the existing visualizer and Csound orchestra. Replace the fixed Csound score with a JavaScript score generator. Instead of a Silence score, use JavaScript template strings to format `i` statements for the existing orchestra, which sounds quite good.
I am searching for JavaScript code for various chaotic dynamical systems that can be mapped to various chord spaces.
Use the `StrangeAttractor` class in CsoundAC to produce not just notes but points in PITV space. I don't even think this requires any new code; I can use existing methods of the class and get X, Y, Z, and W after each iteration. It was necessary to add one method, `StrangeAttractor::iterate_without_rendering`, before I could proceed. I also added convenience functions to get the state of the attractor in normalized form.
Added code to copy and paste controls state to and from the system clipboard.
Two bugs:
Get this thing off the ground:
`<script>` elements and commenting them. Actually, not doing this due to the mixture of languages (HTML, JavaScript, WebGL/GLSL, Csound). I simply put in a comment.