Closed. gogins closed this issue 1 year ago.
In csound-wasm I need to:
Some preliminary thoughts:
Think about using some of these either to visualize music or to generate music:
- Flocking birds
- Isolines
- Nautilus
- Fractal flame
Inigo Quilez' license is too restrictive for me, but I can probably either understand what he is doing, or find other similar code that is not so restricted.
The iChannel variables in ShaderToy are GLSL samplers.
Apparently the only way to get data computed in a shader back into the browser's JavaScript context is to create a framebuffer or a shader storage buffer, the shader writes to the buffer, JavaScript reads from the buffer. The predefined channels in ShaderToy could be used for this. Here's some documentation. Here's some more context-free documentation.
I suspect I almost have this working, but Shadertoy seems to use the default formats from the AnalyserNode and to set that data, without change of format, into the texture for the sampler. I of course went for maximum precision and may have to change that, at least until I am sure it is working. See this. Shadertoy uses WebGL 2.0.
The AnalyserNode doesn't accept decibel ranges that are not negative.
https://shadertoyunofficial.wordpress.com/2016/07/20/special-shadertoy-features/ may be helpful but is quite incomplete.
A provided track, or a user-chosen track from SoundCloud, can be used as a texture. The texture size is bufferSize x 2 (e.g., 512 x 2). The row at y = .75 holds the music samples: the buffer is refreshed over time, but the precise behavior seems very system-dependent, and synchronization with image frames is not guaranteed, so drawing the full soundwave, or using a naive sound encoding of an image, won't be faithful (or possibly only on your own system). The row at y = .25 holds the FFT of the buffer, with x / bufferSize = f / (iSampleRate / 4.), example here. iSampleRate gives the sampling rate, but it seems incorrectly initialized on many systems: if precision is important, try 44100 or 48000 manually.
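The FFT row mapping above can be sanity-checked with a small helper. This is just a sketch of the stated relation x / bufferSize = f / (iSampleRate / 4); the default bufferSize of 512 and sample rate of 48000 are assumptions, not values read from the GPU.

```javascript
// Sketch: map a frequency in Hz to the x texel of Shadertoy's FFT row
// (y = .25), using x / bufferSize = f / (iSampleRate / 4).
// bufferSize and sampleRate defaults are assumed, not queried.
function frequencyToTexelX(f, bufferSize = 512, sampleRate = 48000) {
  return Math.round((f * bufferSize) / (sampleRate / 4));
}

function texelXToFrequency(x, bufferSize = 512, sampleRate = 48000) {
  return (x / bufferSize) * (sampleRate / 4);
}
```

For example, at 48000 Hz and bufferSize 512, a 6000 Hz partial lands at texel x = 256, halfway across the FFT row.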
To summarize what I have perhaps mistakenly gleaned:
I'm still not sure I have this right or if it is really working. But I must proceed to more interesting music.
VENDOR: WebKit
RENDERER: WebKit WebGL
GL_VERSION: WebGL 2.0 (OpenGL ES 3.0 Chromium)
SHADING_LANGUAGE_VERSION: WebGL GLSL ES 3.00 (OpenGL ES GLSL ES 3.0 Chromium)
This is quite frustrating. Strategy:
I just learned something. In Shadertoy, if one clicks on the "Compiled in x seconds" label, a window pops up containing "Translated Shader Code." It specifies a GLSL version. The code is somewhat different from what one views in the online editor. In particular, the samplers are structs. One does not access a struct as a whole, but through each field, e.g. mystruct.field1, mystruct.field2.
I failed to declare a version of GLSL matching the code (and browser). This means that features not implemented in version 1 should fail -- I think. I added the declaration and perhaps now things will go better. I think some of the fog is beginning to dissipate. This is why the Shadertoy code editor can display translated shader code -- the code in the editor has to be fiddled with and versioned before it can be run.
Some slight progress. I have proved that `#version 300 es` must be the first line of all shader code. This makes the `texelFetch` and `texture2D` errors go away. In addition, attributes are no longer supported and are replaced by `in` parameters. But the audio is still not coming in.
Perhaps there are other version incompatibilities regarding textures, samplers, or bindings.
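The version fixes above can be automated with a small source-patching function. This is a minimal, incomplete sketch (it does not handle `varying`, precision qualifiers, or `gl_FragColor`); the function name is mine, and the two renames shown are the standard GLSL ES 3.00 changes mentioned above.

```javascript
// Sketch: patch legacy GLSL source so it compiles under GLSL ES 3.00.
// "#version 300 es" must be the very first line of the shader.
function patchToGlsl300(source) {
  let s = source;
  // Renames required by GLSL ES 3.00:
  s = s.replace(/\battribute\b/g, "in");       // vertex inputs
  s = s.replace(/\btexture2D\b/g, "texture");  // unified texture lookup
  if (!/^\s*#version/.test(s)) {
    s = "#version 300 es\n" + s;
  }
  return s;
}
```

This mirrors what Shadertoy's "Translated Shader Code" window implies: the editor source is rewritten and versioned before compilation.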
https://madethisthing.com/iq/piLibs-JS is iq's library that wraps WebGL2. It is used in Shadertoy.com, and I think this code is a reliable guide on how to do things right. So far, everything I am doing seems right. But so far I see no code for loading audio into textures or samplers.
Although the Shadertoy.com code is not visible, this is a really great resource from somebody who seems to know a lot and do things well.
This offers some historical context:
Aside from his contributions to films and games, Pol [Jeremias] co-founded Beautypi LLC with Iñigo Quilez, a company dedicated to bringing computer graphics everywhere. In 2013, Beautypi released Shadertoy.com, a global social network with tens of thousands of contributors that enables graphics enthusiasts to create and share computer graphics knowledge. Today, Shadertoy is one of the biggest repositories of computer graphics experiments, ideas, and projects. In 2020, Beautypi released Memix, a software to improve video conferencing using computer graphics technology. Memix was acquired that same year by Mmhmm Inc.
It's clear Quilez is secretive about Shadertoy.com source code even though the Web site hosts open source shader toys. He probably works on proprietary libraries. At this point my options are narrowing.
`smoothstep` is one. Not so. Maybe more complete demos: https://github.com/WebGLSamples/WebGL2Samples. This is the repository that ended up solving my problem by providing complete JavaScript code for properly setting up sampling for a texture.
It is necessary to use `gl.getExtension("EXT_color_buffer_float");` in order to verify that audio has been correctly imported into a floating-point texture. This is a known issue.
This is fixed. I had to set sampler parameters rather than texture parameters, and to use nearest filtering, not linear filtering. Things are starting to happen, although mapping the texture to the display is still not correct.
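The fix above can be sketched as follows, assuming a WebGL2 context. The function name is mine; the point is that the filtering state lives on a sampler object (set with `gl.samplerParameteri`), not on the texture.

```javascript
// Sketch: sampler parameters (not texture parameters) with NEAREST
// filtering, which is what made the float audio texture readable here.
function makeAudioSampler(gl) {
  const sampler = gl.createSampler();
  gl.samplerParameteri(sampler, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.samplerParameteri(sampler, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
  gl.samplerParameteri(sampler, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.samplerParameteri(sampler, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  return sampler;
}
```

At draw time the sampler is attached to the texture unit with `gl.bindSampler(unit, sampler)`, which then overrides the texture's own parameters.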
- Implementation of a neural network.
- Inverse Lyapunov Journey; moving images can be sampled as they pass through a line, plane, or volume.
- Inside the Mandelbulb, rather beautiful.
- Fractal mosaic.
- Fractal mosaic 16.
- Bacterium (should be plural).
- Orbit trap periods.
- Mouse Julia.
- Fast deep Mandelbrot zoom.
- Mandelbrot/Julia; could move a sampling line through the picked Julia sets.
- Interactive Mandelbrot zoom, straightforward.
- Sierpinski pyramid.
Moving images/objects can be sampled as they pass through a line, plane, or volume in view coordinates, or relative to the object itself.
In a shader, there are no vertices as such, it's all about the texels. These are simply color values in the viewport, vec4(r,g,b,a). If the texture is gl.RGBA32F, which mine will be, the values will be very precise.
So to be clear, we are not sampling from objects, but from the state of texels in a region of the viewport. That region could be of any form. Alpha will normally be 1, so we are really dealing with vec3(r,g,b).
When a texel crosses the sampling point, if it exceeds a threshold of some sort, it triggers a Csound event. When the value in that texel falls below that threshold, the event is turned off.
The shape of the sampling region would be most intuitively understood if it were a vertical or horizontal line through the center, or along one of the borders, of the viewport. Then there are possibilities for sampling: one event could be one texel, or several texels.
We are not mapping time or duration because we are a dynamical system ticking through time on every view rendering. So a single texel could represent (instrument, key, velocity). Obviously, there is way too much data in the image, the image bandwidth far exceeds the Csound event bandwidth.
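The (instrument, key, velocity) reading of a texel can be sketched as a pure function. The ranges here (instruments 1..8, MIDI keys and velocities 0..127) are illustrative assumptions, not something fixed by the piece.

```javascript
// Sketch: interpret one normalized RGB texel as a Csound event triple.
// Ranges are assumed: r selects among 8 instruments, g and b map to
// MIDI-style key and velocity.
function texelToEvent([r, g, b], instrumentCount = 8) {
  return {
    instrument: 1 + Math.floor(r * (instrumentCount - 1e-6)),
    key: Math.round(g * 127),
    velocity: Math.round(b * 127),
  };
}
```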
If the events are grains, the event bandwidth can be much higher, then we are in Xenakis land.
Alternatively, the sampling line could be mapped to N instruments, then width/N texels are available for mapping to the events. For Csound instruments this is excessive, but this data could go into function tables for a variety of uses.
For Csound instruments, the line can be sampled at a lower level of detail, i.e. filtered, to reduce the bandwidth.
Either the GLSL code can directly return sampled texels that JavaScript code will then map to Csound events, or the values of the Csound events can be directly computed in the shader and returned in a significantly smaller array, this seems likely to be considerably more efficient and to give more of the computer power to the Csound performance and to the shader itself.
Another way of reducing the bandwidth is to produce only a few texels above the sampling threshold, e.g. only a few bright sprites or something.
Of course for a fixed image, the sampling line or plane can move through the image.
We need either an image that moves continuously through the viewport, or an image that is interactively controlled by the user and sampled by passing a line or plane through the image. On my desktop screen:
canvas.height: 1771
canvas.width: 3544
This is pretty high resolution and very high bandwidth.
If we map width to the piano range of 88 semitones we have 40 texels per semitone. I think it will be simplest, and fast enough, to simply average each 40 texels. Then we are comparing current sample to prior sample and threshold.
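The averaging described above can be sketched as a pure function over one row of RGBA pixels, as `gl.readPixels` would return it. The brightness formula (plain RGB mean, alpha ignored) is an assumption; a perceptual luma weighting would also work.

```javascript
// Sketch: reduce one canvas row of RGBA pixels to 88 per-key loudness
// values by averaging each key's share of texels. `row` is a
// Uint8Array of length width * 4.
function downsampleRowToKeys(row, width, keys = 88) {
  const loudness = new Float32Array(keys);
  for (let k = 0; k < keys; k++) {
    const x0 = Math.floor((k * width) / keys);
    const x1 = Math.floor(((k + 1) * width) / keys);
    let sum = 0;
    for (let x = x0; x < x1; x++) {
      // Mean brightness of the texel in [0, 1], ignoring alpha.
      sum += (row[x * 4] + row[x * 4 + 1] + row[x * 4 + 2]) / (3 * 255);
    }
    loudness[k] = sum / Math.max(1, x1 - x0);
  }
  return loudness;
}
```

At a canvas width of 3544 this averages about 40 texels per key; the current frame's 88 values can then be compared against the prior frame and the threshold.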
Considering the limitations of GLSL, and the overhead of copying data into and out of the GPU with SSBOs or textures or whatever, it is now seeming optimal to just read the pixels in the rendered image frame in the canvas along the sampling line. This is similar to what I did in "Unperformed Experiments." Probably an experiment is required to compare overheads, but I can start with just reading the image to get a piece going. However, just doing this in the obvious way hits performance with pipeline stalls, it's necessary to use a pixel buffer object (PBO).
Did a lot of Googling and this is an inherently messy and tradeoff-ridden topic. Right now, it looks like pixel buffer objects are the best solution. The GPU will write to the PBO and then the CPU can test to see if the PBO is ready. Otherwise the pipeline will still stall. There will need to be at least one fence (one for the PBO, and maybe one for the PBO to Csound processing).
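The PBO-plus-fence pattern described above might be sketched as follows, assuming a WebGL2 context; the function names are mine. The read is started right after rendering, and the fence is polled on later frames instead of blocking.

```javascript
// Sketch: non-stalling readback of one canvas row via a pixel buffer
// object and a fence, assuming a WebGL2 context `gl`.
function startRowReadback(gl, y, width) {
  const pbo = gl.createBuffer();
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, pbo);
  gl.bufferData(gl.PIXEL_PACK_BUFFER, width * 4, gl.STREAM_READ);
  // With a PIXEL_PACK_BUFFER bound, readPixels writes into the PBO
  // asynchronously instead of stalling the pipeline.
  gl.readPixels(0, y, width, 1, gl.RGBA, gl.UNSIGNED_BYTE, 0);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  const fence = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
  return { pbo, fence, width };
}

function pollRowReadback(gl, readback) {
  const status = gl.clientWaitSync(readback.fence, 0, 0);
  if (status !== gl.ALREADY_SIGNALED && status !== gl.CONDITION_SATISFIED) {
    return null; // GPU not done yet; try again next frame.
  }
  gl.deleteSync(readback.fence);
  const pixels = new Uint8Array(readback.width * 4);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, readback.pbo);
  gl.getBufferSubData(gl.PIXEL_PACK_BUFFER, 0, pixels);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  gl.deleteBuffer(readback.pbo);
  return pixels;
}
```

Calling `pollRowReadback` each frame until it returns non-null keeps the CPU from ever waiting on the GPU, at the cost of the sample arriving a frame or two late.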
(Groan...) Now there is WebGPU. Not yet ready for prime time, but might make sense anyway. Doesn't seem to be available on macOS in my versions of Chrome or Safari.
There are then two synchronization issues, within the very multithreaded fragment shader invocations, and between the GPU and the CPU.
For now it seems best to use `gl.readPixels` asynchronously to obtain one row of the current frame buffer.
The real issue for me is synchronization between the fragment shader invocations and reading the SSBO in the CPU. This summarizes the situation. Memory barriers do this.
Mapping issues:
My mapping algorithm is no good. Here's another try.
The problem is still the high bandwidth of the visual signal versus the low bandwidth of the Csound event signal. We want the relationship between the visual animation and the Csound audio to be reasonably apparent to the viewer.
I think the best approach is still to filter the bottom row, or middle row, of the canvas down to a resolution equal to the number of MIDI keys to be played. This could be done with simple downsampling, averaging before downsampling, or Gaussian downsampling.
After that, filter the loudest N events from the downsampled buffer. But this only makes sense if the canvas can be sampled at a high enough rate. I will determine that rate before I decide to implement this algorithm.
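Picking the loudest N events from the downsampled buffer is a simple filter; a sketch, with the threshold and voice count as assumed knobs:

```javascript
// Sketch: from the per-key loudness values, keep only the N loudest
// keys above a threshold. Threshold and maxVoices are tuning knobs.
function loudestEvents(loudness, threshold = 0.5, maxVoices = 8) {
  const events = [];
  loudness.forEach((value, key) => {
    if (value > threshold) events.push({ key, value });
  });
  events.sort((a, b) => b.value - a.value); // loudest first
  return events.slice(0, maxVoices);
}
```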
If the rate is not high enough, or if I implement it all just fine but it still doesn't seem right, I will drop sampling GLSL generated visuals and change to sampling WebGL objects (vertices), which could be done with Three.js or p5.js.
There may still be a problem with the async read of the canvas. Making it async may mean that the reads are piling up. I will log them also. And indeed:
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.708 parent frame: 997 current frame: 997
prototype_score_generator.html:1162 render_scene: time: 17.71109999999404 frame: 998
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.72209999999404 parent frame: 998 current frame: 998
prototype_score_generator.html:1162 render_scene: time: 17.73059999999404 frame: 999
prototype_score_generator.html:1162 render_scene: time: 17.74540000000596 frame: 1000
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.75469999998808 parent frame: 999 current frame: 1000
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.755900000005962 parent frame: 1000 current frame: 1000
prototype_score_generator.html:1162 render_scene: time: 17.7615 frame: 1001
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.77409999999404 parent frame: 1001 current frame: 1001
prototype_score_generator.html:1162 render_scene: time: 17.78140000000596 frame: 1002
prototype_score_generator.html:1162 render_scene: time: 17.7955 frame: 1003
But I'm not sure this is a real problem.
The idea is to blur the middle row of the canvas such that each of the blurred pixels is mapped to one MIDI key number, and each pixel that is bright enough is played as a Csound "note on" event. Events don't carry their own state, but are stored in queues by state: sampled_events, on_events, playing_events, and off_events.
Before rendering, all queues are cleared.
On each Nth rendering frame:
If the loudness threshold is high enough it doesn't seem necessary to set a maximum number of voices, although that should also be possible.
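The per-frame queue logic described above might be sketched as follows. The queue names follow the text; the edge-triggered on/off behavior is my assumed reading of "note on when bright enough, note off when it falls below threshold".

```javascript
// Sketch: per-frame event queue update. A key that becomes bright
// triggers a note on; one that stays bright keeps playing; one that
// goes dark is queued for a note off.
function updateQueues(sampled_events, playing_events, threshold = 0.5) {
  const on_events = [];
  const off_events = [];
  const still_playing = new Set();
  for (const { key, value } of sampled_events) {
    if (value > threshold) {
      if (playing_events.has(key)) {
        still_playing.add(key); // already sounding: no new event
      } else {
        on_events.push(key);    // rising edge: Csound "note on"
        still_playing.add(key);
      }
    }
  }
  for (const key of playing_events) {
    if (!still_playing.has(key)) off_events.push(key); // falling edge
  }
  return { on_events, off_events, playing_events: still_playing };
}
```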
This seems to be working more or less. The reverb is clattering so I will examine the level and release envelope situation.
The instruments need to be fixed to handle note on/note off pairs.
There is an issue in that the Emscripten toolchain does not permit WebAssembly code or JavaScript code to read the native filesystem. There are workarounds, but I do not like them. This prevents Csound from using `#include`. For now, I will just include all Csound orc code in the HTML file.
Things are better but ZakianFlute is still piling up instances.
Synchronizing score generation in loops/segments... the issue is that the system clock and Csound's score time may drift apart.
`csoundSetOutputChannelCallback`, as `Csound.SetOutputChannelCallback(callable)`, would do that. This is pretty much the same as the timer, except that Csound itself resets the timer. The C prototype for the callback is: `void (*channelCallback_t)(CSOUND *csound, const char *channelName, void *channelValuePtr, const void *channelType)`. In the present context probably this would do: `(channel_name, value) => {}`.
Using a JavaScript timer should be adequate for all cases I can foresee. The timer can be reset with different durations depending on when the generated segment will end. There could of course be any number of such "tracks" going on at the time, and they also can be nested.
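The drift issue above suggests computing each timeout from Csound's own score time rather than from accumulated `setTimeout` delays. A minimal sketch; how the current score time is obtained is left as an assumption:

```javascript
// Sketch: compute the next segment's timer duration from the current
// score time, so system-clock and score-time drift is absorbed at
// every reschedule rather than accumulating.
function nextSegmentDelayMs(scoreTimeSeconds, segmentEndSeconds) {
  return Math.max(0, (segmentEndSeconds - scoreTimeSeconds) * 1000);
}

// Usage (assumed API for reading score time):
//   setTimeout(generateNextSegment,
//              nextSegmentDelayMs(await csound.getScoreTime(), segmentEnd));
```

Each "track" of segments would carry its own `segmentEnd` and reschedule itself this way, which also permits nesting.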
There is a bug in the Csound `infoff` function: it truncates p1 to an int. I will probably have to write some test cases and/or try to fix the bug.
I think I can use this if it is public or I can make it public:
/**
 * Kills off one or more running instances of an instrument identified
 * by instr (number) or instrName (name). If instrName is NULL, the
 * instrument number is used.
 * Mode is a sum of the following values:
 * 0, 1, 2: kill all instances (0), oldest only (1), or newest (2).
 * 4: only turnoff notes with exactly matching (fractional) instr number.
 * 8: only turnoff notes with indefinite duration (p3 < 0 or MIDI).
 * allow_release, if non-zero, the killed instances are allowed to release.
 */
PUBLIC int csoundKillInstance(CSOUND *csound, MYFLT instr,
                              char *instrName, int mode, int allow_release);
Updated `playpen.py` for additional tests: a note off turns off all note ons with the same tag, and a note off turns off any note on with the same tag even if the pitches differ. Seems to be working just fine.

`kill_instance` calls `xturnoff` for the `turnoff` opcode, just `xturnoff2` for the `turnoff2` opcode, and `delete_selected_rt_events` for the `turnoff3` opcode. The `insert_event`, `insert_midi`, `infoff`, and `kill_instance` functions, and the `csoundKillInstance` API function, call `xturnoff` or `xturnoff_now`. There is a gap in my understanding between `xturnoff2` and `delete_selected_rt_events`; it is conceivable this is my problem, as the cloud-music pieces rely on `ReadScore`. As for the `turnoff2` and `turnoff3` opcodes, the latter just clears events from the real-time queue. Expose `csoundKillInstance` in the WASM API and use that. Is `maxalloc` stopping note offs from getting through? Not clear, but not relevant with `csoundKillInstance`.

To finish the prototype score generator:
Try:
There were problems on Android. At least for the tablet, I fixed the problems by using highp instead of medium and not using the pixel density. The tablet doesn't really have enough oomph and there are dropouts and glitches, but the visuals are just as good and everything is basically working.
For Cloud Music No. 2, use the existing visualizer and Csound orchestra. Replace the fixed Csound score with a JavaScript score generator. Instead of a Silence score, use JavaScript template strings to format `i` statements for the existing orchestra, which sounds quite good.
I am searching for JavaScript code for various chaotic dynamical systems that can be mapped to various chord spaces.
Use the `StrangeAttractor` class in CsoundAC to produce not just notes but points in PITV space. I don't even think this requires any new code; I can use existing methods of the class and get X, Y, Z, and W after each iteration. It was necessary to add one method, `StrangeAttractor::iterate_without_rendering`, before I could proceed. I also added convenience functions to get the state of the attractor in normalized form.
Added code to copy and paste controls state to and from the system clipboard.
Two bugs:
Get this thing off the ground:
`<script>` elements and commenting them. Actually, not doing this due to the mixture of languages (HTML, JavaScript, WebGL/GLSL, Csound). I simply put in a comment.