gogins / cloud-5

Complete browser-based computer music studio for algorithmic composition and live coding

Get this show on the road #1

Closed: gogins closed this issue 1 year ago

gogins commented 2 years ago

Get this thing off the ground:

gogins commented 2 years ago

In csound-wasm I need to:

gogins commented 2 years ago

Some preliminary thoughts:

gogins commented 2 years ago

Think about using some of these either to visualize music or to generate music:

- Flocking birds.
- Isolines.
- Nautilus.
- Fractal flame.

Iñigo Quilez's license is too restrictive for me, but I can probably either understand what he is doing, or find other similar code that is not so restricted.

gogins commented 2 years ago

The iChannel variables in ShaderToy are GLSL samplers.

gogins commented 2 years ago

Apparently the only way to get data computed in a shader back into the browser's JavaScript context is to create a framebuffer or a shader storage buffer: the shader writes to the buffer, and JavaScript reads from the buffer. The predefined channels in ShaderToy could be used for this. Here's some documentation. Here's some more context-free documentation.
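A minimal sketch of that framebuffer round trip, assuming a WebGL2 context `gl`; the 512 x 1 size is an assumption:

```javascript
// Render into a texture through a framebuffer, then read the result
// back into JavaScript with readPixels.
const texture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA8, 512, 1, 0,
              gl.RGBA, gl.UNSIGNED_BYTE, null);
const framebuffer = gl.createFramebuffer();
gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                        gl.TEXTURE_2D, texture, 0);
// ... run the shader that writes the data to be read back ...
const data = new Uint8Array(512 * 4);
gl.readPixels(0, 0, 512, 1, gl.RGBA, gl.UNSIGNED_BYTE, data);
gl.bindFramebuffer(gl.FRAMEBUFFER, null);
```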

gogins commented 2 years ago

I suspect I almost have this working, but Shadertoy seems to use the default formats from the AnalyserNode, and to set that data, without change of format, into the texture for the sampler. I of course went for maximum precision, and may have to change that, at least until I am sure it is working. See this. Shadertoy uses WebGL 2.0.
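For reference, a sketch of what I take Shadertoy to be doing (my reconstruction, not its actual source): read the AnalyserNode's default byte data and upload it unchanged into a 512 x 2 single-channel texture. `audio_context`, `source`, and `audio_texture` (an R8 texture) are assumptions.

```javascript
const analyser = audio_context.createAnalyser();
analyser.fftSize = 1024;                          // yields 512 frequency bins
source.connect(analyser);
const frequency_data = new Uint8Array(analyser.frequencyBinCount);
const waveform_data = new Uint8Array(analyser.frequencyBinCount);
function update_audio_texture() {
  analyser.getByteFrequencyData(frequency_data);  // bytes, scaled by the default decibel range
  analyser.getByteTimeDomainData(waveform_data);
  gl.bindTexture(gl.TEXTURE_2D, audio_texture);
  // Row 0: spectrum; row 1: waveform -- uploaded without change of format.
  gl.texSubImage2D(gl.TEXTURE_2D, 0, 0, 0, 512, 1,
                   gl.RED, gl.UNSIGNED_BYTE, frequency_data);
  gl.texSubImage2D(gl.TEXTURE_2D, 0, 0, 1, 512, 1,
                   gl.RED, gl.UNSIGNED_BYTE, waveform_data);
}
```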

gogins commented 2 years ago

The AnalyserNode doesn't accept decibel ranges that are not negative.

gogins commented 2 years ago

https://shadertoyunofficial.wordpress.com/2016/07/20/special-shadertoy-features/ may be helpful but is quite incomplete.

Provided music, or user-chosen music from SoundCloud, can be used as a texture.

Texture size is bufferSize x 2 (e.g., 512 x 2).

- Row at y = .75: the music samples. This is a buffer refreshed over time, but the precise working seems very system dependent, and synchronization with image frames is not guaranteed. So drawing the full soundwave, or using a naive sound encoding of an image, won't be faithful (or possibly only for you).
- Row at y = .25: the FFT of the buffer, with x / bufferSize = f / (iSampleRate / 4); example here. iSampleRate gives the sampling rate, but it seems incorrectly initialized on many systems: if precision is important, try 44100 or 48000 manually.
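Putting that together, a sketch of how a fragment shader reads the two rows; the row coordinates follow the blog post above, and the shader itself is my own illustration:

```javascript
const fragment_shader_source = `#version 300 es
precision highp float;
uniform sampler2D iChannel0;   // the 512 x 2 audio texture
out vec4 fragment_color;
void main() {
    float x = gl_FragCoord.x / 512.0;                  // bin/sample index, normalized
    float fft  = texture(iChannel0, vec2(x, 0.25)).r;  // spectrum row
    float wave = texture(iChannel0, vec2(x, 0.75)).r;  // waveform row
    fragment_color = vec4(fft, wave, 0.0, 1.0);
}`;
```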

gogins commented 2 years ago

To summarize what I have perhaps mistakenly gleaned:

I'm still not sure I have this right, or whether it is really working. But I must proceed to more interesting music.

gogins commented 2 years ago

```
VENDOR: WebKit
RENDERER: WebKit WebGL
GL_VERSION: WebGL 2.0 (OpenGL ES 3.0 Chromium)
SHADING_LANGUAGE_VERSION: WebGL GLSL ES 3.00 (OpenGL ES GLSL ES 3.0 Chromium)
```

This is quite frustrating. Strategy:

gogins commented 2 years ago

I just learned something. In Shadertoy, if one clicks on the "Compiled in x seconds" label, a window pops up containing "Translated Shader Code." It specifies a GLSL version. The code is somewhat different from what one views in the online editor. In particular, the samplers are structs. One does not access a struct as a whole, but through each of its fields, e.g. mystruct.field1, mystruct.field2.

gogins commented 2 years ago

I failed to declare a version of GLSL matching the code (and browser). This means that features not implemented in version 1 should fail -- I think. I added the declaration, and perhaps now things will go better. I think some of the fog is beginning to dissipate. This is why the Shadertoy code editor can display translated shader code -- the code in the editor has to be adjusted and versioned before it can be run.

gogins commented 2 years ago

Some slight progress. I have proved that #version 300 es must be the first line of all shader code. This makes the texelFetch and texture2D errors go away. In addition attributes are no longer supported and are replaced by in parameters. But the audio is still not coming in.

Perhaps there are other version incompatibilities regarding textures, samplers, or bindings.
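To record the pattern, a minimal GLSL 3.00 es pair illustrating both points (my own example, not the project's shaders); `#version 300 es` must be the very first characters of each source string:

```javascript
const vertex_shader_source = `#version 300 es
in vec4 position;              // was: attribute vec4 position;
void main() {
    gl_Position = position;
}`;
const fragment_shader_source = `#version 300 es
precision highp float;
uniform sampler2D audio_sampler;
out vec4 fragment_color;       // replaces gl_FragColor
void main() {
    // texelFetch exists in 3.00 es; texture() replaces texture2D().
    float amplitude = texelFetch(audio_sampler, ivec2(int(gl_FragCoord.x), 0), 0).r;
    fragment_color = vec4(amplitude, amplitude, amplitude, 1.0);
}`;
```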

gogins commented 2 years ago

https://madethisthing.com/iq/piLibs-JS is iq's library that wraps WebGL2. It is used on Shadertoy.com, and I think this code is a reliable guide on how to do things right. So far, everything I am doing seems right. But so far I see no code for loading audio into textures or samplers.

Although the Shadertoy.com code is not visible, this is a really great resource from somebody who seems to know a lot and do things well.

This offers some historical context:

Aside from his contributions to films and games, Pol [Jeremias] co-founded Beautypi LLC with Iñigo Quilez, a company dedicated to bringing computer graphics everywhere. In 2013, Beautypi released Shadertoy.com, a global social network with tens of thousands of contributors that enables graphics enthusiasts to create and share computer graphics knowledge. Today, Shadertoy is one of the biggest repositories of computer graphics experiments, ideas, and projects. In 2020, Beautypi released Memix, a software to improve video conferencing using computer graphics technology. Memix was acquired that same year by Mmhmm Inc.

gogins commented 2 years ago

It's clear Quilez is secretive about Shadertoy.com source code even though the Web site hosts open source shader toys. He probably works on proprietary libraries. At this point my options are narrowing.

gogins commented 2 years ago

Maybe more complete demos: https://github.com/WebGLSamples/WebGL2Samples. This is the repository that ended up solving my problem by providing complete JavaScript code for properly setting up sampling for a texture.

gogins commented 2 years ago

It is necessary to use `gl.getExtension("EXT_color_buffer_float")` in order to verify that audio has been correctly imported into a floating-point texture. This is a known issue.
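A sketch of the verification, assuming the audio texture is gl.RGBA32F and 512 x 2; without the extension, a float texture is not color-renderable, the framebuffer is incomplete, and readPixels fails:

```javascript
if (!gl.getExtension("EXT_color_buffer_float")) {
  throw new Error("EXT_color_buffer_float is not supported.");
}
// Attach the float texture to a framebuffer and read it back as floats.
const framebuffer = gl.createFramebuffer();
gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                        gl.TEXTURE_2D, audio_texture, 0);
const pixels = new Float32Array(512 * 2 * 4);
gl.readPixels(0, 0, 512, 2, gl.RGBA, gl.FLOAT, pixels);
gl.bindFramebuffer(gl.FRAMEBUFFER, null);
console.log("first FFT bin:", pixels[0]);
```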

gogins commented 2 years ago

This is fixed. I had to set sampler parameters rather than texture parameters, and to use nearest filtering, not linear filtering. Things are starting to happen, although mapping the texture to the display is still not correct.
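Roughly, the fix (a sketch; the texture unit is an assumption): the parameters go on a WebGL2 sampler object bound to the texture unit, not on the texture itself.

```javascript
const sampler = gl.createSampler();
gl.samplerParameteri(sampler, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.samplerParameteri(sampler, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.samplerParameteri(sampler, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.samplerParameteri(sampler, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.bindSampler(0, sampler);   // texture unit 0
```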

gogins commented 2 years ago

- Implementation of a neural network.
- Inverse Lyapunov Journey; moving images can be sampled as they pass through a line, plane, or volume.
- Inside the Mandelbulb; rather beautiful.
- Fractal mosaic.
- Fractal mosaic 16.
- Bacterium (should be plural).
- Orbit trap periods.
- Mouse Julia.
- Fast deep Mandelbrot zoom.
- Mandelbrot/Julia; could move a sampling line through the picked Julia sets.
- Interactive Mandelbrot zoom; straightforward.
- Sierpinski pyramid.

gogins commented 2 years ago

Moving images/objects can be sampled as they pass through a line, plane, or volume in view coordinates, or relative to the object itself.

In a shader, there are no vertices as such, it's all about the texels. These are simply color values in the viewport, vec4(r,g,b,a). If the texture is gl.RGBA32F, which mine will be, the values will be very precise.

So to be clear, we are not sampling from objects, but from the state of texels in a region of the viewport. That region could be of any form. Alpha will normally be 1, so we are really dealing with vec3(r,g,b).

gogins commented 2 years ago

When a texel crosses the sampling point, if it exceeds a threshold of some sort, it triggers a Csound event. When the value in that texel falls below that threshold, the event is turned off.

The shape of the sampling region would be most intuitively understood if it were a vertical or horizontal line through the center, or along one of the borders, of the viewport. Then there are possibilities for sampling: one event could be one texel, or several texels.

We are not mapping time or duration, because we are a dynamical system ticking through time on every view rendering. So a single texel could represent (instrument, key, velocity). Obviously there is far too much data in the image; the image bandwidth far exceeds the Csound event bandwidth.
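One possible mapping, as a sketch (the channel assignment is my assumption, not a settled design):

```javascript
// One texel's red, green, and blue become instrument, MIDI key, and velocity.
function texel_to_event(r, g, b, time) {
  const instrument = 1 + Math.floor(r * 7.999);  // e.g. 8 instruments
  const midi_key = 21 + Math.round(g * 87);      // piano range
  const midi_velocity = Math.round(b * 127);
  return `i ${instrument} ${time} -1 ${midi_key} ${midi_velocity}\n`;
}
```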

If the events are grains, the event bandwidth can be much higher, then we are in Xenakis land.

Alternatively, the sampling line could be mapped to N instruments, then width/N texels are available for mapping to the events. For Csound instruments this is excessive, but this data could go into function tables for a variety of uses.

For Csound instruments, the line can be sampled at a lower level of detail, i.e. filtered, to reduce the bandwidth.

Either the GLSL code can directly return sampled texels that JavaScript code will then map to Csound events, or the values of the Csound events can be computed directly in the shader and returned in a significantly smaller array. The latter seems likely to be considerably more efficient, and to leave more of the computer's power for the Csound performance and for the shader itself.

Another way of reducing the bandwidth is to produce only a few texels above the sampling threshold, e.g. only a few bright sprites or something.

Of course for a fixed image, the sampling line or plane can move through the image.

gogins commented 2 years ago

We need either an image that moves continuously through the viewport, or an image that is interactively controlled by the user and sampled by passing a line or plane through the image. On my desktop screen:

canvas.height: 1771
canvas.width:  3544

This is pretty high resolution and very high bandwidth.

If we map the width to the piano range of 88 semitones, we have about 40 texels per semitone. I think it will be simplest, and fast enough, to simply average each run of 40 texels. Then we compare the current sample to the prior sample and to a threshold.
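A sketch of that averaging, assuming `row` holds the canvas row as RGBA bytes; the names and the luminance weighting are mine:

```javascript
// Downsample one RGBA pixel row to 88 loudness bins by averaging.
function downsample_row(row, bins = 88) {
  const texels_per_bin = Math.floor(row.length / 4 / bins); // ~40 on my display
  const result = new Float32Array(bins);
  for (let bin = 0; bin < bins; bin++) {
    let sum = 0;
    for (let i = 0; i < texels_per_bin; i++) {
      const p = (bin * texels_per_bin + i) * 4;
      // Perceptual luminance of vec3(r, g, b); alpha is ignored.
      sum += 0.2126 * row[p] + 0.7152 * row[p + 1] + 0.0722 * row[p + 2];
    }
    result[bin] = sum / (texels_per_bin * 255); // normalize to [0, 1]
  }
  return result;
}
```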

gogins commented 2 years ago

Considering the limitations of GLSL, and the overhead of copying data into and out of the GPU with SSBOs or textures, it now seems optimal to just read the pixels of the rendered image frame in the canvas along the sampling line. This is similar to what I did in "Unperformed Experiments." Probably an experiment is required to compare overheads, but I can start with just reading the image to get a piece going. However, doing this in the obvious way hits performance with pipeline stalls; it's necessary to use a pixel buffer object (PBO).

gogins commented 2 years ago

https://stackoverflow.com/questions/24495410/how-to-read-a-pixel-depth-value-without-stalling-the-pipeline#24496636

gogins commented 2 years ago

Did a lot of Googling, and this is an inherently messy and tradeoff-ridden topic. Right now, it looks like pixel buffer objects are the best solution: the GPU writes to the PBO, and the CPU can then test whether the PBO is ready; otherwise the pipeline will still stall. There will need to be at least one fence (one for the PBO, and maybe one for the PBO-to-Csound processing).
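In WebGL2 the non-stalling read looks roughly like this (a sketch; `width`, `row_y`, and the polling structure are my assumptions): gl.readPixels targets a bound PIXEL_PACK_BUFFER instead of a client array, a fence marks completion, and the CPU polls the fence on a later frame before mapping the buffer.

```javascript
// Kick off the asynchronous read into a pixel buffer object.
const pbo = gl.createBuffer();
gl.bindBuffer(gl.PIXEL_PACK_BUFFER, pbo);
gl.bufferData(gl.PIXEL_PACK_BUFFER, width * 4, gl.STREAM_READ);
gl.readPixels(0, row_y, width, 1, gl.RGBA, gl.UNSIGNED_BYTE, 0); // offset into the PBO
const fence = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
gl.flush();

// On a later frame, poll the fence instead of blocking the pipeline.
const status = gl.clientWaitSync(fence, 0, 0);
if (status === gl.ALREADY_SIGNALED || status === gl.CONDITION_SATISFIED) {
  const row = new Uint8Array(width * 4);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, pbo);
  gl.getBufferSubData(gl.PIXEL_PACK_BUFFER, 0, row);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  gl.deleteSync(fence);
  // ... translate the row to Csound events ...
}
```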

gogins commented 2 years ago

(Groan...) Now there is WebGPU. Not yet ready for prime time, but might make sense anyway. Doesn't seem to be available on macOS in my versions of Chrome or Safari.

gogins commented 2 years ago

There are then two synchronization issues: one within the highly multithreaded fragment shader invocations, and one between the GPU and the CPU.

For now it seems best to use gl.readPixels asynchronously to obtain one row of the current frame buffer.

The real issue for me is synchronization between the fragment shader invocations and reading the SSBO in the CPU. This summarizes the situation. Memory barriers do this.

gogins commented 2 years ago

Mapping issues:

gogins commented 2 years ago

My mapping algorithm is no good. Here's another try.

The problem is still the high bandwidth of the visual signal versus the low bandwidth of the Csound event signal. We want the relationship between the visual animation and the Csound audio to be reasonably apparent to the viewer.

I think the best approach is still to filter the bottom row, or middle row, of the canvas down to a resolution equal to the number of MIDI keys to be played. This could be done with simple downsampling, averaging before downsampling, or Gaussian downsampling.

After that, filter the loudest N events from the downsampled buffer. But this only makes sense if the canvas can be sampled at a high enough rate. I will determine that rate before I decide to implement this algorithm.
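Selecting the loudest N from that buffer is then simple; a sketch, with N and the threshold as assumptions:

```javascript
// Return up to n [midi_key_index, loudness] pairs above the threshold,
// loudest first.
function loudest_n(bins, n = 8, threshold = 0.25) {
  return Array.from(bins.entries())
    .filter(([, loudness]) => loudness >= threshold)
    .sort((a, b) => b[1] - a[1])
    .slice(0, n);
}
```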

If the rate is not high enough, or if I implement it all just fine but it still doesn't seem right, I will drop sampling GLSL generated visuals and change to sampling WebGL objects (vertices), which could be done with Three.js or p5.js.

gogins commented 2 years ago

There may still be a problem with the async read of the canvas. Making it async may mean that the reads are piling up. I will log them also. And indeed:

```
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.708 parent frame: 997 current frame: 997
prototype_score_generator.html:1162 render_scene:                      time: 17.71109999999404 frame: 998
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.72209999999404 parent frame: 998 current frame: 998
prototype_score_generator.html:1162 render_scene:                      time: 17.73059999999404 frame: 999
prototype_score_generator.html:1162 render_scene:                      time: 17.74540000000596 frame: 1000
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.75469999998808 parent frame: 999 current frame: 1000
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.755900000005962 parent frame: 1000 current frame: 1000
prototype_score_generator.html:1162 render_scene:                      time: 17.7615 frame: 1001
prototype_score_generator.html:1143 translate_sample_to_csound_events: time: 17.77409999999404 parent frame: 1001 current frame: 1001
prototype_score_generator.html:1162 render_scene:                      time: 17.78140000000596 frame: 1002
prototype_score_generator.html:1162 render_scene:                      time: 17.7955 frame: 1003
```

But I'm not sure this is a real problem.

gogins commented 2 years ago

The idea is to blur the middle row of the canvas such that each of the blurred pixels is mapped to one MIDI key number, and each sufficiently bright pixel is played as a Csound "note on" event. Events don't carry their own state, but are stored in queues by state: sampled_events, on_events, playing_events, and off_events.

Before rendering, all queues are cleared.

On each Nth rendering frame:

  1. All pixels in the downsampled row are translated to Csound events.
  2. All events that are not already playing and that are loud enough are played.
  3. Events that are playing continue to play until a new event for the same MIDI key comes in that is not loud enough; then the playing event is turned off.

If the loudness threshold is high enough it doesn't seem necessary to set a maximum number of voices, although that should also be possible.
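A condensed sketch of this logic (not the actual implementation; the fractional p1 scheme and a csound object with a readScore method are assumptions):

```javascript
const playing_events = new Map();   // MIDI key -> fractional p1
const threshold = 0.5;
function translate_sample_to_csound_events(row, time) {
  row.forEach((loudness, midi_key) => {
    const is_playing = playing_events.has(midi_key);
    if (loudness >= threshold && !is_playing) {
      const p1 = 1 + midi_key / 1000;   // one tied instance per key
      csound.readScore(`i ${p1} ${time} -1 ${midi_key} ${Math.round(loudness * 127)}\n`);
      playing_events.set(midi_key, p1);
    } else if (loudness < threshold && is_playing) {
      // A negative p1 turns off the matching held instance.
      csound.readScore(`i -${playing_events.get(midi_key)} ${time} 1\n`);
      playing_events.delete(midi_key);
    }
  });
}
```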

This seems to be working more or less. The reverb is clattering so I will examine the level and release envelope situation.

gogins commented 2 years ago

The instruments need to be fixed to handle note on/note off pairs.

gogins commented 1 year ago

There is an issue in that the Emscripten toolchain does not permit WebAssembly code or JavaScript code to read the native filesystem. There are workarounds, but I do not like them.

This prevents Csound from using #include. For now, I will just include all Csound orc code in the HTML file.

gogins commented 1 year ago

Things are better but ZakianFlute is still piling up instances.

gogins commented 1 year ago

Synchronizing score generation in loops/segments... the issue is that the system clock and Csound's score time may drift apart.

The C prototype for the callback is: `void (*channelCallback_t)(CSOUND *csound, const char *channelName, void *channelValuePtr, const void *channelType)`. In the present context, probably this would do: `(channel_name, value) => {}`.

gogins commented 1 year ago

Using a JavaScript timer should be adequate for all cases I can foresee. The timer can be reset with different durations depending on when the generated segment will end. There could of course be any number of such "tracks" going on at the same time, and they can also be nested.
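A minimal sketch of that, assuming a `generate_segment` function that returns a score string with its duration in seconds, and a csound object with a readScore method (both assumptions):

```javascript
// Schedule generated score segments with a JavaScript timer; each "track"
// can run its own instance of this loop.
function schedule_segment(start_time) {
  const segment = generate_segment(start_time);   // assumed: {score, duration}
  csound.readScore(segment.score);
  // Re-arm the timer a little before this segment ends.
  setTimeout(() => schedule_segment(start_time + segment.duration),
             (segment.duration - 0.5) * 1000);
}
schedule_segment(0);
```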

gogins commented 1 year ago

There is a bug in the Csound infoff function: it truncates p1 to an int. I will probably have to write some test cases and/or try to fix the bug.

gogins commented 1 year ago

I think I can use this if it is public or I can make it public:

```c
  /**
   * Kills off one or more running instances of an instrument identified
   * by instr (number) or instrName (name). If instrName is NULL, the
   * instrument number is used.
   * Mode is a sum of the following values:
   * 0,1,2: kill all instances (0), oldest only (1), or newest (2)
   * 4: only turnoff notes with exactly matching (fractional) instr number
   * 8: only turnoff notes with indefinite duration (p3 < 0 or MIDI)
   * allow_release, if non-zero, the killed instances are allowed to release.
   */
  PUBLIC int csoundKillInstance(CSOUND *csound, MYFLT instr,
                                char *instrName, int mode, int allow_release);
```
gogins commented 1 year ago

To finish the prototype score generator:

gogins commented 1 year ago

Try:

gogins commented 1 year ago

There were problems on Android. At least for the tablet, I fixed the problems by using highp instead of mediump and by not using the pixel density. The tablet doesn't really have enough oomph, and there are dropouts and glitches, but the visuals are just as good and everything is basically working.

gogins commented 1 year ago

For Cloud Music No. 2, use the existing visualizer and Csound orchestra. Replace the fixed Csound score with a JavaScript score generator. Instead of a Silence score, use JavaScript template strings to format i statements for the existing orchestra, which sounds quite good.
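For example, a sketch of such a generator (the instrument number and p-field layout are assumptions about the existing orchestra):

```javascript
// Format an "i" statement with a template string and send it to Csound.
function note(instrument, time, duration, midi_key, midi_velocity) {
  return `i ${instrument} ${time.toFixed(3)} ${duration.toFixed(3)} ${midi_key} ${midi_velocity}\n`;
}
let score = "";
for (let time = 0; time < 60; time += 0.25) {
  score += note(1, time, 2, 36 + Math.floor(Math.random() * 60), 80);
}
csound.readScore(score);
```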

gogins commented 1 year ago

I am searching for JavaScript code for various chaotic dynamical systems that can be mapped to various chord spaces.
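As a placeholder, a minimal sketch (not code I have found) of one such system, the Lorenz attractor, with its normalized x coordinate mapped to the piano range; the step size and ranges are assumptions:

```javascript
let x = 0.1, y = 0, z = 0;
const sigma = 10, rho = 28, beta = 8 / 3, dt = 0.005;
// One Euler step of the Lorenz system.
function iterate() {
  const dx = sigma * (y - x);
  const dy = x * (rho - z) - y;
  const dz = x * y - beta * z;
  x += dx * dt; y += dy * dt; z += dz * dt;
}
// Map the attractor's state to a MIDI key.
function midi_key() {
  // x stays roughly within [-20, 20]; normalize and map to 88 keys.
  const normalized = Math.min(Math.max((x + 20) / 40, 0), 1);
  return 21 + Math.round(normalized * 87);
}
```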

gogins commented 1 year ago

It was necessary to add one method, StrangeAttractor::iterate_without_rendering, before I could proceed. I also added convenience functions to get the state of the attractor in normalized form.

gogins commented 1 year ago

Added code to copy and paste controls state to and from the system clipboard.

gogins commented 1 year ago

Two bugs: