Closed Andersama closed 5 years ago
It's a great idea, but the implementation would take some thought.
First question would be what format the audio data would need to be in. Could be either a raw waveform, or a frequency graph. Most use cases would probably want a frequency graph, I think. Probably shouldn't rely on the shader to do an FFT on a waveform. Possibly something that can be toggled by the user?
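To make the waveform versus frequency graph choice concrete, here is a minimal sketch of doing the transform on the CPU so the shader only has to sample the result. The function name `magnitude_spectrum` is hypothetical, and the naive O(n^2) DFT with a Hann window is a stand-in chosen for readability; a real plugin would presumably use an FFT library and let the user pick the window.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OBS or the plugin): turns one block of
// mono samples into a magnitude spectrum with n / 2 bins.
// Uses a naive O(n^2) DFT for clarity; swap in a real FFT for production use.
std::vector<float> magnitude_spectrum(const std::vector<float> &samples)
{
    const std::size_t n = samples.size();
    assert(n >= 2); // need at least two samples for the window below

    const double pi = std::acos(-1.0);
    std::vector<float> bins(n / 2);

    for (std::size_t k = 0; k < bins.size(); ++k) {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            // Hann window to reduce spectral leakage at the block edges.
            const double w = 0.5 - 0.5 * std::cos(2.0 * pi * t / (n - 1));
            const double angle = 2.0 * pi * k * t / n;
            re += w * samples[t] * std::cos(angle);
            im -= w * samples[t] * std::sin(angle);
        }
        // Normalize by the block length so values do not grow with n.
        bins[k] = static_cast<float>(std::sqrt(re * re + im * im) / n);
    }
    return bins;
}
```

A user-facing toggle would then only decide whether the raw `samples` or the returned bins get uploaded.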
Second, how does the shader code see this data? HLSL obviously doesn't have any native support for audio data. It would probably want to be a texture1d, I guess? Might need to use something like HLSL annotations to tag a particular uniform texture1d as being intended to be populated with audio, if possible.
Third, which audio data should be used? OBS has a fairly complicated mixer pathway, where each audio source can be sent to 6 different channels, and each of the 6 channels can be selected to go to the stream, the output file, or neither. Somehow you'd have to decide which bits of that process should be captured and sent into the shader. I also haven't looked into how well OBS's API supports accessing all those audio buffers from a plugin, so there might need to be some API work in OBS to expose that.
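As a rough illustration of the routing question, the sketch below models each source's assignment to the 6 mixer channels as a bitmask and picks out the sources that feed one chosen channel. Everything here (`SourceRouting`, `feeds_track`, `sources_on_track`, the bitmask representation) is a hypothetical model for discussion, not OBS's actual API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Number of mixer channels described above.
constexpr int kTrackCount = 6;

// Hypothetical description of one audio source: bit i of track_mask is set
// when the source is routed to mixer channel i.
struct SourceRouting {
    std::string name;
    std::uint32_t track_mask;
};

// True if the source is routed to the given channel (0-based).
bool feeds_track(const SourceRouting &source, int track)
{
    return track >= 0 && track < kTrackCount &&
           (source.track_mask & (1u << track)) != 0;
}

// Names of all sources that would contribute audio to the given channel,
// i.e. the candidates for what the shader should receive.
std::vector<std::string> sources_on_track(const std::vector<SourceRouting> &sources,
                                          int track)
{
    std::vector<std::string> names;
    for (const SourceRouting &source : sources) {
        if (feeds_track(source, track)) {
            names.push_back(source.name);
        }
    }
    return names;
}
```

Under this model the filter would need a setting for which channel to listen to, or it could simply take the audio of the source it is attached to and skip the mixer entirely.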
Excellent idea, but rather complex to implement.
Ooh, I do see that you added annotation support to OBS a few months ago! That will definitely help with things like this once it makes it into an OBS release. obsproject/obs-studio@af6708691269066782388748c568802fb3704d52
@nleseul I've actually been hoping to talk to you. Yeah, I've rewritten your plugin and I've already started trying to make some neat features w/ annotations. Feel free to check it out.
(it can handle audio)
https://github.com/andersama/obs-studio/tree/shader-filter-cpp
Oh, and I guess to clarify, the annotation determines the audio's transformation: whether it's raw audio (the waveform) or an FFT, what window function to use, etc. The result is uploaded as a texture2d, where the x dimension represents the array of audio data and the y represents the channel.
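For anyone trying to picture that layout, here is a minimal sketch of packing per-channel data into a row-major buffer where x indexes the sample (or frequency bin) and y indexes the channel. `AudioTexture` and `pack_channels` are hypothetical names for illustration and are not taken from the branch linked above; the actual upload to the GPU is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side staging buffer for a single-float-per-texel texture.
struct AudioTexture {
    std::size_t width;         // samples (or bins) per channel
    std::size_t height;        // number of channels
    std::vector<float> texels; // row-major: texels[y * width + x]
};

// Copies each channel into its own row. All channels must have equal length.
AudioTexture pack_channels(const std::vector<std::vector<float>> &channels)
{
    assert(!channels.empty());
    const std::size_t width = channels.front().size();

    AudioTexture tex{width, channels.size(),
                     std::vector<float>(width * channels.size())};

    for (std::size_t y = 0; y < channels.size(); ++y) {
        assert(channels[y].size() == width);
        for (std::size_t x = 0; x < width; ++x) {
            tex.texels[y * width + x] = channels[y][x];
        }
    }
    return tex;
}
```

With this layout a shader would read channel `c` at position `i` from texel `(i, c)`, so limiting the channels used just means passing fewer rows.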
Though looking at this now I think I should probably give some control over the channels used, just to maybe save on some processing.
Clearly this is the part where variables are being updated, and it looks as if the structure is arbitrary. Thought it might be interesting...