wwmm / easyeffects

Limiter, compressor, convolver, equalizer and auto volume and many other plugins for PipeWire applications
GNU General Public License v3.0

[request] ability to delay the spectrogram/meters by ... frames / milliseconds #3312

Open magicgoose opened 3 months ago

magicgoose commented 3 months ago

It would help to be able to force the spectrogram/meters to be in sync with what's physically being played at the end of the pipeline, especially when using a Bluetooth output device.

If this can be saved per device (just like effects) then that's extra convenient.

Also (not 100% sure): some Bluetooth devices report their latency, so it might be good to use that as the starting value.

wwmm commented 3 months ago

It would help to be able to force the spectrogram/meters to be in sync with what's physically being played at the end of the pipeline

The spectrum is at the end of the pipeline. We process the buffers as soon as PipeWire sends them. The thing is that the speed at which those buffers come also depends on the target server latency and whatever else is affecting PipeWire's decisions. And to make things worse a client like EasyEffects does not have the necessary information to synchronize with what the players are doing. Maybe PipeWire has this information. But not us. So it is unlikely we can do something about it.

magicgoose commented 3 months ago

Thanks for the analysis, understandable. Still, a fully manual knob might possibly make sense? Something like passing the snapshots of the spectrum FFT info through a queue of N items, and the user just has to type N in.

wwmm commented 3 months ago

Thanks for the analysis, understandable. Still, a fully manual knob might possibly make sense? Something like passing the snapshots of the spectrum FFT info through a queue of N items, and the user just has to type N in.

Maybe an average over a time window. The correct thing to do would probably be to calculate the FFT once over a larger window. But this would have performance side effects.

violetmage commented 2 months ago

I think magicgoose is more concerned with the synchronization of when a sample of the spectrum is displayed and when the corresponding sound reaches their ears.

It seems to me that a simple "delay line" type of knob would be sufficient. A FIFO from the end of the pipeline into the spectrum with a user-configurable length, or something along those lines. This would probably require changes to the pipeline manager, as the audio path would need to branch at the end of the pipeline; one branch going into the delay line towards the spectrum, and the other directly out to the system audio device. Luckily PipeWire is very good at this sort of thing, so it should be possible at least.

wwmm commented 2 months ago

It seems to me that a simple "delay line" type of knob would be sufficient.

I think it is not so simple. But if it is, this can be tested now: we have a delay plugin that can be put in the pipeline, and the Stereo Tools plugin can also add delay.

But based on what I see on my computer I do not think that we need a delay. When listening to a song with easy-to-notice bass I can see that the spectrum jumps exactly when the bass kicks.

violetmage commented 2 months ago

No, that's exactly the point: it's a system-specific thing. Sometimes output devices have quite significant latency that we can't do anything about (Bluetooth, etc.).

The idea is to give the user a knob so that they can manually adjust the delay of just the spectrum (in a perfect world, all graphical audio feedback) with respect to the rest of the pipeline. This way, in the case where by default the spectrum might react to a kick, but the user doesn't hear it until many milliseconds later, the spectrum itself can be delayed that many milliseconds so the visual cue matches with the perceived sound.

i.e. the output device has a "transmission time" (inherent delay) of 800ms, so the sound is only heard by the user 800ms after we send it out the system device. The spectrum is not delayed by the same amount because it goes on the screen instead of through the speakers. So the audio the spectrum is reacting to is 800ms ahead of the audio the user is actually hearing irl. Thus, the user wants to delay the audio into the spectrum by 800ms, and not delay anything else (hence the branching path stuff).

from a pipeline perspective, it would be like this:

ee_sink -> effects_pipeline -> (...) +-> system_output
                                     |
                                     \-> delay_line -> spectrum + level_meter

where delay_line is configured by a "spectrum delay" or "output device latency" on a per-profile basis

edit: formatting

edit 2: obviously, this is kind of niche and only important for users doing live sound monitoring, but of course it's nice to have

wwmm commented 2 months ago

obviously, this is kind of niche and only important for users doing live sound monitoring, but of course it's nice to have

Not necessarily niche. But it is hard to know if this is solving the same problem the user is actually seeing. For the delay between EasyEffects' output and the actual playback to be big enough for users to feel the spectrum is not being animated at the right time, it would have to be so big that it would also be affecting other things. So I am not sure if a delay is really the solution.

In the case of delays like the ones from Bluetooth devices PipeWire is supposed to do the compensation. And a recent enough PipeWire will also compensate these kinds of delays when EasyEffects is in the middle. So if a delay is the problem it would have to be one that even PipeWire can do nothing about. And that would really be niche.

violetmage commented 2 months ago

I always thought that the synchronization was only automatic for the audio side of things, and that if you're trying to synchronize other stuff with the audio (like a video stream via wayland), an app needs to look at the latency of the node as reported by pipewire and compensate whatever else it's doing internally. Unless the plugins are somehow able to do this transparently from within the wrapper, I didn't think easyeffects supported automatic latency compensation.

i.e. a video player application would require writing actual code to read the latency value and sync the video to the audio (unless the video is going through pipewire somehow), whereas an audio player playing through multiple speakers would not have to do anything special in order for the sound to come out of the speakers at the same time.

I might have this all backwards, but I'm pretty sure this is right, given that mpv seems to implement this explicitly.

wwmm commented 2 months ago

I didn't think easyeffects supported automatic latency compensation.

PipeWire does it for us. Imagine the video playback compensation you explained above: until a few months ago, having EasyEffects in the middle would break that compensation because PipeWire was not forwarding the latency offset through the null sink's monitor ports, the ones we record audio from. But this was fixed. Synchronization of audio and video works fine with EasyEffects in the middle now.

In EasyEffects we update the spectrum as soon as we receive audio buffers. And if I understood everything right these audio buffers should be passing through EasyEffects with their time axis already compensated. So I am not sure if adding delay is the solution to this issue. It may seem to the user that delay is the answer but maybe it is just hard to describe in words what is happening there.

Now I remember that the latency offset can be easily set in Pavucontrol. So that is another way to test if this is really the solution. And if it is, setting it in Pavucontrol is probably better.

wwmm commented 2 months ago

Unless the plugins are somehow able to do this transparently from within the wrapper,

Each plugin port receives the latency offset PipeWire uses for compensation. We just have to add the extra plugin latency and forward the total value. We have already implemented this: https://github.com/wwmm/easyeffects/blob/620fa1436a0c7eeda346e16236a2c8a60da5bcb0/src/plugin_base.cpp#L155
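
Conceptually it is just adding our own latency, converted to the same unit, to the offset received on the input port. A hedged sketch of that idea only (illustrative names, not the code from plugin_base.cpp, which goes through PipeWire's latency params):

```cpp
// Sketch only; names are illustrative, not the actual implementation.
#include <cstdint>

uint64_t forwarded_latency_ns(uint64_t upstream_offset_ns,      // offset received on the input port
                              uint64_t plugin_latency_samples,  // this plugin's own processing latency
                              uint64_t sampling_rate) {
  // convert the plugin latency from samples to nanoseconds and add it to the upstream offset
  return upstream_offset_ns + (plugin_latency_samples * uint64_t{1'000'000'000}) / sampling_rate;
}
```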

violetmage commented 2 months ago

Yes, but the part you've forgotten is that while the source player may be synchronized between its audio and video, since we passed along the latency, and maybe even our plugins, the spectrum is not. The spectrum needs to implement the syncing because it's taking in audio and displaying video; two different, unsynchronized mediums (unless pipewire is handling the video, but it's not in this case; wayland is).

As you've said, the spectrum is processed the moment it's possible. It's not possible for the video synchronization to be done by pipewire, and even if it was, that would mean the entire UI of whatever it tried to synchronize would be delayed by whatever latency the output device reports, since pipewire couldn't know which parts are based on the audio and which are just interactive. The problem is that the output audio is inherently delayed, at a hardware level: pipewire would have to be able to time-travel and apply a negative delay in order for it to sync up the audio with the video.

I'm realizing that this is actually a per-plugin thing (for the purposes of easyeffects, the spectrum is basically a plugin), as in each plugin would need to implement delaying its UI from the audio based on the reported latency individually, within the plugin. I could imagine the plugins that we wrap implementing this, but the spectrum and level meter don't.

wwmm commented 2 months ago

The problem is that the output audio is inherently delayed, at a hardware level: pipewire would have to be able to time-travel and apply a negative delay in order for it to sync up the audio with the video.

PipeWire does not have to time travel. It reports the latency offset of the audio stream to the video player, and the video player delays the video stream accordingly. Usually the delays that are easy to notice by ear are the ones related to the buffer configuration done by the server and the driver. Those PipeWire knows about. What would be outside its reach would be a delay coming from the hardware electronics. If that one were as big as you are thinking, properly watching video without out-of-sync audio would be impossible. But this isn't what happens, even over Bluetooth.

Yes, but the part you've forgotten is that while the source player may be synchronized between its audio and video, since we passed along the latency, and maybe even our plugins, the spectrum is not.

I still think you are immensely overestimating the time difference between the FFT being calculated (buffers being received) and the spectrum animation being displayed on the screen. I still doubt this is what the user is actually seeing. But in any case let's consider the possibilities: either the user is seeing the spectrum movement before the sound is heard, or the sound is heard before the spectrum is displayed on the screen.

If the user hears the sound before seeing the spectrum, the most likely explanation is the computer being too slow to compute the FFT and display the spectrum on the screen. Adding delay to the audio in EasyEffects or setting a latency offset in Pavucontrol may solve this. But it is not reliable, because the reason for the mismatch depends on how busy the CPU is. And I do not think it should be done in EasyEffects.

If the user sees the spectrum movement before the audio I can think of two reasons. Either the audio hardware/driver already has some kind of audio delay that is not being compensated. Honestly I do not think that is the case. Or PipeWire is sending buffers at a pace that does not necessarily match the audio stream's playback time axis. This seems more likely. Adding delay to the audio stream does not solve either of these cases.

The only solution I see would be to delay the spectrum drawing. How viable implementing this would be I do not know...

violetmage commented 2 months ago

Yes, I am talking about the second situation you described. All that would be necessary is adding either a delay to the drawing of the spectrum, or simply a delay to the audio coming into the spectrum, branching off of the main pipeline instead of going through the spectrum.

I think another way to explain it is to think of the spectrum as another video player that needs to be synced with the source video player and the audio as heard by the user. The source player will delay its video by the reported latency to match the audio, but the spectrum currently doesn't, so it will look visually like it's 'ahead' of the source player.

The point is that there is hardware out there that has truly terrible latency built in, and there's nothing we can do about that. Hence, compensating delay.

Also, I remembered that colloquially this is referred to as A/V Sync in most places, so perhaps that will aid in googling stuff about this.

magicgoose commented 2 months ago

Yes, @violetmage has understood exactly what I tried to ask! (Sorry guys, English is not my first or best language.) :smile_cat:

wwmm commented 2 months ago

I think another way to explain it is to think of the spectrum as another video player that needs to be synced with the source video player and the audio as heard by the user. The source player will delay its video by the reported latency to match the audio, but the spectrum currently doesn't, so it will look visually like it's 'ahead' of the source player.

I understood what you are trying to say from the beginning. That is not what I am questioning. I still do not see evidence that the time difference between the spectrum drawing and the audio being played is actually as big as it is being suggested.

Depending on the PipeWire latency and sampling rate, the number of buffers per second we receive may be different, which will have some effect on the spectrum animation speed. This doesn't necessarily mean that there is a difference between audio and spectrum events on the time axis.

In short what I am saying is that we may add a delay just to find out that it did not do what was expected.

It is a little off topic, but the spectrum drawing and a video stream aren't in completely equivalent situations. I think this is causing some confusion here. A stream of video frames has timestamps for the presentation of each frame. A stream of audio frames has its own sequence of timestamps for the presentation of each frame. The video player has to synchronize both streams.

In our case the spectrum isn't an independent stream of video frames. The spectrum frames depend on the arrival of audio frames to be created. It is still possible to have a lack of synchronization due to some of the factors I talked about before. But as there is an unavoidable correlation with the audio frames, the spectrum frames aren't as free to drift away from the audio as video frames are in a video player. That is why I think this delay is being overestimated and the real problem may be something else.

All that would be necessary is adding either a delay to the drawing of the spectrum, or simply a delay to the audio coming into the spectrum, branching off of the main pipeline instead of going through the spectrum.

Branching the pipeline will require too much complication. Delaying the drawing is a more reasonable approach.

wwmm commented 2 months ago

The question now is the best way to add this delay. The current implementation relies on https://github.com/wwmm/easyeffects/blob/620fa1436a0c7eeda346e16236a2c8a60da5bcb0/src/effects_box.cpp#L450C3-L450C31 for the drawing scheduling. It does not seem gtk_widget_add_tick_callback is going to play nice with a delay control. Maybe inserting silence in the spectrum buffer before the FFT computation will be easier...

violetmage commented 2 months ago

Couldn't there be a FIFO queue added in somewhere around this bit instead?

https://github.com/wwmm/easyeffects/blob/master/src/effects_box.cpp#L344

Edit: you could alternatively make the queue out of ui::chart instead of spectrum_mag, but I'm assuming spectrum_mag uses less memory

violetmage commented 2 months ago

Also, since I still haven't managed to fully understand pipe_manager, I would be delighted to know why branching in the pipeline would be difficult.

wwmm commented 2 months ago

Also, since I still haven't managed to fully understand pipe_manager, I would be delighted to know why branching in the pipeline would be difficult.

It is one more layer of complication in the code that handles links. And this code is already complicated. On top of that, making a branch only makes sense if we create another filter responsible for the delay. Nothing impossible. But it is more code to maintain, and as the spectrum plugin works in passthrough by default anyway, the delay could be implemented inside the spectrum plugin, for example. No need for branches or additional plugins.

Couldn't there be a FIFO queue added in somewhere around this bit instead?

https://github.com/wwmm/easyeffects/blob/master/src/effects_box.cpp#L344

This code is called when GTK decides to invoke the tick callback. It does not look like a queue there will work very well. But besides the location there is something else to consider, which is how the delay counting is going to be done if this is implemented through a queue. We do not want to delay every buffer or every callback call. It has to be a delay that shifts the whole "spectrum stream" by the same amount of time. And once the pipeline is paused or stopped, something will probably have to be done in the queue managing the delay.

That is why I've been thinking about changing the spectrum plugin and introducing the delay there through the proper amount of silence. In this approach we would be modifying the audio content that the spectrum uses to calculate the FFT. There would be no need to fight GTK about the widget update frequency.

violetmage commented 2 months ago

So basically it is the branching path idea, except implemented within the spectrum code itself, using a FIFO queue on raw audio frames.

I don't think you need to insert silence, just make the rendering code wait until the queue's size reaches some adjustable threshold before it renders something. If the threshold is freely adjustable it probably wouldn't even be that hard to gracefully handle dynamic latency switches.

wwmm commented 2 months ago

using a FIFO queue on raw audio frames.

No FIFO queue. Just adding silence, like delay plugins do, should be enough.

I don't think you need to insert silence, just make the rendering code wait until the queue's size reaches some adjustable threshold before it renders something.

I do not think that using the queue size as the control parameter will be good. The control has to be the amount of desired delay. Based on that, the queue would grow accordingly. One problem is how big this queue can potentially be... And another is how the counting of time will be done.

If the user starts to play something and pauses right after, it does not make sense to keep the delay timer going. So an ordinary time interval count based on when "the first array came" is not the solution here. Just using the number of arrays in the queue is also bad, because a queue on the rendering side is a queue of FFT arrays, and the amount of these arrays will be influenced by the size and number of audio buffers, which change with latency and sampling rate. So the same number of FFT arrays could be related to different time intervals.
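
Just to illustrate with made-up numbers why the same queue length maps to different delays:

```cpp
// Rough illustration: the number of audio buffers (and therefore FFT arrays)
// that cover a given delay depends on the quantum and the sampling rate,
// so a fixed queue length is not a fixed amount of time.
#include <cmath>
#include <cstddef>

std::size_t buffers_for_delay(double delay_seconds, std::size_t quantum, double rate) {
  const double seconds_per_buffer = static_cast<double>(quantum) / rate;
  return static_cast<std::size_t>(std::ceil(delay_seconds / seconds_per_buffer));
}

// 0.8 s at 48 kHz: quantum 1024 -> 38 arrays, quantum 256 -> 150 arrays.
```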

wwmm commented 2 months ago

No FIFO queue. Just adding silence, like delay plugins do, should be enough.

By this I meant not accumulating buffers in a list like it would have to happen on the rendering side. Some kind of ring buffer is unavoidable. But the last patches to the spectrum have already done something in this direction.

violetmage commented 2 months ago

Oh, I understand now. You mean that every time playback begins, the spectrum would feed itself n milliseconds of silence, then start processing the audio. That makes sense.

The part I'm not sure about is how you would be able to insert the silent period while still always forwarding audio from the input port to the output port immediately.

Where would the delayed "extra" audio be held in waiting? By pipewire, even though this all is happening within the spectrum code?

Edit: you answered the main part of my question, but I'm still a bit confused about how the delay would be managed.

wwmm commented 2 months ago

The part I'm not sure about is how you would be able to insert the silent period while still always forwarding audio from the input port to the output port immediately.

I am not sure how well it will work for arbitrarily large delays, but for the sake of argument imagine an array of floats with 1024 elements, and that we do a translation to the right by, let's say, 128 elements. So sample zero ends up at index 128. The first 128 samples would become silence. And the last 128 samples of the array, which are now shifted out of the array, would be saved and become the first 128 samples the next time the process callback is called.

The amount of time delay would be something like number of samples translated / sampling_rate. Large delays would require a large array to save the data that was moved out of the source array. And making the delay controllable through the amount of shifted samples will probably be a little annoying. But at least inside the plugin we are working directly with the time axis of the audio stream.
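
A minimal sketch of that shift-plus-carry idea, assuming the delay is at most one block long (this is not the spectrum plugin code, just the mechanics described above):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

class SampleDelay {
 public:
  explicit SampleDelay(std::size_t delay_samples) : carry(delay_samples, 0.0F) {}

  // Shift the block to the right by carry.size() samples. The first samples are
  // replaced by the carry (initially silence), and the samples pushed out of the
  // end are saved to become the head of the next block.
  void process(std::vector<float>& block) {
    const auto d = static_cast<std::ptrdiff_t>(carry.size());

    std::vector<float> pushed_out(block.end() - d, block.end());

    std::move_backward(block.begin(), block.end() - d, block.end());
    std::copy(carry.begin(), carry.end(), block.begin());

    carry = std::move(pushed_out);
  }

 private:
  std::vector<float> carry;  // delay in seconds = carry.size() / sampling_rate
};
```

With a 1024-sample block and a 128-sample carry this reproduces the example above: the first call outputs 128 samples of silence followed by the first 896 input samples.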

The current spectrum code does something similar for other reasons. So I am not sure yet if mixing both things will end well... A queue on the rendering side would have the advantage of not requiring changes to the spectrum plugin. But I still could not visualize how the delay time window would be managed in this case.

wwmm commented 2 months ago

Oh, I understand now. You mean that every time playback begins, the spectrum would feed itself n milliseconds of silence, then start processing the audio. That makes sense.

Avoiding feeding more delay every time a playback starts is doable. The question is how to know the delay is what the user wants, or that it is at least close. I have to meditate a little more on this...

violetmage commented 2 months ago

Isn't the latency value reported by pipewire for easyeffects' target output just that? It seems like it would be trivial to test if the value is reasonable, and if not, ignore it.

Now that you mention it, wouldn't around here be a good spot to just add a buffer of audio? It's already mono, and the passthrough is already done at that point. It seems to me that putting a buffer there would also allow the synchronizing code to remain untouched, if I understand correctly.

In fact, I think there could be a global (ideally per-profile) "A/V sync" setting in easyeffects, where setting a positive value just adds to the spectrum's delay, and a negative value instead adds a delay filter to the very end of the pipeline, even after the spectrum and level meter. There would be an "auto" mode for this setting, which just sets it to whatever pipewire reports as the latency value for the device easyeffects is set to output to.

Thinking about it, I have a feeling that casual users will be likely to understand an "A/V sync" style option in easyeffects, since that's a common setting in home theater stuff.
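
Just to make the proposal concrete, a rough sketch of the dispatch I have in mind (all of these names are hypothetical, nothing like this exists in easyeffects today):

```cpp
// Hedged sketch of the proposed behavior; every name here is hypothetical.
struct AvSyncSettings {
  bool auto_mode = true;         // "auto": follow the latency reported by pipewire
  double manual_av_sync_ms = 0;  // user-entered value when not in auto mode
};

void apply_av_sync(const AvSyncSettings& s, double reported_device_latency_ms,
                   double& spectrum_delay_ms, double& end_of_pipeline_delay_ms) {
  const double av_sync_ms = s.auto_mode ? reported_device_latency_ms : s.manual_av_sync_ms;

  if (av_sync_ms >= 0) {
    spectrum_delay_ms = av_sync_ms;  // positive: delay only the spectrum/meters
    end_of_pipeline_delay_ms = 0;
  } else {
    spectrum_delay_ms = 0;
    end_of_pipeline_delay_ms = -av_sync_ms;  // negative: delay the audio after the spectrum
  }
}
```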

violetmage commented 2 months ago

TL;DR change the issue title to "add a/v sync setting to easyeffects"

wwmm commented 2 months ago

Isn't the latency value reported by pipewire for easyeffects' target output just that? It seems like it would be trivial to test if the value is reasonable, and if not, ignore it.

It crossed my mind to try to get this value instead of adding a configuration setting. But I am not sure if there is a value there if the user does not set a custom offset in Pavucontrol or similar software. In any case this is a different problem. What I was trying to say before is that in the case of a rendering-side solution I am not sure how to make sure the amount of delay applied is the one desired. The latency information that comes from PipeWire would be the correction that the user wants, not the one we actually applied. Measuring how much delay we actually added on the rendering side and controlling its size is still an open question in my mind, assuming the problem is solved this way.

Now that you mention it, wouldn't around here be a good spot to just add a buffer of audio? It's already mono, and the passthrough is already done at that point. It seems to me that putting a buffer there would also allow the synchronizing code to remain untouched, if I understand correctly.

Maybe. That is one of the parts where I have to be careful to not mess what is being done there.

TL;DR change the issue title to "add a/v sync setting to easyeffects"

I think that the current title is fine. Audio and video synchronization works fine for video players. This discussion is really only about the spectrum, and about the setups where theoretically there can be a mismatch. Unfortunately I do not have hardware that manifests it. My Bluetooth headphones do not have any noticeable delay when compared to the spectrum. Same thing on my onboard card.

In fact, I think there could be a global (ideally per-profile) "A/V sync" setting in easyeffects, where setting a positive value just adds to the spectrum's delay, and a negative value instead adds a delay filter to the very end of the pipeline, even after the spectrum and level meter. There would be an "auto" mode for this setting, which just sets it to whatever pipewire reports as the latency value for the device easyeffects is set to output to.

Why? There is no need for a delay for video players. And adding delay to the audio stream would make the supposed mismatch with the spectrum drawing worse.

violetmage commented 2 months ago

Pipewire reports a non-zero latency value for any devices whose driver reports a latency, estimated or calculated. I'm not sure if pipewire distinguishes between devices that don't report latency and devices that report 0 latency. In either case, it shouldn't be too complex to handle; just pretend the latency is 0 for devices that don't report anything, otherwise try to use the reported value.

In terms of where to add a buffer, that's the best I could come up with so far.

The only reason I mention the "A/V sync" term is that it seems to be in use in other places with some consistency, so I figured perhaps since that's what this seems similar to, it would make more sense to call it that. Even if at the moment the only thing we use the value for is the spectrum offset, it would mean easyeffects could expand on the functionality without confusing anyone.

About "A/V sync", since easyeffects sits in between sources and the output, if the user switches it out of "auto", for example if they know that their device reports an incorrect value, they could manually enter the real value there, and we could pass along the manual value instead of auto detected value to the source player. This is advantageous for the user because it allows adjusting the real a/v sync without even messing with the pipewire configuration. Additionally, in the future it would be possible to enable the use of negative offsets for cases where the video is delayed behind the audio (instead of in this case, where the audio is delayed behind the video), by adding the delay at the very end I mentioned. The negative offset case would need to be handled in easyeffects like this because AFAIK pipewire doesn't have any concept of a/v sync or negative latency values.

Like you've noted, the delay is generally not needed, but for people running media PCs on their home theater systems, where A/V receivers and amplifiers and switchers and overly slow TV display scalers are all piled together, the ability to easily and arbitrarily add positive or negative a/v sync becomes very useful. Of course, it's not a very important use case, but this feature is purely aesthetic anyway, so it bears mentioning.

violetmage commented 2 months ago

The amount of time delay would be something like number of samples translated / sampling_rate. Large delays would require a large array to save the data that was moved out of the source array. And making the delay controllable through the amount of shifted samples will probably be a little annoying. But at least inside the plugin we are working directly with the time axis of the audio stream.

Intuitively, this sounds like the right approach to me, whatever that's worth.

wwmm commented 2 months ago

Like you've noted, the delay is generally not needed, but for people running media PCs on their home theater systems, where A/V receivers and amplifiers and switchers and overly slow TV display scalers are all piled together, the ability to easily and arbitrarily add positive or negative a/v sync becomes very useful. Of course, it's not a very important use case, but this feature is purely aesthetic anyway, so it bears mentioning.

Usually in this case the user is concerned about the lack of synchronization in video players. And this one can already be corrected through a latency offset in Pavucontrol.

About "A/V sync", since easyeffects sits in between sources and the output, if the user switches it out of "auto", for example if they know that their device reports an incorrect value, they could manually enter the real value there, and we could pass along the manual value instead of auto detected value to the source player. This is advantageous for the user because it allows adjusting the real a/v sync without even messing with the pipewire configuration. Additionally, in the future it would be possible to enable the use of negative offsets for cases where the video is delayed behind the audio (instead of in this case, where the audio is delayed behind the video), by adding the delay at the very end I mentioned. The negative offset case would need to be handled in easyeffects like this because AFAIK pipewire doesn't support negative latency values.

I think different things are being mixed. It is one thing to do the correction for video players, and another for the spectrum. Correction for video players has been possible for years; there is no need to put this in EasyEffects. The correction for the spectrum is the one that may be needed in some cases, and there is no way to do it now.

instead of in this case, where the audio is delayed behind the video

In our case it is not possible to have the audio behind the spectrum because the spectrum needs the audio to be produced. And in the case of the video players you can correct this one too. I tested this today.

violetmage commented 2 months ago

The reason I was interested in the idea of a generalized "A/V sync" is because having it in easyeffects might make the option more accessible to casual users, simple as that.

Overall you're right, I just wanted to consider the possibility.

About the implementation, would something along the lines of having a std::array<std::array<float, n_bands>, delay_length> spectrum_sync_mono_buffers type of thing be heading in the right direction?

Also, what's the reason that all of the buffers are of size n_bands, if you don't mind my curiosity?

edit: grammar

violetmage commented 2 months ago

Found the pipewire page where I read about the latency on devices:

https://docs.pipewire.org/page_man_pipewire-devices_7.html

see "ALSA properties" - "node properties" - latency.internal

I don't know how the latency values reported there interact with the node.latency value, or even if device nodes have a node.latency, but I suppose it shouldn't be too difficult to test if someone has a device that reports a non-zero systemic latency (that's what the pipewire docs call it).

I should be able to put more time into this later in the week and check if anything I have reports a latency.

wwmm commented 2 months ago

About the implementation, would something along the lines of having a std::array<std::array<float, n_bands>, delay_length> spectrum_sync_mono_buffers type of thing be heading in the right direction?

If we implement this through a queue on the rendering side it is inevitable that we change the signal to carry extra information about the duration. The question is whether it should be just the duration of each array or already the whole delay duration.

I don't know how the latency values reported there interact with the node.latency value, or even if device nodes have a node.latency

This one is probably just the latency the node is requesting and nothing more.

I did some tests earlier today with the latency offset control in Pavucontrol. My conclusion is that PipeWire only uses it to pass as information to video players. I increased the offset to 1 second expecting to see some kind of mismatch between the audio and the spectrum but only video players were affected. My spectrum stayed synchronized with audio.

violetmage commented 2 months ago

I'm a bit confused, it looked to me like the code in spectrum.cpp operated on chunks of samples, like std::array<float, n_bands> sample. Shouldn't it be enough to add a buffer in between where the mono samples are calculated, and when the mono samples are sent into the fft?

About the test you did; yes, that is the opposite behavior of what we want.

We want the spectrum to be influenced by the latency value (like the one in pavucontrol), because the latency offset is expressly for ensuring that video (which the spectrum is, in effect, generating) and audio (which the spectrum is reacting to) reach the user in sync at the physical level.

i.e. if you played one of those youtube videos where they have a spectrum to some music playing, the expected behavior by the user is for the easyeffects spectrum to be visually in sync with the spectrum in the video (plus maybe the effects pipeline latency), even if there is a latency value set in pavucontrol or whatever.

edit: forgot to answer the first part

wwmm commented 2 months ago

I'm a bit confused, it looked to me like the code in spectrum.cpp operated on chunks of samples, like std::array<float, n_bands> sample. Shouldn't it be enough to add a buffer in between where the mono samples are calculated, and when the mono samples are sent into the fft?

I am not sure yet. This code became what it is now to isolate the work done by the realtime thread from the work done by the main thread. The main thread calls Spectrum::compute_magnitudes() depending on the gtk_widget_add_tick_callback calls. A queue of buffers in the realtime thread probably won't do what we want. Maybe if it is inside Spectrum::compute_magnitudes(), before copying the values to real_input.

As things are right now the frequency at which gtk_widget_add_tick_callback is called will interfere anyway. So maybe trying to do an initial implementation inside Spectrum::compute_magnitudes() isn't so bad.
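
Something along these lines, purely as a proof-of-concept sketch (the names are illustrative, not the real Spectrum members); counting audio samples rather than the number of FFT arrays sidesteps the quantum/sampling rate issue mentioned earlier:

```cpp
// Proof-of-concept sketch only; names are illustrative.
#include <cstddef>
#include <deque>
#include <vector>

class SpectrumDelayQueue {
 public:
  SpectrumDelayQueue(double delay_seconds, double rate)
      : delay_samples(static_cast<std::size_t>(delay_seconds * rate)) {}

  // Push the newest mono block; returns true and fills `out` only when the
  // oldest block has at least delay_samples of newer audio queued behind it.
  bool pop_delayed(std::vector<float> newest, std::vector<float>& out) {
    queued_samples += newest.size();
    blocks.push_back(std::move(newest));

    if (queued_samples - blocks.front().size() < delay_samples) {
      return false;  // keep accumulating, nothing to draw yet
    }

    out = std::move(blocks.front());
    blocks.pop_front();
    queued_samples -= out.size();

    return true;
  }

  void reset() {  // e.g. when the pipeline is paused or stopped
    blocks.clear();
    queued_samples = 0;
  }

 private:
  std::size_t delay_samples = 0;
  std::deque<std::vector<float>> blocks;
  std::size_t queued_samples = 0;
};
```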

violetmage commented 2 months ago

I figured doing the buffering in the realtime thread would allow guaranteeing that the delay is in units of time, instead of just a number of samples.

Unless I'm misunderstanding, the frequency at which Spectrum::process(...) gets called is determined by pipewire, but it is much more reliable, seeing as it's realtime and pipewire will happily report the metadata about the frequency and quantum.

That's why I thought the realtime thread was where it should happen. Unless there is a performance concern, it's probably worth trying. I haven't come up with anything else yet.

wwmm commented 2 months ago

Unless there is a performance concern, it's probably worth trying. I haven't come up with anything else yet.

Some users were having too many xrun errors because of too much work in the spectrum's realtime calculations. That is why that code is complicated now.

violetmage commented 2 months ago

What if the spectrum realtime code were just a ring buffer for the input audio, and the GUI thread just read the samples pointed to by the realtime thread? Then all the processing, maybe even the mono combining, would happen in the GUI thread.

wwmm commented 2 months ago

What if the spectrum realtime code were just a ring buffer for the input audio, and the GUI thread just read the samples pointed to by the realtime thread? Then all the processing, maybe even the mono combining, would happen in the GUI thread.

It is more or less this way. It was what a collaborator did in the last patches to that file. The mono production doesn't need to get out of there. The FFT and some of the memory allocation were the main problem.

In any case it will probably be easier to leave the ring buffer that is there as it is and implement some proof of concept inside a function like Spectrum::compute_magnitudes() or similar that runs in the main thread. We still have to actually prove that this is going to have a noticeable effect on this issue. So I would not try to get something super robust from the beginning. Especially if it has to touch complicated code.

violetmage commented 2 months ago

Would it be possible to resize db_buffers at runtime? If it is, then naively it looks like the only change required would be setting the size of db_buffers to however many samples make up the requested delay time.

Obviously, resizing db_buffers would be relatively expensive, but it would only need to happen when the latency of the target output node changed.

edit: perhaps there is some merit to using / examining the dynamic delay code in lsp

violetmage commented 2 months ago

Also I found this: https://stackoverflow.com/questions/1120422/is-there-an-fft-that-uses-a-logarithmic-division-of-frequency

Naively it sounds to me as if this would allow using fewer FFT samples, due to the non-linear spread of "FFT bins". This works (at least in my head) because we don't care about the magnitude at 6000 Hz and 6001 Hz individually (to the human ear they're the same), but we really want to know the magnitudes at 10 Hz and 11 Hz (to the human ear they're very different).

edit: more googling reveals that this is apparently an established method and is called a "constant-q transform" or "constant-q Fourier transform"

edit 2: https://github.com/jmerkt/rt-cqt
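
The gist, as far as I understand it, is just that the bin center frequencies are spaced geometrically instead of linearly. A tiny sketch with example parameters (not taken from the linked libraries):

```cpp
// Sketch of the logarithmic bin spacing a constant-Q transform uses.
#include <cmath>
#include <vector>

std::vector<double> cqt_center_frequencies(double f_min, double f_max, int bins_per_octave) {
  std::vector<double> freqs;
  for (int k = 0;; ++k) {
    const double f = f_min * std::pow(2.0, static_cast<double>(k) / bins_per_octave);
    if (f > f_max) {
      break;
    }
    // With e.g. 12 bins per octave, 10 Hz and 11 Hz land in different bins,
    // while 6000 Hz and 6001 Hz share one.
    freqs.push_back(f);
  }
  return freqs;
}
```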

wwmm commented 2 months ago

Would it be possible to resize db_buffers at runtime? If it is, then naively it looks like the only change required would be setting the size of db_buffers to however many samples make up the requested delay time.

I think that many of the steps involved in the FFT calculation would also need changes, which is not ideal. Initially I was thinking about changing how latest_samples_mono is being set and not touching the FFT code. But this may also not be so straightforward.

Also I found this: https://stackoverflow.com/questions/1120422/is-there-an-fft-that-uses-a-logarithmic-division-of-frequency

Naively it sounds to me as if this would allow using fewer FFT samples, due to the non-linear spread of "FFT bins". This works (at least in my head) because we don't care about the magnitude at 6000 Hz and 6001 Hz individually (to the human ear they're the same), but we really want to know the magnitudes at 10 Hz and 11 Hz (to the human ear they're very different).

A few years ago I saw that post. For some reason I gave up on the idea; I do not remember why now. Maybe because the non-uniform frequency axis would cause problems somewhere else. In any case I would avoid messing with the FFT at this moment.

violetmage commented 2 months ago

Perhaps the easiest method could be to re-use the code in delay.cpp and instantiate another instance of the delay lv2 plugin within the spectrum, without the wrapper or going over pipewire. That way, at least the code should be simple enough, and changing the delay time would be trivial.

What do you think about that approach?

wwmm commented 2 months ago

Perhaps the easiest method could be to re-use the code in delay.cpp and instantiate another instance of the delay lv2 plugin within the spectrum, without the wrapper or going over pipewire. That way, at least the code should be simple enough, and changing the delay time would be trivial.

What do you think about that approach?

Although it isn't the most efficient approach, instantiating the delay plugin inside the spectrum will definitely be the easiest. It is better to follow this path for the initial implementation.

wwmm commented 2 months ago

It is just a matter of creating two arrays inside the spectrum that will store the output of the delay plugin and then act as the source for the code that creates the mono array.
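
Roughly something like this (a sketch only; delayed_left and delayed_right are hypothetical names for the arrays holding the delay plugin's output, latest_samples_mono is the existing mono array):

```cpp
// Sketch only: delayed_left/delayed_right are hypothetical members holding the
// delay plugin's output; latest_samples_mono is the array the FFT code already uses.
#include <array>
#include <cstddef>

constexpr std::size_t n = 1024;  // assumed fixed block size for the sketch

std::array<float, n> delayed_left{};
std::array<float, n> delayed_right{};
std::array<float, n> latest_samples_mono{};

void update_mono_from_delayed_output() {
  // the mono array that feeds the FFT is now built from the delayed signal
  for (std::size_t i = 0; i < n; ++i) {
    latest_samples_mono[i] = 0.5F * (delayed_left[i] + delayed_right[i]);
  }
}
```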

violetmage commented 2 months ago

Sounds good. I'm a bit busy today and tomorrow, but I should have time at the end of the week to try my hand at implementing something.

wwmm commented 2 months ago

Sounds good. I'm a bit busy today and tomorrow, but I should have time at the end of the week to try my hand at implementing something.

Ok. In the next few days I will have to grade my students' exams, so I do not know when I will have time to work on this.