Closed laenzlinger closed 1 year ago
Why does the plugin produce click noises though? This library is only analyzing samples, it is not modifying them.
Getting the short term loudness every 100ms shouldn't be a problem. The GStreamer audioloudnorm plugin is doing the same (it's processing in 100ms chunks).
Thank you @sdroege for your reply and sorry for not quoting the source code of the plugin:
I did a test and removed just getting the loudness values (replaced that line with a static value). This solved the noise problem
I should also add that I tested the same plugin with 2 different lv2 hosts:
Interestingly, the click noises can only be heard on Elk Audio OS.
I will check out the audioloudnorm plugin. Hopefully I can learn from there how to properly implement real-time audio plugins in Rust (both are pretty new to me, I apologize if I am asking stupid questions here).
My guess would be that by the additional processing, in elk you're becoming too slow and get behind its latency deadline. You could measure how long that one line takes (std::time::Instant for example) and check if you can reproduce the same effect if you sleep there for that long.
I just followed the first step and measured the time that it takes to call ebu.loudness_shortterm(): 13623ns
Interesting: I am running with a Sample Rate of 48000 Hz and a Frame Size of 64, which results in 13333ns. I guess that's already a first indicator that the calculation takes too long?
When I run the same plugin with jalv on a non-realtime system, the calculation takes 1460820ns. This would be 1.46s, which sounds wrong to me. I guess my measurements are wrong, or I did something wrong in the setup of the library?
Yeah, if processing each frame takes longer than the duration of the frame then you can't play it back in real time.
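A minimal sketch of that real-time budget check, for reference. The function names `buffer_deadline` and `fits_realtime` are made up for illustration and are not part of the ebur128 crate or of the plugin. The deadline for one buffer is the frame count divided by the sample rate; for 64 frames at 48000 Hz this works out to roughly 1.33 ms.

```rust
use std::time::Duration;

/// Time available to process one buffer before the host needs the next one.
/// `frames` and `sample_rate` are illustrative parameter names.
fn buffer_deadline(frames: u64, sample_rate: u64) -> Duration {
    // frames / sample_rate seconds, computed in integer nanoseconds
    Duration::from_nanos(frames * 1_000_000_000 / sample_rate)
}

/// Returns true if a measured processing time fits inside the buffer deadline.
fn fits_realtime(elapsed: Duration, frames: u64, sample_rate: u64) -> bool {
    elapsed < buffer_deadline(frames, sample_rate)
}
```

A value obtained from `Instant::now()` and `start.elapsed()` can be passed directly as `elapsed`.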
This is how I measured the time:
let start = Instant::now();
let short_term = self.ebu.loudness_shortterm().unwrap();
let duration = start.elapsed();
ports.short_term.set(duration.as_nanos() as f32);
Then I can read the value with jalv (monitor command).
Yes that's correct. You probably want to do this less often :)
I am not sure if I understood correctly. What do you mean by doing it less often? Wouldn't the processing take longer if loudness_shortterm() is called less often (because there is more data to be processed)?
I was hoping to find a way to run the processing more often, so that the required processing power could be evenly distributed. So whenever new samples are added, some part of the processing could be done. And then finally, once the result is required (at the frequency recommended by the spec), not much processing power is needed anymore?
Wouldn't the processing take longer if loudness_shortterm() is called less often (because there is more data to be processed)
No, that would take approximately the same time independent of how much data was processed so far. What takes time proportional to the number of samples is add_samples().
@sdroege thanks for the info. So if I understand correctly, I should also check how much time add_samples() takes.
I have reduced now the calculation to only 'Momentary' https://github.com/pedalboard/loudness-meter.lv2/blob/main/src/lib.rs#L48 (removed I and S Mode)
I update the meter every 100ms. This seems to be possible within the real-time task.
I will do more measurements and report here.
I am running the code on a Raspberry Pi CM4 (Compute Module). I wondered, if this algorithm is really so complex, that it takes so much time? Do you have any ideas for how I could optimise my code?
So if I understand correctly, I should also check how much time add_samples() takes.
No that's not what I meant :)
You were asking if loudness_shortterm() would take more time if more samples were processed and that's not the case. It will always take approximately the same amount of time.
The only function that takes longer if you provide more samples is add_samples().
I wondered, if this algorithm is really so complex, that it takes so much time?
I'm not sure which part exactly you mean that takes more time than you would expect. 13us does not seem a lot for loudness_shortterm(), but it also doesn't make sense to call this for every <13us of samples: it's the loudness over the last 3s, it's not going to change that rapidly.
Do you have any ideas for how I could optimise my code?
You're currently collecting all samples into a temporary Vec in your run() function. That's a heap allocation that you should probably avoid. Doing heap allocations as part of real-time audio processing is a bad idea. You could keep around an array in your struct and re-use that every time run() is called, for example.
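A rough sketch of that suggestion, assuming a stereo plugin that interleaves its two input ports before analysis. The `Meter` struct, its `scratch` field, and the `run` signature are hypothetical and only stand in for the real plugin code; the one allocation happens in `new()`, outside the real-time path.

```rust
/// Hypothetical plugin state; all names here are illustrative only.
struct Meter {
    /// Scratch buffer allocated once and reused on every run() call.
    scratch: Vec<f32>,
}

impl Meter {
    /// Allocates the scratch buffer up front (not on the real-time thread).
    fn new(max_frames: usize, channels: usize) -> Self {
        Meter {
            scratch: Vec::with_capacity(max_frames * channels),
        }
    }

    /// Interleaves left/right into the reused buffer and returns it.
    fn run(&mut self, left: &[f32], right: &[f32]) -> &[f32] {
        // clear() resets the length but keeps the allocated capacity.
        self.scratch.clear();
        for (l, r) in left.iter().zip(right.iter()) {
            // No reallocation happens as long as the host never passes
            // more than max_frames frames per call.
            self.scratch.push(*l);
            self.scratch.push(*r);
        }
        &self.scratch
    }
}
```

The returned slice is what would then be handed to the analyzer instead of a freshly collected Vec.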
I don't see any other obvious places to optimize the code.
but it also doesn't make sense to call this for every <13us of samples: it's the loudness over the last 3s, it's not going to change that rapidly.
I think loudness_shortterm() is not called every 13us. There is a guard around the call which should only match every 100ms.
let rate = self.ebu.rate() / 10;
if self.sample_count > rate {
// call loudness_shortterm()
}
This is based on the following recommendation in https://tech.ebu.ch/docs/tech/tech3341.pdf
- The short-term loudness uses a sliding rectangular time window of length 3 s. The measurement is not gated. The update rate for ‘live meters’ shall be at least 10 Hz.
This was the reason why I initially started with calling loudness_shortterm() every 100ms. Is this a bad idea?
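The guard described above could be sketched as a small sample counter. `UpdateGate` and its methods are hypothetical names and this is not the plugin's actual code; it only shows one way to fire roughly ten times per second regardless of the host's buffer size.

```rust
/// Hypothetical helper: counts samples and signals when an update is due.
struct UpdateGate {
    /// Number of samples between two updates.
    interval: u32,
    /// Samples seen since the last update.
    count: u32,
}

impl UpdateGate {
    fn new(sample_rate: u32, updates_per_second: u32) -> Self {
        UpdateGate {
            interval: sample_rate / updates_per_second,
            count: 0,
        }
    }

    /// Adds the frames of one buffer; returns true once an interval has elapsed.
    fn advance(&mut self, frames: u32) -> bool {
        self.count += frames;
        if self.count >= self.interval {
            // Carry the remainder over so the update rate does not drift.
            self.count -= self.interval;
            true
        } else {
            false
        }
    }
}
```

With 48000 Hz and 64-frame buffers, `advance(64)` returns true on every 75th call, which corresponds to the 10 Hz update rate quoted from the EBU document.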
You're currently collecting all samples into a temporary Vec in your run() function. That's a heap allocation that you should probably avoid. Doing heap allocations as part of real-time audio processing is a bad idea. You could keep around an array in your struct and re-use that every time run() is called, for example.
Oh that's a very good hint. Thank you very much. Real-time audio processing (and Rust) are both new to me. I need to learn a lot!
I think loudness_shortterm() is not called every 13us.
Not right now but from what I understood you did that earlier when you ran into problems (e.g. https://github.com/sdroege/ebur128/issues/53#issuecomment-1613568343 ). Did I misunderstand that part?
But also the 1.4s in the non-realtime case are surprising.
loudness_shortterm() is running a calculation over the last 3s of samples that were added via add_samples(). There is no caching or anything, so every time you call it, it will process 3s of samples. This should take the same amount of time no matter when or how often you call it.
OOC, are you doing a release build btw, or is this a debug build? That's going to make a quite big difference.
loudness_shortterm() is running a calculation over the last 3s of samples that were added via add_samples(). There is no caching or anything, so every time you call it, it will process 3s of samples. This should take the same amount of time no matter when or how often you call it.
Ok, with this information, I think I should:
a) run loudness_shortterm() less often (maybe once a second) and
b) run it as part of an lv2_worker task
The momentary measurements I would like to keep at a 10Hz frequency (as proposed by the EBU standard), for simplicity as part of the realtime thread. But I will repeat my time measurements. In the meantime I have upgraded to Sushi 1.1, which has improved the LV2 logging infrastructure. This should give me more confidence in my measurement results, which I previously exported via LV2 output parameters.
Would it make sense and be possible to implement some sort of caching to reduce processing power?
Does the same (non-caching) principle also apply to the momentary calculations?
OOC, are you doing a release build btw, or is this a debug build?
I am pretty convinced that I used a release build for my measurements. But I am going to repeat the tests and make sure that I measure with a release build
Would it make sense and be possible to implement some sort of caching to reduce processing power?
Possibly. You'd have to come up with a way for doing that efficiently first :) It needs a bit of thinking.
Does the same (non-caching) principle also apply to the momentary calculations?
Yes, it's exactly the same calculation but instead of over 3s it's only over 400ms of samples so should be ~7.5x faster.
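The ~7.5x figure follows from the window lengths alone, assuming (as the reply does) that the cost scales linearly with the number of samples in the window. A small sketch of that arithmetic, with `window_samples` as a made-up helper name:

```rust
/// Number of samples per channel covered by a window of `window_ms` milliseconds.
fn window_samples(sample_rate: u64, window_ms: u64) -> u64 {
    sample_rate * window_ms / 1000
}

/// Ratio between the short-term (3 s) and momentary (400 ms) window sizes.
fn shortterm_to_momentary_ratio(sample_rate: u64) -> f64 {
    window_samples(sample_rate, 3000) as f64 / window_samples(sample_rate, 400) as f64
}
```

At 48000 Hz the short-term window covers 144000 samples per channel and the momentary window 19200.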
I can report some new measurement results:
loudness_momentary() is called at a frequency of 10Hz. The measured values are between 103 and 111 microseconds.
I have also rewritten the code now to use a (very small 1 sample) buffer in the LV2 plugin instance: https://github.com/pedalboard/loudness-meter.lv2/blob/f02e89f11094d5a8f6c07bc44d549c853237d467/src/lib.rs#L66
As far as I understood, the buffer size should not have a huge impact on the performance. Is this correct?
Do you see anything else that could be improved?
As long as that's all inlined, it should be fine. If you have one actual function call per sample that's going to be quite a bit of overhead from the function calls themselves :)
Anything else left to be done here or can the issue be closed? :)
Can be closed, thanks a lot for your help.
I am trying to use the lib in an lv2 plugin. Is it possible to run the calculation for each frame to avoid a high processing load when the result is calculated?
As per the recommendation, I would like to expose the short-term value at a frequency of 10Hz and the integrated value at 1Hz.
When getting the short-term value every 100ms, the plugin produces click noises, which I guess are caused by the calculation of the short-term value every 100ms. The sample rate is 48kHz.
Thank you very much for any suggestions on how to improve the code. Let me know if I should provide more details.