Closed falkTX closed 1 year ago
Coincidently, I've been looking at this recently as well. I don't fully understand why the WaveNet models require a warmup period. I would have thought that they would be deterministic based on the current receptive field (unlike LSTM, which has a feedback connection).
Btw, there is an issue open in the NAM core repo related to this:
https://github.com/sdatkinson/NeuralAmpModelerCore/issues/61
So, I think this is a better solution than the current anti-pop method.
That said, I think it makes more sense to have this be handled in the core model loading code. Both because it is in one place there, and also because it depends on the model type. In the case of LSTM models, for example, it isn't required at all and is just wasting background resources. For WaveNet, my guess is that the number of samples required to stabilize the model depends on the size of the receptive field.
On AIDA-X that uses LSTM and GRU this was found to be necessary as well, so maybe the "not needed" thought is mostly just theoretic?
If the NAM core side ends up with something similar it will still need the pre-roll for the buffer resize, specially because it seems resizing to bigger buffers does not clear the new memory (or something alike that, there is a PR for it still open)
On AIDA-X that uses LSTM and GRU this was found to be necessary as well, so maybe the "not needed" thought is mostly just theoretic?
NAM stores pre-"warmed" weights for LSTM models, while AIDA doesn't.
I'm going to close this, as I've got a PR (https://github.com/sdatkinson/NeuralAmpModelerCore/pull/71) in to handle it in the NAM core.
The change has also been added to my devel branch of NAM core that this repo references.
While playing with NAM we noticed a few audio pops when changing models. The NAM core side already has an "anti-pop" feature of sorts, but it is not applied for all model types and not in the same way. Since we already pre-run the plugin in order to force the correct buffer size, we can run it a few more times to get rid of the initial pops. The 4096 comes from checking NAM code where 4000 is used, and 4096 aligns nicely close to it as a power of 2 (which will be typical for buffer sizes in hosts)