sfztools / sfizz

SFZ parser and synth c++ library, providing a JACK standalone client
https://sfz.tools/sfizz/
BSD 2-Clause "Simplified" License

audible artefacts and loud pops when oversampling in VCV Rack #638

Closed by doveraudio 3 years ago

doveraudio commented 3 years ago

I just gave the project (the SFZ player, VST3) a spin in VCV Rack on Windows, using the Host XL module. It may have been this machine, so I'll report more after further testing. I used some of the instruments in Edge-blown Aerophones from the VCSL SFZ library (Ocarina and Pipe Organ) and found the issue when using oversampling. If this is a VCV Rack issue, that's fine, but if it's an issue with the project here, I'd be happy to keep testing. I have several other hosts as well, but I like to use VCV Rack for impromptu stuff.

PythonBlue commented 3 years ago

For the record, which method of oversampling did you use? If you used the "hint_min_samplerate" opcode, then speaking as the contributor of that code, I fully admit it's a bit rough around the edges, and I only recently resolved similar issues in my own fork.

doveraudio commented 3 years ago

I updated my original post to specify that I was using the VST3 plugin hosted in VCV Rack. Hope this clarifies.

jpcima commented 3 years ago

> I have several other hosts as well, but like to use VCV Rack for impromptu stuff.

I wouldn't be surprised if the issue is general. Problems with oversampling have been reported several times. I haven't yet had the opportunity to look into it much, but I'll leave some notes about the current status.

  1. sfizz's current method by @paulfd is based on upsampling only, when the sound file is loaded. The downsampling is done when the sound is resampled for playback, as a 2-in-1 operation. The DSP load is unaffected by choosing a higher OS factor, but it puts more work on the background loader.

  2. The method by @PythonBlue performs up- and downsampling at synth level, which roughly multiplies the DSP load by the OS factor. The decimator uses a 6th-order biquad LPF. It supports factors greater than 16x.
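As a rough illustration of the decimation step in method (2), here is a minimal sketch of a 6th-order low-pass built from three cascaded biquads, followed by keeping every Nth sample. This is not @PythonBlue's actual code: the `Biquad` and `Decimator` names, the Butterworth Q values, and the cutoff at 0.45 times the base sample rate are all assumptions made for the example. The coefficient formulas are the standard RBJ "Audio EQ Cookbook" low-pass.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One second-order low-pass section (RBJ cookbook coefficients),
// evaluated in transposed direct form II.
struct Biquad {
    double b0 = 1, b1 = 0, b2 = 0, a1 = 0, a2 = 0;
    double z1 = 0, z2 = 0;

    static Biquad lowpass(double cutoffHz, double sampleRate, double q) {
        const double pi = 3.14159265358979323846;
        const double w0 = 2.0 * pi * cutoffHz / sampleRate;
        const double c = std::cos(w0);
        const double alpha = std::sin(w0) / (2.0 * q);
        const double a0 = 1.0 + alpha;
        Biquad s;
        s.b0 = (1.0 - c) / 2.0 / a0;
        s.b1 = (1.0 - c) / a0;
        s.b2 = s.b0;
        s.a1 = -2.0 * c / a0;
        s.a2 = (1.0 - alpha) / a0;
        return s;
    }

    double process(double x) {
        const double y = b0 * x + z1;
        z1 = b1 * x - a1 * y + z2;
        z2 = b2 * x - a2 * y;
        return y;
    }
};

// Hypothetical decimator: filter at the oversampled rate, then keep
// one sample out of every `factor`.
class Decimator {
public:
    Decimator(int factor, double oversampledRate) : factor_(factor) {
        assert(factor >= 1);
        // Assumption: cutoff just below the Nyquist frequency of the base rate.
        const double cutoff = 0.45 * oversampledRate / factor;
        // Q values of a 6th-order Butterworth alignment (an assumption).
        const double qs[3] = {0.51763809, 0.70710678, 1.93185165};
        for (int i = 0; i < 3; ++i)
            stages_[i] = Biquad::lowpass(cutoff, oversampledRate, qs[i]);
    }

    std::vector<double> process(const std::vector<double>& in) {
        std::vector<double> out;
        out.reserve(in.size() / factor_ + 1);
        for (double x : in) {
            for (auto& s : stages_) x = s.process(x);  // runs on every oversampled frame
            if (++phase_ == factor_) { phase_ = 0; out.push_back(x); }
        }
        return out;
    }

private:
    int factor_;
    int phase_ = 0;
    std::array<Biquad, 3> stages_;
};
```

The filter runs once per oversampled frame, which is why the DSP cost of this approach grows roughly in proportion to the OS factor.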

If I were to make a very wild guess, perhaps method (1) degrades quality when the system increases the OS factor but does not compensate by interpolating with a greater number of neighbor points. I'm not sure whether that idea holds theoretically, but perhaps @paulfd can tell. In that case, a switch to method (2) is possibly justified.

I'd rather study method (2) for possibilities to extend L. de Soras' hiir method with oversampling by powers of 2, up to a factor of 128x (as a replacement for LPF6p). (The hiir package contains the source code of a filter designer.)

paulfd commented 3 years ago

> If I were to make a very wild guess, perhaps method (1) degrades quality when the system increases the OS factor but does not compensate by interpolating with a greater number of neighbor points. I'm not sure whether that idea holds theoretically, but perhaps @paulfd can tell. In that case, a switch to method (2) is possibly justified.

> I'd rather study method (2) for possibilities to extend L. de Soras' hiir method with oversampling by powers of 2, up to a factor of 128x (as a replacement for LPF6p). (The hiir package contains the source code of a filter designer.)

You may be right. What you're saying is kind of a "Fourier dual" intuition. I am not sure whether it holds in this interpolation context, but it might! When I thought about it, it was mostly with lerp (linear interpolation) in mind, thinking this could be a good tradeoff since the background loading would do the heavy lifting.

However, even if we could make the method work reliably and without any mathematical drawbacks, it still comes with several problems, off the top of my head:

  1. For many sample libraries, the memory cost is just huge, since you pay the same factor in memory as in oversampling.
  2. It makes it harder to reason about things like loop points, and it complicates a large part of the code for something that is barely used in practice.
  3. It's confusing for users.
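To put a number on the memory point, here is a small worked example. The `oversampledBytes` helper is hypothetical, and the 32-bit float storage it defaults to is an assumption rather than a statement about sfizz's actual in-memory format.

```cpp
#include <cstdint>

// Bytes needed to hold one sample file in memory after upsampling at load time.
// Assumption: samples are stored as 32-bit floats (4 bytes each) by default.
constexpr std::uint64_t oversampledBytes(std::uint64_t frames,
                                         std::uint64_t channels,
                                         std::uint64_t osFactor,
                                         std::uint64_t bytesPerSample = 4) {
    return frames * channels * bytesPerSample * osFactor;
}

// A 10-second stereo file at 44.1 kHz has 441000 frames.
static_assert(oversampledBytes(441000, 2, 1) == 3528000, "about 3.5 MB at 1x");
static_assert(oversampledBytes(441000, 2, 8) == 28224000, "about 28 MB at 8x");
```

The cost scales linearly, so a library that occupies a few gigabytes at 1x would need eight times that at 8x.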

For these reasons alone I'd be tempted to just drop this functionality and concentrate on improving the basic interpolation methods on all target platforms. I think most people would be satisfied with a high-performing polynomial method for live use, and something like the sinc interpolation @jpcima implemented recently for offline rendering. Maybe there's no need to be fancier.

jpcima commented 3 years ago

Under either method, oversampling will be confusing.

My intuition is that the difference comes from the interpolators using a fixed point count per frame, as opposed to a resampler (e.g. libsamplerate), which processes a dynamic point count based on the ratio.
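To make the "fixed point count" idea concrete, here is a minimal sketch of a 4-point Catmull-Rom (Hermite) interpolator driven by a fixed read increment. The `hermite4` and `render` names are hypothetical and this is not sfizz's actual interpolation code; it only shows that the kernel reads the same 4 neighbors regardless of the playback ratio.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 4-point, 3rd-order Hermite (Catmull-Rom) interpolation at fractional
// position `pos`. It always reads exactly 4 neighbors.
inline double hermite4(const std::vector<double>& data, double pos) {
    const std::size_t i = static_cast<std::size_t>(pos);
    assert(i >= 1 && i + 2 < data.size());
    const double t = pos - static_cast<double>(i);
    const double xm1 = data[i - 1], x0 = data[i], x1 = data[i + 1], x2 = data[i + 2];
    const double c1 = 0.5 * (x1 - xm1);
    const double c2 = xm1 - 2.5 * x0 + 2.0 * x1 - 0.5 * x2;
    const double c3 = 0.5 * (x2 - xm1) + 1.5 * (x0 - x1);
    return ((c3 * t + c2) * t + c1) * t + x0;
}

// Step through `data` with a fixed increment. If `data` was upsampled by
// `osFactor` at load time, the increment grows by the same factor, so the
// 4 points cover only 4 / osFactor sample periods of the original file.
inline std::vector<double> render(const std::vector<double>& data,
                                  double pitchRatio, int osFactor,
                                  std::size_t numFrames) {
    std::vector<double> out;
    out.reserve(numFrames);
    const double step = pitchRatio * osFactor;
    double pos = 1.0;  // start at 1 so that data[i - 1] exists
    for (std::size_t n = 0; n < numFrames && pos + 2.0 < data.size(); ++n, pos += step)
        out.push_back(hermite4(data, pos));
    return out;
}
```

With `osFactor = 8`, for example, the kernel spans half of one original sample period while the read position advances 8 stored frames per output frame at unity pitch. A ratio-aware resampler would instead adjust how many input points contribute as the ratio changes.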

> For these reasons alone I'd be tempted to just drop this functionality

If not for quality reasons, these are remaining reasons to keep it