Initial condition support for torchaudio.functional.lfilter

shchhan123456 commented 2 years ago

🚀 The feature

scipy.signal.lfilter supports an initial condition zi. This is critical for dealing with streaming inputs where we get one sample at a time. This feature is currently missing in torchaudio.functional.lfilter.

Motivation, pitch

Without support for initial condition, I had to maintain a large sliding window of signals and apply the filter on this entire window every time I get a new input. If the initial condition is supported, then it becomes much more efficient. For my particular use case, I've been porting scipy based code to pytorch and ran into this issue.

Alternatives

No response

Additional context

No response

yoyololicon commented 2 years ago

Hi, @shchhan123456. I found some difficulties to make this feature. scipy.signal.lfilter uses a direct form II transposed implementation, so the initial condition zi is simply a vector. In contrast, torchaudio uses direct form I, so if it needs initial conditions it would be two vectors (the x,y inputs to scipy.signal.lfiltic). We chose the direct form simply because we haven't figured out an efficient solution to implement gradients for transposed forms.

However, if you just need the streaming feature and don't mind that the values of zi are different from scipy implementation, I would suggest an alternative solution that requires less effort than completely refactoring the code. We can transform the current implementation to direct form II by simply swapping the order of FIR and IIR, filling zi to here and there, and return the last few samples of the intermediate outputs as zf. It makes little changes to the backward function.

@mthrok @nateanl Could you help take a look at this feature request and my suggestions? I'm not sure whether it fits the current development plan of lfilter.

shchhan123456 commented 2 years ago

@yoyololicon , thanks for your reply! I don't fully understand why zi would be different from scipy implementation. Although my overall request is the support for streaming functionality. I'd be really glad if this can be supported.

nateanl commented 1 year ago

Hi @yoyololicon and @shchhan123456. Sorry for my late response. I think swapping FIR and IIR makes sense for converting direct form I to direct form II. @yoyololicon could you elaborate how the zf is generated in direct form II? If the implementation is changed from direct form I to direct form II, the format of zi should be the same in scipy.signal.lfilter, which is a vector. Is that correct?

Adding zf to the returned output will make the API consistent with scipy, but it will be BC-breaking. @mthrok What do you think of the change proposal of lfilter?

yoyololicon commented 1 year ago

Hi @nateanl, no need to apologize, I was also working on other things.

An example of how to get zf value:

std::tuple<torch::Tensor, torch::Tensor> lfilter_core(
...

  int64_t n_order = b_coeffs.size(1);

  // IIR first
  auto filtered_waveform = DifferentiableIIR::apply(
      waveform,
      a_coeffs /
          a_coeffs.index(
              {torch::indexing::Slice(), torch::indexing::Slice(0, 1)}));

  // get zf
  auto zf = filtered_waveform.index(
              {torch::indexing::Slice(), torch::indexing::Slice(), 
               torch::indexing::Slice(filtered_waveform.size(2) - n_order + 1, filtered_waveform.size(2))});

  // Then, FIR
  auto output = DifferentiableFIR::apply(
        filtered_waveform,
        b_coeffs /
            a_coeffs.index(
                {torch::indexing::Slice(), torch::indexing::Slice(0, 1)}));
  return {output, zf};
}

If the implementation is changed from direct form I to direct form II, the format of zi should be the same in scipy.signal.lfilter, which is a vector. Is that correct?

Partially correct. The format of zi, zf will be the same with SciPy, but their actual meaning will not be. These two vectors represent the states of the registers inside the filter structure. If we look into their corresponding digram:

Transposed form

Direct form

you can notice that, in transposed form, the signal flows into the delay line in the middle; in direct form, it flows outwards. The values inside the registers are not the same among different forms.

If it wasn't clear, my proposal is just a workaround for the streaming feature when dealing with real-time applications, and it doesn't behave the same as SciPy implementation. It's a BC-breaking change, indeed.

RichieHakim commented 3 months ago

I'm interested in this as well. I believe this would allow for FIR filtfilt as well, which is extremely useful.

pytorch / audio