bmcfee / resampy

Efficient sample rate conversion in python
https://resampy.readthedocs.io
ISC License
253 stars 36 forks

Non-uniform sample positions #82

Closed avalentino closed 2 years ago

avalentino commented 2 years ago

This PR introduces a new interp1 function that allows resampling with non-uniform sample positions. The positions of the output samples (in normalized coordinates) are controlled by the `t_out` parameter.

Open points:

Closes #71.
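(Editorial illustration: a minimal pure-NumPy sketch of what "resampling at non-uniform output positions" means. This is not resampy's actual kernel, which uses a precomputed Kaiser-windowed filter table; the function name and signature here are made up.)

```python
import numpy as np

def sinc_interp_nu(x, t_out, fs=1.0):
    """Evaluate x (uniformly sampled at rate fs) at arbitrary times t_out.

    Naive O(len(x) * len(t_out)) full sinc sum, for illustration only.
    """
    n = np.arange(len(x))
    # Each output sample is a sinc-weighted sum over all input samples.
    return np.array([np.dot(x, np.sinc(t * fs - n)) for t in t_out])

# Example: a 1 Hz tone sampled at 8 Hz, evaluated at irregular times.
fs = 8.0
x = np.sin(2 * np.pi * np.arange(64) / fs)
t_out = np.array([1.0, 1.125, 1.3])   # seconds, non-uniformly spaced
y = sinc_interp_nu(x, t_out, fs=fs)
```

At output times that land exactly on the input grid, the sinc weights collapse to a delta and the input sample is reproduced exactly.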

bmcfee commented 2 years ago

Thanks for this! I'll try to give it a thorough review later on today (after more :coffee:). Some high-level thoughts:

  1. Naming: I agree that interp1 isn't very helpful. Maybe something like resample_nu (non-uniform) for parity with non-uniform FFT methods? I don't have strong opinions here, so any suggestions are welcome.
  2. I'd like to get coverage running on the CI before integrating functionality changes. This isn't difficult, just a matter of finding the time to set it up.
  3. It would be nice if we could have a single, unified backend for both uniform and non-uniform sampling. But I don't think it's necessary, especially if it results in less readable code all around.
  4. Could you elaborate a bit more on the last point? I see how scale is not well-defined here, but I'm not sure how that translates into a change to the API for filter design (eg rolloff).

I'm also wondering if it's possible here to support non-uniform input spacing as well as output. This certainly makes the window interpolation logic a bit more finicky, but it would enable subsequent resampling of the output and that seems like a desirable feature.

avalentino commented 2 years ago

Thanks for this! I'll try to give it a thorough review later on today (after more coffee). Some high-level thoughts:

  1. Naming: I agree that interp1 isn't very helpful. Maybe something like resample_nu (non-uniform) for parity with non-uniform FFT methods? I don't have strong opinions here, so any suggestions are welcome.

OK, resample_nu looks nice. The only open point is that if we also decide to implement a function supporting non-uniform input, then we will need different naming. And what about t_out?

  1. I'd like to get coverage running on the CI before integrating functionality changes. This isn't difficult, just a matter of finding the time to set it up.

Do you mean something like the following: https://github.com/marketplace/actions/pytest-coverage-commentator? Or do you have something different in mind? If the setup does not require API tokens, then I could implement it as part of this PR or as a separate one. Please let me know.

  1. It would be nice if we could have a single, unified backend for both uniform and non-uniform sampling. But I don't think it's necessary, especially if it results in less readable code all around.

I think that the current interp1_f could also be used for resample with a few modifications:

  1. Could you elaborate a bit more on the last point? I see how scale is not well-defined here, but I'm not sure how that translates into a change to the API for filter design (eg rolloff).

As far as I can understand, the `scale` parameter is used

The two things together basically consist of computing a sinc-based FIR filter with a different rolloff=scale parameter: 1.0 * sinc(t) --> scale * sinc(scale * t). Is my understanding correct?

If so, the user could leverage the rolloff parameter in the sub-sampling case to get an effect equivalent to what is currently done in the resample routine. An alternative could be to analyze t_out and use some heuristics to derive a reasonable scale value, but I strongly prefer to avoid that: "In the face of ambiguity, refuse the temptation to guess." :)
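(Editorial illustration: a small numerical check of the scaling relationship described above, with made-up rates. The joint scaling scale * sinc(scale * t) lowers the cutoff while keeping the filter's DC gain near 1; this is not resampy code.)

```python
import numpy as np

# For a rate ratio scale = sr_new / sr_orig < 1 (sub-sampling), the
# ideal filter 1.0 * sinc(t) becomes scale * sinc(scale * t): the
# cutoff is lowered and the amplitude compensated, so the DC gain
# (the integral of the impulse response) stays close to 1.
scale = 0.5                        # e.g. 44100 Hz -> 22050 Hz
t = np.linspace(-8, 8, 1601)
dt = t[1] - t[0]
h_full = np.sinc(t)                # full-band interpolation filter
h_scaled = scale * np.sinc(scale * t)

dc_full = h_full.sum() * dt        # ~1, up to truncation ripple
dc_scaled = h_scaled.sum() * dt    # also ~1, thanks to the outer scale
```

Dropping either the outer amplitude factor or the inner time stretch would break this balance, which is the energy-compensation role of `scale` discussed below.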

I'm also wondering if it's possible here to support non-uniform input spacing as well as output. This certainly makes the window interpolation logic a bit more finicky, but it would enable subsequent resampling of the output and that seems like a desirable feature.

Yes, I agree. It should not be too complex to implement, and data regularization is probably an interesting use case for this feature. What would be your priority for it? In my plans, compatibly with my free time, I have the implementation of a 2D interpolator (regular to irregular grid) and maybe some performance improvements if possible (e.g. experimenting with vectorization and/or OpenMP-based multi-threading). The possibility to use shorter LS FIR filters could also be an interesting use case.

bmcfee commented 2 years ago

OK, resample_nu looks nice. The only open point is that if we also decide to implement a function supporting non-uniform input, then we will need different naming. And what about t_out?

I'm okay with t_out, especially if we also support non-uniform input spacing. The latter would suggest t_in in that case.

OTOH if we only support uniform→non-uniform, then maybe t would suffice.

Do you mean something like the following: https://github.com/marketplace/actions/pytest-coverage-commentator? Or do you have something different in mind? If the setup does not require API tokens, then I could implement it as part of this PR or as a separate one. Please let me know.

I actually took care of this already #83 ; it should appear automatically if you rebase this PR.

I think that the current interp1_f could also be used for resample with a few modifications:

* the `scale` parameter could be an additional input

* we need to compute the `t_out` array in `resample` and pass it to the kernel as a parameter (it should not be much more than using `np.arange`)

Yeah, I think that makes sense, and it would likely clarify the existing implementation a little by more clearly separating sample indices from the corresponding natural coordinates.
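(Editorial illustration of that unification, with hypothetical variable names and made-up rates: the uniform case reduces to handing the non-uniform kernel an arithmetic progression of output positions expressed in input-sample coordinates.)

```python
import numpy as np

# Uniform resampling as a special case of the non-uniform kernel:
# positions are in units of input samples, so the kernel never needs
# to know the two sampling rates separately.
sr_orig, sr_new, n_in = 22050, 44100, 1000
n_out = int(np.ceil(n_in * sr_new / sr_orig))
t_out = np.arange(n_out) * (sr_orig / sr_new)  # positions, in input samples
```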

The two things together basically consist in computing a sinc based FIR filter with a different rolloff=scale parameter: 1.0 * sinc(t) --> scale * sinc(scale * t). Is my understanding correct?

I believe so. The catch here is that in normal usage, the filters have already been constructed with a fixed rolloff that is independent of scale. I generally think of rolloff as a quality parameter, while scale is adaptive and depends on the ratio between the two sampling rates.

The inner part, sinc(scale * t), is really just stretching the time axis according to the two sampling rates in question. In the case of non-uniform sampling, this isn't necessary, since we'd just operate in absolute time coordinates without any global stretch or compression of the time axis.

The outer part is where it gets a bit trickier, as you note. This scaling comes from JOS3's documentation, and compensates for energy loss when downsampling. I'm not sure what the right way to think about this is in the context of non-uniform sampling. There's obviously no global scale parameter at play here, and I'd be cautious of trying to infer a global approximation.

I wonder if it makes sense to do something local, eg looking at sample spacing at indices (n-1, n, n+1)? We recently did something along these lines for local (frequency-varying) Q-factor calculations in a variable-Q transform https://github.com/librosa/librosa/pull/1405 - maybe similar reasoning can apply here. My main concern here is that implementing a time-varying scale correction would wreck much of the efficiency of the algorithm, since we can't rely on the table of filter coefficients being fixed across output samples.

And the more I think about it, the more I'm convinced that this also applies in the uniform→nonuniform case, unless we're in strict interpolation (oversampling) mode. In the nonuniform case, this would correspond to having all t_out[n+1]-t_out[n] <= 1/fs.
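(Editorial illustration of what a local scale estimate from neighboring output spacings might look like, in the spirit of the local-Q idea mentioned above. Purely a sketch with made-up positions; `np.gradient` provides a centered spacing estimate, and the final line is the strict-interpolation check, taking oversampling to mean output gaps no wider than one input sample.)

```python
import numpy as np

# t_out in units of input samples (i.e. 1/fs == 1). A local analogue
# of the global scale: < 1 wherever neighboring output samples are
# spaced wider than the input grid (local downsampling).
t_out = np.array([0.0, 0.7, 1.2, 2.5, 4.0])
spacing = np.gradient(t_out)                   # centered local spacing
local_scale = np.minimum(1.0, 1.0 / spacing)

# Strict interpolation (oversampling) mode: every gap is at most one
# input sample, so no anti-aliasing correction is needed anywhere.
strict_interp = bool(np.all(np.diff(t_out) <= 1.0))
```

The concern in the text applies directly: a per-sample `local_scale` would require re-weighting the filter table for every output sample.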

Yes, I agree. It should not be too complex to implement, and data regularization is probably an interesting use case for this feature. What would be your priority for it?

I'm not sure yet - prioritizing the nonuniform input case may depend on how the above issue resolves. It may be that there's no way around doing both together, in which case it becomes a higher priority.

avalentino commented 2 years ago

I actually took care of this already https://github.com/bmcfee/resampy/pull/83 ; it should appear automatically if you rebase this PR.

OK, #83 seems to break CI on forks. Maybe you can just drop the token from ci.yml. AFAIK it is not necessary for public repositories.

The outer part is where it gets a bit trickier, as you note. This scaling comes from JOS3's documentation, and compensates for energy loss when downsampling. ...

I agree that this part can indeed be tricky. In my experience with Synthetic Aperture Radar processing, the normalization (compensation of the energy loss) is often performed outside the interpolator, using formulas that take into account characteristics of both the filter and the signal. I would like to be able to use this approach, so if we decide to implement the adaptive normalization, maybe we could also allow disabling it if necessary.

bmcfee commented 2 years ago

OK, #83 seems to break CI on forks. Maybe you can just drop the token from ci.yml. AFAIK it is not necessary for public repositories.

Strange - it shouldn't hurt anything to include it, but I've pushed a change removing it for now. Try a rebase?

In my experience with Synthetic Aperture Radar processing, the normalization (compensation of the energy loss) is often performed outside the interpolator, using formulas that take into account characteristics of both the filter and the signal. I would like to be able to use this approach, so if we decide to implement the adaptive normalization, maybe we could also allow disabling it if necessary.

Yeah, that makes total sense. Generally I think moving normalization out to an in-place multiplication would be a good idea. It will cost an extra O(n) pass on the data, but the code will be clearer and easier to generalize.
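(Editorial illustration of that separation, with a hypothetical helper name: the kernel returns an unnormalized result, and the caller applies whatever gain is appropriate, or none at all.)

```python
import numpy as np

def normalize_inplace(y, gain=None):
    """Apply an optional gain to the interpolator's raw output.

    Hypothetical helper: `gain` could be the global `scale`, a
    per-sample array of local estimates, or None to disable
    normalization entirely (the SAR use case described above).
    """
    if gain is not None:
        y *= gain            # single extra O(n) pass over the data
    return y

y = normalize_inplace(np.ones(4), gain=0.5)
```

Keeping the gain out of the kernel costs one extra pass but makes "no normalization" and "custom normalization" equally easy to express.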

codecov-commenter commented 2 years ago

Codecov Report

Merging #82 (4d137ba) into main (5f46888) will decrease coverage by 0.66%. The diff coverage is 83.87%.

@@            Coverage Diff             @@
##             main      #82      +/-   ##
==========================================
- Coverage   85.13%   84.46%   -0.67%     
==========================================
  Files           4        4              
  Lines          74      103      +29     
==========================================
+ Hits           63       87      +24     
- Misses         11       16       +5     
Flag Coverage Δ
unittests 84.46% <83.87%> (-0.67%) :arrow_down:


Impacted Files Coverage Δ
resampy/core.py 85.96% <83.87%> (-3.33%) :arrow_down:


avalentino commented 2 years ago

OK @bmcfee, I think the PR is ready for final review. Support for non-uniform input could eventually go in a separate PR.

avalentino commented 2 years ago

@bmcfee is there anything blocking?

bmcfee commented 2 years ago

@bmcfee is there anything blocking?

Just the end of the semester over here. And the ISMIR deadline. Sorry, will try to come back to this once more pressing issues clear up.

bmcfee commented 2 years ago

@avalentino sorry for the delay, just coming back to this now. Please see the review comment above - I'm confused by the lack of sr_orig in non-uniform resampler's interface. Surely we need some mechanism to relate the sample positions of x to the target times t_out?

bmcfee commented 2 years ago

Thanks @avalentino - everything looks good now aside from that one typo. (At some point I should add CI for spellchecking like we have in librosa, but that can be treated separately.)

bmcfee commented 2 years ago

Ok, I think this is good to go. I'll try to block out some time next week to do the rest of the infrastructure updates and bug fixes to get a release going. Anything else before I merge?

avalentino commented 2 years ago

Ok, I think this is good to go. I'll try to block out some time next week to do the rest of the infrastructure updates and bug fixes to get a release going. Anything else before I merge?

Thanks @bmcfee, I think this PR is now complete. Please go ahead with the merge at your convenience.