dofuuz / python-soxr

Fast and high quality sample-rate conversion library for Python

resample_chunk and resample methods return very different results #7

Closed mariano-balto closed 2 years ago

mariano-balto commented 2 years ago

It seems that resample_chunk from the ResampleStream class returns very different results than just using soxr.resample. Is this the expected result?

dofuuz commented 2 years ago

@mariano-balto No. Can you provide code and data to reproduce?

mariano-balto commented 2 years ago

@dofuuz thanks for your quick reply. Here is a quick test:

import numpy as np
import soxr
from soxr import ResampleStream, VHQ

in_rate = 44000
out_rate = 16000
quality = VHQ

stream = ResampleStream(
    in_rate=float(in_rate),
    out_rate=float(out_rate),
    num_channels=1,
    quality=quality,
)

def test_resample_and_resample_stream_results() -> None:
    data = b"\x00" * 1024

    data16 = np.frombuffer(data, dtype=np.int16)
    resample = soxr.resample(data16, in_rate=in_rate, out_rate=out_rate, quality=quality)
    resample_stream = stream.resample_chunk(data16)
    assert resample == resample_stream

Perhaps this is an edge case or maybe I am doing something wrong?

dofuuz commented 2 years ago

ResampleStream's default dtype is float32, so you have to specify dtype if you want to use int16.

stream = ResampleStream(in_rate=in_rate, out_rate=out_rate, num_channels=1, quality=quality, dtype=np.int16)

If you don't, the input will be converted to float32. I'll add a warning about this conversion.

You have to flush the stream by passing last=True at the end of the input.

resample_stream = stream.resample_chunk(data16, last=True)

You should compare the two results using np.allclose().

assert np.allclose(resample, resample_stream, atol=2)

For integer I/O, soxr uses dithering, so the outputs can differ by ±1.
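
To illustrate just the comparison step, here is a minimal NumPy-only sketch. The two arrays are hypothetical stand-ins for resampler outputs that differ by dither noise; they are not real soxr output.

```python
import numpy as np

# Hypothetical stand-ins for two int16 resampler outputs (not real soxr data).
one_shot = np.array([100, -50, 0, 25], dtype=np.int16)
streamed = np.array([101, -50, -1, 25], dtype=np.int16)

# `assert one_shot == streamed` cannot work: `==` on arrays gives an
# element-wise boolean array, which has no single truth value.
try:
    bool(one_shot == streamed)
    ambiguous = False
except ValueError:
    ambiguous = True

# Exact equality fails because of the +/-1 differences ...
exact_match = np.array_equal(one_shot, streamed)

# ... while a tolerance-based comparison accepts them.
close_match = np.allclose(one_shot, streamed, atol=2)

print(ambiguous, exact_match, close_match)
```

Note that np.allclose() needs both arrays to have compatible shapes, so the stream has to be flushed first for the lengths to line up.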

mariano-balto commented 2 years ago

I see. Thanks for your detailed explanation; if I have an infinite stream, how often do I need to flush the stream?

dofuuz commented 2 years ago

If the stream is infinite, you should not flush it.

The output has a small delay, so you have to feed in more input before you get any output.

for _ in range(10):
    resample_stream = stream.resample_chunk(data16)
    print(len(resample_stream), end=' ')

Output:

0 0 0 0 0 0 0 187 187 187

mariano-balto commented 2 years ago

@dofuuz thanks again for your great explanation