replace librosa.load with pysoundfile.read

faroit commented 8 years ago

I really like librosa but importing it only for the read/write capability can make things complicated. There are other libraries which only focus on audio I/O like pysoundfile and therefore have less dependencies and are more flexible.

This PR therefore removes librosa dependency and replace it with pysoundfile. I know that pysoundfile is not optimal because it not pure python but since timestretching/pitch shifting is often applied on general audio files a library which supports a large number of input and output formats like portaudio is beneficial. Also to my knowledge there are currently no better solutions for python.

I've also added a note howto easily install rubberband cli on mac by providing a homebrew recipe

bmcfee commented 8 years ago

Good call. I'm in favor of this change, but a couple of things:

Can you squash out the version update? I'd rather handle that in a separate PR, as there may be other changes before 0.1.1.
Can you add an explicit test for stereo and mono? PSF and librosa structure these differently (C vs fortran order), and we should take care to preserve formats.

faroit commented 8 years ago

:+1: will update the PR within the next days :-)

faroit commented 8 years ago

Since mono can be represented as (1, n) and as (n,): do you want to support input of shape=(n,) and shape=(c, n) with c > 1only?

I guess the librosa.load version did not support inputting shape=(1,n) while returning this shape at the output. This could easily implemented assuming reading the audio file as atleast_2d:

    y_out, _ = sf.read(outfile, always_2d=True)
    # make sure that output dimensions matches input
        if y.ndim == 1:
            y_out = np.squeeze(y_out.T).T
        else:
            y_out = y_out.T

Do you think this is useful or should we stick with a version that returns a 1d array even the input was 2d?

faroit commented 8 years ago

should be good now

bmcfee commented 8 years ago

Since mono can be represented as (1, n) and as (n,): do you want to support input of shape=(n,) and shape=(c, n) with c > 1only?

I guess the librosa.load version did not support inputting shape=(1,n) while returning this shape at the output. This could easily implemented assuming reading the audio file as atleast_2d:

Do you think this is useful or should we stick with a version that returns a 1d array even the input was 2d?

Thinking it over a bit, if we're completely removing librosa as a dependency and putting soundfile in its place, then it would make sense to just follow psf's conventions for buffer shaping. If we adopt soundfile conventions, a lot of these questions go away (right?).

I don't have a strong opinion on this, but I only have one package that depends on pyrubberband, and it's not a huge deal to wrap up the reshaping a layer higher if it keeps the implementation simple.

What do you think?

Otherwise, everything looks great! Thanks again.

faroit commented 8 years ago

Input shape/ndim matching output

personally I like the differentiation of 1d and 2d arrays in numpy but for many people it seems to be weird that single channel audio needs some extra care. Currently the only reason the output dimension changes (without the added logic) is because of reading and writing a temp wavfile. Therefore I like the idea that the output of this package has the same dimension and shape as the input. Also unfortunately pysoundfiles convention is rather unstable. The just recently changed the default value of always_2d from True to False.

C vs fortran order

It would make things a little simpler if we stick to the pysoundfile order, yes. I don't really prefer one to another.

bmcfee commented 8 years ago

personally I like the differentiation of 1d and 2d arrays in numpy but for many people it seems to be weird that single channel audio needs some extra care.

Since psf also follows this convention, I think we should stick to it. (I like 1d for mono as well, but that may change in the distant future...)

How about we just enforce that pyrubberband always returns data in the shape (ndim) received?

It would make things a little simpler if we stick to the pysoundfile order, yes. I don't really prefer one to another.

Okay, let's do it then.

faroit commented 8 years ago

I changed it so that the pysoundfile order is used. Also the tests are now done in py.test which makes it a lot easier to test different combinations of shapes, and input files.

Since PR will not breaks the API I would suggest changing the version number. Before merging I can also squash down my commits.

bmcfee commented 8 years ago

This all looks great, thanks!

I generally stick to nose, but pytest seems fine as well.

And yes, I'll bump up the version number after merging.

Is there anything else that needs to go into this PR? If not, let's squash some intermediate commits and merge!

bmcfee commented 8 years ago

Also, maybe now is a good time to add an AUTHORS.md, and add yourself? Or this can go into the sphinx docs; either is fine by me.

faroit commented 8 years ago

Great. we're good to go now. feel free to whatever you think is good about the author list. I can't really say that I've contributed much here ;-)

bmcfee / pyrubberband

replace librosa.load with pysoundfile.read #4

Input shape/ndim matching output

C vs fortran order