Sytronik / pyin-rs

pYIN pitch detection algorithm written in Rust
MIT License
9 stars 1 forks source link

more idiomatic api #4

Open m4b opened 7 months ago

m4b commented 7 months ago

Hello,

This is really cool you ported this algorithm to rust, thank you for all your work! Ideally a (rust) PSOLA algorithm with this would allow very simple but interesting audio manipulations of spoken voice, for example.

So I'm wondering if you have any plans to make the api more "idiomatic rust", specifically, here are a couple issues I encountered and which I think could improve:

  1. The wav input being a CowArray forces users to explicitly depend on ndarray; more idiomatic would be e.g., a &[T] or &[f32] or &[i16], depending on how this gets done. I also understand this gets more complicated when there are multiple channels, but it might be nice for a "simple" api that accepts a (&[f32], usize) or something that has num channels, etc., so there is less burden on the user to convert to appropriate types, etc. For example, this is a lot of type manipulations just to get that CowArray (I had to look at your binary to even see how to do this, since i don't use ndarray at all):
        let shape = (1, wav.len());
        let wav = Array2::from_shape_vec(shape.strides((1, shape.0)), wav).unwrap();
        let (f0, voiced_flag, voiced_prob) =
            pyin_exec.pyin(wav.remove_axis(Axis(0)).into(), fill_unvoiced, pad_mode);

    I'd prefer something like:

      let wav: Vec<f32> = ...;
      let (f0, voiced_flag, voiced_prob): (Vec<f32>, Vec<bool>, Vec<f32>) = pyin_exec.pyin_with(&wav, nchannels, fil_unvoiced, pad_mode);

    Note that the returned objects are regular vecs, and user doesn't need to know about ndarrays or cows arrays or whatever, etc.

  2. Reuse of storage; specifically, the api can be made backwards compatible by exposing a inout version (where you pass in the storage, e.g., f0 vec, voiced flag, voiced_prob via &mut), and the current api calls this internally with local vecs as the inouts and returns those. (of course this would also likely depend on above, using vecs as returns etc.)
  3. Not really api related, but if the number of deps the library uses could be trimmed down, that would be a plus too. Probably majority come from ndarray, I'm not sure. Similarly, there are some older crates (nalgebra 0.27, depended on by statrs, which rustc flags as having invalid code and will break at a future date) in the dep tree, that could use a version bump/update most likely.

Regardless whether you make it more idiomatic or not, thank you so much for writing this in rust, really great stuff! :)

m4b commented 7 months ago

I'd also (strongly) suggest that the binary, fii, have their deps split out; e.g., libc isn't required by the library, etc., clap isnt' required by the library, etc., and probably several others I haven't looked at. Additionally, the ffi stuff should be a feature flag so users don't need to compile/include it if only using the rust library. Much of this might partially/completely address the deps issue in point 3. above :)

m4b commented 7 months ago

One last point, you may want to consider a single return Vec, with e.g., Vec<Analysis>, where e.g.:

pub struct Analysis {
  freq: f32,
  voiced: bool,
  prob: f32,
}

Analysis is just a random name I chose, I'm sure there are better. This will have better cache locality, and users don't have to access other data via zipping, or indexing, etc.

I understand this diverges from the python api, but it is also more rustish to do this.

Sytronik commented 7 months ago

Thank you for your suggestions. I have not much experiences making Rust libraries. I will be grateful if you make PRs for them. If you can't, i will do it some day when possible, and will request your review.

Sytronik commented 1 month ago

I've worked some of your suggestions. Please check v1.2.0 release note. I couldn't do no. 2, 3 of your suggestions, because It required much work that I can't handle now. Again, I'll do them when possible. Thanks.