filoe / cscore

An advanced audio library, written in C#. It provides tons of features, from playing/recording audio to decoding/encoding audio streams/files to processing audio data in real time (e.g. applying custom effects during playback, creating visualizations, ...). The possibilities are nearly unlimited.

How to resample when input is an array of floats. #462

Closed ADD-eNavarro closed 2 years ago

ADD-eNavarro commented 2 years ago

Hello there. I am trying to port some Python voice-recognition preprocessing code to C#. I need to read a file, resample it, normalize it and trim silences. Every part works fine except the resampling, where I can't get the same results as the original code, so I am testing a few sound libraries to pick the one that best fits my needs. The issue is that I need to take the float array (the output from Python) and feed it into the resampling part of CSCore, so that I can compare the result with the original. I haven't found a way to do that. So far I have created the input and output formats as CSCore.WaveFormat, no problem there. After much looking, I used a WriteableBufferingSource, into which I copied (using Write()) a byte version of my input array. Then I created a DmoResampler from this WriteableBufferingSource and my outputFormat and read through it, but the result has nothing to do with what's expected (I expected values under 1 and got values in the thousands, 25000 or so). Any hints?
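For reference, here is one way the float-array route described above could look. This is an untested sketch under several assumptions: the input is mono 32-bit float at a known rate; `ResampleFloats`, `inputRate` and `outputRate` are hypothetical names; the `WriteableBufferingSource(WaveFormat, int)` constructor and the `FillWithZeros` property behave the way I understand them from the CSCore source; and the code runs on Windows, since DmoResampler wraps a Windows component. The detail that matters most is declaring the input format as IeeeFloat and reading the resampler's output back as floats. If either side is treated as 16-bit PCM, values far outside [-1, 1] would be expected.

```csharp
using System;
using System.Collections.Generic;
using CSCore;
using CSCore.DSP;
using CSCore.Streams;

public static class FloatResampleSketch
{
    // Hypothetical helper: resamples mono float samples from inputRate to outputRate.
    public static float[] ResampleFloats(float[] input, int inputRate, int outputRate)
    {
        // Describe the raw bytes as 32-bit IEEE float, mono.
        var inputFormat = new WaveFormat(inputRate, 32, 1, AudioEncoding.IeeeFloat);

        // Reinterpret the floats as bytes without changing their values.
        var bytes = new byte[input.Length * sizeof(float)];
        Buffer.BlockCopy(input, 0, bytes, 0, bytes.Length);

        // Size the buffer to hold the whole array (assumed constructor overload).
        var source = new WriteableBufferingSource(inputFormat, bytes.Length)
        {
            FillWithZeros = false // let Read return 0 once the data is consumed
        };
        source.Write(bytes, 0, bytes.Length);

        var result = new List<float>();
        using (var resampler = new DmoResampler(source, outputRate))
        {
            // Keep the chunk a multiple of 4 bytes so it always holds whole floats.
            var chunk = new byte[4096];
            var floats = new float[chunk.Length / sizeof(float)];
            int read;
            while ((read = resampler.Read(chunk, 0, chunk.Length)) > 0)
            {
                // The output keeps the IeeeFloat encoding, so copy bytes back into floats.
                Buffer.BlockCopy(chunk, 0, floats, 0, read);
                for (int i = 0; i < read / sizeof(float); i++)
                    result.Add(floats[i]);
            }
        }
        return result.ToArray();
    }
}
```

A call would look like `FloatResampleSketch.ResampleFloats(samples, 44100, 16000)`, with both rates being placeholders.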

filoe commented 2 years ago

Could you post an example?

On 25.10.2021 at 15:01, ADD-eNavarro @.*** wrote:


ADD-eNavarro commented 2 years ago

Sure!
Here's how I produce the array:

[Image: function]

And here the result and expectation:

[Image: Result]

filoe commented 2 years ago

What is reader.WaveFormat.BitsPerSample? What is the BitsPerSample of your "expected result"? And what is the encoding of it? IeeeFloat?

You may want to try something easier.

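// Resample to the target rate, then expose the result as 32-bit float samples.
var resampledSource = reader.ChangeSampleRate(Hyperparameters.SamplingRate)
    .ToSampleSource();

// The buffer holds SampleRate floats, so a single Read returns at most one second of mono audio.
float[] buffer = new float[resampledSource.WaveFormat.SampleRate];
resampledSource.Read(buffer, 0, buffer.Length);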

I do not have an editor here and I have not tested it, but you should get an idea of what I mean.

This resamples the input wave and converts the output to an IeeeFloat format (the SampleSource is always IeeeFloat, the WaveSource is Pcm).
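The format question matters because the two encodings use different scales. As a rough illustration in plain C# (no CSCore calls; `Pcm16ToFloat` is a hypothetical helper name), 16-bit PCM samples are integers in [-32768, 32767], while IeeeFloat samples normally lie in [-1, 1]. Dividing by 32768 is one common convention for mapping the former to the latter, and a value such as 25000 then lands well under 1. Whether this is the actual cause of the mismatch reported above is an assumption; the thread does not confirm it.

```csharp
using System;

public static class PcmScaleSketch
{
    // Hypothetical helper: maps 16-bit PCM samples to floats in [-1, 1).
    public static float[] Pcm16ToFloat(short[] pcm)
    {
        var result = new float[pcm.Length];
        for (int i = 0; i < pcm.Length; i++)
            result[i] = pcm[i] / 32768f; // 32768 = 2^15, the magnitude of short.MinValue
        return result;
    }
}
```

For example, `PcmScaleSketch.Pcm16ToFloat(new short[] { 0, 25000, -32768 })` yields values of 0, roughly 0.763, and -1.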

ADD-eNavarro commented 2 years ago

Yes, sorry for the lack of context. reader.WaveFormat.BitsPerSample = 16. My input audio will be a wav file, which I will load and convert to mono (and from there extract the raw audio as an array of 32-bit floats; that's a requirement) before the resampling step. As of now, reader is an NAudio.Wave.WaveFileReader, since I started tinkering with NAudio and managed to get exactly the same output as the original code. One option would be to load the file with CSCore, check whether the loaded result is the same when converted to a float array, and do the resampling from there. I'm looking into this. Right now, reader doesn't have a ChangeSampleRate method.

filoe commented 2 years ago

I would suggest not mixing libraries. You can use the WaveFileReader or the CodecFactory to open the file or any stream.

You can then chain up all sources within cscore. There are also extension methods (the FluentExtensions class in the CSCore namespace) that make it easier to chain up the sources; the ChangeSampleRate method is one of them.
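A chained pipeline along these lines might look as follows. This is an untested sketch: `OpenMonoResampled` is a hypothetical helper, the path and target rate are supplied by the caller, and the extension methods used (`ToMono`, `ChangeSampleRate`, `ToSampleSource`) are the fluent extensions as I understand them, so check them against the CSCore version in use.

```csharp
using CSCore;
using CSCore.Codecs;

public static class ChainSketch
{
    // Hypothetical helper: opens a file and returns mono float samples at targetRate.
    public static ISampleSource OpenMonoResampled(string path, int targetRate)
    {
        // CodecFactory selects a decoder for the given file.
        IWaveSource source = CodecFactory.Instance.GetCodec(path);

        return source
            .ToMono()                      // downmix to one channel
            .ChangeSampleRate(targetRate)  // resample to the requested rate
            .ToSampleSource();             // expose the samples as 32-bit floats
    }
}
```

Because every step stays inside CSCore, no manual byte or float conversion is needed between the file reader and the resampler.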

From: ADD-eNavarro @.> Sent: Thursday, 28 October 2021 10:15 To: filoe/cscore @.> Cc: Florian @.>; Comment @.> Subject: Re: [filoe/cscore] How to resample when input is an array of floats. (Issue #462)


ADD-eNavarro commented 2 years ago

Exactly, I was just looking into that (getting rid of NAudio in the process). I ended up using what you suggest (CodecFactory to open the file, .ToSampleSource and then applying .ChangeSampleRate). The result is close enough to what I need. I still need to work with a stream of floats, but I believe I can work from here. Thanks a lot!
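For anyone landing here later: the snippet earlier in the thread reads a single buffer, so getting the whole file as one float array needs a loop. A minimal sketch, assuming only that the source exposes a `Read(float[] buffer, int offset, int count)` method that returns the number of samples read and 0 at the end of the data (`ReadAll` and the chunk size are illustrative names and values, not part of CSCore):

```csharp
using System;
using System.Collections.Generic;

public static class SampleDrainSketch
{
    // Hypothetical helper: calls read(buffer, offset, count) until it returns 0
    // and concatenates everything that was read.
    public static float[] ReadAll(Func<float[], int, int, int> read, int chunkSize = 4096)
    {
        var all = new List<float>();
        var chunk = new float[chunkSize];
        int n;
        while ((n = read(chunk, 0, chunk.Length)) > 0)
        {
            for (int i = 0; i < n; i++)
                all.Add(chunk[i]);
        }
        return all.ToArray();
    }
}
```

With a CSCore sample source this would presumably be called as `SampleDrainSketch.ReadAll(resampledSource.Read)`. It only terminates for sources that eventually report 0 samples, so it is not suitable for live or endless streams.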