jstrait / wavefile

A Ruby gem for reading and writing sound files in Wave format (*.wav)
https://wavefilegem.com
MIT License

Enable resampling #1

Open padde opened 12 years ago

padde commented 12 years ago

Consider the following code:

require 'wavefile'
include WaveFile

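# 1 second of a 441 Hz square wave (44100 Hz / 100 samples per cycle)
format = Format.new(:mono, 16, 44100)
writer = Writer.new("square.wav", format)
# 16-bit samples range from -32768 to 32767, so the positive peak is 2**15 - 1
cycle = ([2**15 - 1] * 50) + ([-2**15] * 50)
buffer = Buffer.new(cycle, format)
441.times do
  writer.write(buffer)
end
writer.close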

# square.wav now has 44100 samples

# read square wave in at half the sampling rate
samples = []
format = Format.new(:mono, 16, 22050)
Reader.new("square.wav", format).each_buffer(1024) do |buffer|
  samples.concat(buffer.samples)
end

puts "#{samples.length} samples read"
# outputs "44100 samples read"

I would expect to get 22050 samples instead of 44100, i.e. a resampled version of the file. Am I the only one who feels this way?

jamestunnell commented 11 years ago

I agree that some kind of resampling behaviour would be desirable. But I don't think it needs to be tied up in the Reader class. It could be provided by a utility class instead.

Even as a separate class, sample rate conversion seems quite involved to do right, judging from a bit of reading on Wikipedia.
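To make the "involved to do right" point concrete, here is a minimal sketch of the naive approach: linear interpolation over a mono array of integer samples. `resample_linear` is a hypothetical helper written for illustration, not part of the wavefile gem, and it deliberately omits the low-pass (anti-aliasing) filter that a proper downsampler needs.

```ruby
# Naive sample rate conversion by linear interpolation (mono, integer samples).
# `resample_linear` is a hypothetical helper, not part of the wavefile gem.
def resample_linear(samples, from_rate, to_rate)
  return [] if samples.empty?

  out_length = (samples.length * to_rate.to_f / from_rate).round
  step = from_rate.to_f / to_rate # input samples advanced per output sample

  Array.new(out_length) do |i|
    pos = i * step
    left = pos.floor
    right = [left + 1, samples.length - 1].min # clamp at the final sample
    frac = pos - left
    (samples[left] * (1.0 - frac) + samples[right] * frac).round
  end
end

# One second of the square wave from the original post: 441 cycles of 100 samples
cycle = ([32767] * 50) + ([-32768] * 50)
samples = cycle * 441

halved = resample_linear(samples, 44100, 22050)
puts "#{halved.length} samples after resampling"
```

With an exact 2:1 ratio this reduces to dropping every other sample. A square wave has harmonics above the new Nyquist frequency (11025 Hz), so without a filter those harmonics alias into the output, which is part of why a correct implementation is more work than it first appears.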

padde commented 11 years ago

I guess this is hard to do in plain Ruby, anyway. But libsamplerate (http://www.mega-nerd.com/SRC/) looks promising!

jamestunnell commented 11 years ago

That does look like a good library, and I bet the code could be ported to Ruby without too much trouble.

Unfortunately, the license is GPL, which is rather restrictive and I think totally incompatible with MIT. Too bad...

jamestunnell commented 11 years ago

This paper looks pretty promising: "THE QUEST FOR THE PERFECT RESAMPLER" by Laurent de Soras http://ldesoras.free.fr/doc/articles/resampler-en.pdf

jamestunnell commented 11 years ago

Actually, this one looks even better: "Polynomial Interpolators for High-Quality Resampling of Oversampled Audio" by Olli Niemitalo http://www.student.oulu.fi/~oniemita/dsp/deip.pdf

The paper discusses a hybrid solution: first oversample the audio with discrete methods (an FIR filter), then interpolate with a polynomial interpolator.

I think I'll try to get such a hybrid method implemented as part of https://github.com/jamestunnell/spcore
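As an illustration of the polynomial interpolation stage only, here is a sketch of a common 4-point, 3rd-order Hermite (Catmull-Rom) interpolator. The method names `hermite_4pt` and `sample_at` are made up for this example; this is not claimed to be what spcore implements, the FIR oversampling stage is left out entirely, and the input is assumed to be a non-empty array of floats.

```ruby
# 4-point, 3rd-order Hermite (Catmull-Rom) interpolation between y0 and y1.
# ym1, y0, y1, y2 are four consecutive samples; x is the fractional
# position between y0 and y1, in the range 0.0..1.0.
def hermite_4pt(ym1, y0, y1, y2, x)
  c0 = y0
  c1 = 0.5 * (y1 - ym1)
  c2 = ym1 - 2.5 * y0 + 2.0 * y1 - 0.5 * y2
  c3 = 0.5 * (y2 - ym1) + 1.5 * (y0 - y1)
  ((c3 * x + c2) * x + c1) * x + c0
end

# Evaluate a non-empty float sample array at a fractional index.
# Neighbours that fall outside the array are clamped to the edge samples.
def sample_at(samples, pos)
  i = pos.floor
  x = pos - i
  last = samples.length - 1
  pick = ->(k) { samples[k.clamp(0, last)] }
  hermite_4pt(pick.(i - 1), pick.(i), pick.(i + 1), pick.(i + 2), x)
end

ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
puts sample_at(ramp, 1.5) # halfway between 1.0 and 2.0 on a straight line
```

The interpolator passes exactly through the original samples (x = 0 returns y0, x = 1 returns y1), so it can be evaluated at whatever fractional positions the target sample rate requires once the signal has been oversampled.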

jamestunnell commented 11 years ago

I added resampling functions to my all-ruby signal processing library, spcore (see https://github.com/jamestunnell/spcore). With a little bit of application logic you could add a resampling feature without too much trouble.

jstrait commented 2 years ago

To celebrate the 10th anniversary of this issue being opened I thought I would chime in! 😉

This issue raises a good point: I agree it's confusing that it's possible to convert the channel count or sample format of a sample buffer, but not the sample rate. However, I don't think adding support for resampling would be the right thing to do.

My understanding (not an expert) is that there's not a single canonical way to resample, and that it is somewhat complicated compared to changing the number of channels or sample format. I wouldn't want to make an assumption about what resampling method to use or add the relatively large amount of code that would have to be maintained and that I wouldn't be 100% confident about getting right.

Although a 3rd party library like spcore could be used to handle resampling, I think resampling is something better handled outside of the wavefile gem. I think conceptually this gem is better served focusing on the low level details of shuttling data to/from a *.wav file, and not being involved with anything DSP related. If I had a time machine I might even go back and not add the existing channel count/sample format conversions. Although it is convenient to be able to transparently convert channel count/sample formats, it leads to a slippery slope toward adding functionality that maybe belongs elsewhere, and has led to an API inconsistency.

That said, that ship has sailed and I don't think it would be a good idea to remove the conversion functionality that already exists. Perhaps in the future a different API could be added that makes it clearer that the sample rate is an informational field (like the Format::speaker_mapping field that was added in v1.0.0) and isn't "convertible". I'm not sure what that API would be though, and it would need to be something that can co-exist alongside the current API. Since the Format class is pretty central to the gem, changing the API in a backwards incompatible way would break the code of a large number of people (most?) already using the gem.

I'll keep this issue open in case anyone has any further comments, and will close it if there is no response after some time.
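The idea of the sample rate as an informational field can be illustrated with a little arithmetic on the file from the original post. The sketch below assumes the same 44100 samples are simply relabeled with a different rate rather than resampled; `duration_seconds` and `tone_frequency` are hypothetical helpers, not gem methods.

```ruby
# What a fixed set of samples "means" under different sample rate labels.
# Both helpers are hypothetical and exist only for this illustration.
def duration_seconds(sample_count, sample_rate)
  sample_count.to_f / sample_rate
end

def tone_frequency(sample_rate, samples_per_cycle)
  sample_rate.to_f / samples_per_cycle
end

sample_count = 44_100    # samples in square.wav from the original post
samples_per_cycle = 100  # 50 high samples followed by 50 low samples

[44_100, 22_050].each do |rate|
  seconds = duration_seconds(sample_count, rate)
  hertz = tone_frequency(rate, samples_per_cycle)
  puts "#{rate} Hz label: #{seconds} s long, #{hertz} Hz tone"
end
```

Relabeling the unchanged samples as 22050 Hz would double the duration and halve the pitch. Keeping the same duration and pitch at the new rate is what requires actual resampling.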