Closed f0k closed 5 years ago
I think that's a great idea, actually that's also closer to our original idea - to have MelFilterbank
class but I didn't come up with composing them. Which sounds even better.
yes, we discussed this before but didn't think that far ;-) I like your alternative as it would make separation/spectogram filtering tasks more elegant such as
X = stft(x)
Y_hat = model(spectrogram(X))
have convenience functions such as spectrogram and mfcc that return a nn.Sequential composition of the models.
yes, sounds really nice, indeed. However it would add a lot of overhead, when eg. spectrogram(X)
just would return the modulus, no?
Cool! having composition would also make adding the phase vocoder easier.
I like your alternative as it would make separation/spectogram filtering tasks more elegant
Ah yes, of course, that's a good use case. Sometimes you will want to have both the complex and the magnitude spectrogram, or both the full and the mel-scaled spectrogram, so it's good to have the transformations available separately.
X = stft(x) Y_hat = model(spectrogram(X))
I assume you're talking of the functional interface now. It would still be X = spectrogram(x)
, but also enable X = magnitude(stft(x))
or X = power(stft(x))
.
For the nn.Module
interface, we would have convenience functions like:
def melspectrogram(all, the, parameters):
return nn.Sequential([STFT(some, parameters), Magnitude(), MelFilterbank(some, other, parameters)])
Or maybe:
def melspectrogram(all, the, parameters):
return nn.Sequential([spectrogram(some, parameters), MelFilterbank(some, other, parameters)])
Hmm, the compositions will become a bit deep when they each wrap one more nn.Sequential
. I guess I'm still thinking too much in terms of Lasagne :) So maybe it should be:
def melspectrogram(all, the, parameters):
result = spectrogram(some, parameters)
result.add_module(MelFilterbank(some, other, parameters))
return result
(It's a pity that add_module()
does not return self
.)
The API currently favors inheritance over composition. For example,
Melspectrogram
extendsSpectrogram
, andmelspectrogram
callsspectrogram
. In a similar way,MFCC
would extendMelspectrogram
.This means, for example, that one cannot reuse the
Melspectrogram
implementation to work on a customSTFT
class. I don't have a specific use case in mind, just a sense of a missed opportunity.The alternative would be to have the
STFT
class, aComplexMagnitude
class, aMelFilterbank
class (to be derived from a genericFilterbank
class with corresponding defaults), and aDCT
class, and have convenience functions such asspectrogram
andmfcc
that return ann.Sequential
composition of the models.What do you think?