dhgrs / chainer-ClariNet

A Chainer implementation of ClariNet.

Is it possible to supply acoustic features from outside the script? #11

Closed wada-s closed 6 years ago

wada-s commented 6 years ago


I would like to experiment with other acoustic features that I have already extracted from speech. I have tried several approaches, but so far none of them works well. Could you tell me if there is a good way to do this?

dhgrs commented 6 years ago

The preprocessing class and function are in utils.py. Please modify them there. https://github.com/dhgrs/chainer-ClariNet/blob/71a3c8b443159aa008955d3a1538d82294852e22/AutoregressiveWaveNet/utils.py#L63-L69 You can replace the spectrogram with your own acoustic features. For example, if the features are stored as .npy files in the same directory as the .wav files:

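# Swap only the file extension; path.replace('wav', 'npy') would also rewrite a directory named "wav".
feature = numpy.load(os.path.splitext(path)[0] + '.npy')  # needs `import os`
return raw[:, :-1], feature, raw[:, 1:]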

You have to be careful about alignment. In my implementation, random clipping or padding is applied to the raw audio. https://github.com/dhgrs/chainer-ClariNet/blob/71a3c8b443159aa008955d3a1538d82294852e22/AutoregressiveWaveNet/utils.py#L34-L44 So you have to apply the same clipping/padding to your features, using the same index.
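As a rough illustration of what "same index" could mean in practice, here is a minimal sketch that clips or pads the raw audio and the features together. It is not the code from utils.py. The function name `clip_or_pad_aligned`, its parameters, and the layout it assumes (features shaped `(n_dims, n_frames)`, exactly one frame per `hop_length` samples, zero padding for both arrays) are all assumptions made for this example. Adapt them to how your features were actually extracted.

```python
import numpy as np


def clip_or_pad_aligned(raw, feature, length, hop_length, rng):
    """Clip or pad raw audio and frame-level features with one shared offset.

    Hypothetical helper, not part of chainer-ClariNet.
    raw:     (n_samples,) waveform
    feature: (n_dims, n_frames), assumed one frame per hop_length samples
    length:  target number of samples (rounded down to whole frames)
    """
    n_frames = length // hop_length
    length = n_frames * hop_length  # keep a whole number of frames

    # Only use the part covered by both the audio and the features.
    total_frames = min(feature.shape[1], len(raw) // hop_length)
    raw = raw[:total_frames * hop_length]
    feature = feature[:, :total_frames]

    if total_frames <= n_frames:
        # Too short: zero-pad both at the end (zero may not mean
        # "silence" for your features; pick a suitable pad value).
        raw = np.pad(raw, (0, length - len(raw)))
        feature = np.pad(feature, ((0, 0), (0, n_frames - total_frames)))
    else:
        # Too long: choose the start in frames, then convert it to
        # samples so both arrays are cut at the same position.
        start_frame = int(rng.integers(0, total_frames - n_frames + 1))
        start = start_frame * hop_length
        raw = raw[start:start + length]
        feature = feature[:, start_frame:start_frame + n_frames]
    return raw, feature
```

Choosing the random start in frames rather than in samples keeps the cut on a frame boundary, so frame `i` of the clipped features still describes samples `i * hop_length` onward of the clipped audio. You could call it as `clip_or_pad_aligned(raw, feature, length, hop_length, np.random.default_rng())`.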