chunping-xt closed this issue 4 months ago
@chunping-xt
We can use any sample rate, because we only use the encoder and decoder from Encodec (this works for me):
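from audiolm_pytorch import EncodecWrapper
import torchaudio as ta
from einops import rearrange
import IPython.display as ipd

encodec = EncodecWrapper()
# ta.load returns (channels, time); for a mono file the first axis acts as the batch axis below.
audio, sr = ta.load("")
print(sr)  # sr == 44100
# Insert the channel axis the encoder expects: (b, t) -> (b, channels, t).
codecs = encodec.model.encoder(rearrange(audio, f'b t -> b {encodec.model.channels} t'))
# The encoder output is (b, dim, frames); transpose to (b, frames, dim) before decoding.
audio_gen = encodec.decode(codecs.transpose(1, 2))
# Play the reconstruction back at the sample rate of the input file.
ipd.display(ipd.Audio(audio_gen.detach().cpu().numpy()[0][0], rate=sr))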
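A rough sanity check on why the sample rate does not matter to the encoder and decoder themselves: they are convolutional, so they only see a sequence of samples, and what changes with the sample rate is how many latent frames one second of audio produces. The sketch below assumes the wrapper loads the 24 kHz Encodec model, whose encoder strides (8, 5, 4, 2) multiply to a hop of 320 samples. The function names are hypothetical, and the frame count is an estimate that ignores any implementation-specific padding.

```python
import math

# Assumption: total encoder stride of the 24 kHz Encodec model (8 * 5 * 4 * 2).
HOP_LENGTH = 8 * 5 * 4 * 2  # 320 samples per latent frame


def frames_per_second(sample_rate: int, hop_length: int = HOP_LENGTH) -> float:
    """Latent frames produced per second of audio at the given sample rate."""
    return sample_rate / hop_length


def num_frames(num_samples: int, hop_length: int = HOP_LENGTH) -> int:
    """Estimated number of latent frames for a clip of num_samples samples."""
    return math.ceil(num_samples / hop_length)


if __name__ == "__main__":
    for sr in (24000, 44100):
        print(sr, frames_per_second(sr), num_frames(sr))
```

So a 44.1 kHz file simply yields more frames per second than a 24 kHz one. Whether the reconstruction quality holds up at a rate the codec was not trained on is a separate question that this sketch does not address.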
Thanks for your very carefully written code. I can train the model smoothly and efficiently, and I have had no problems during training. I have some questions: