facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

Audio files generated exporting the model are not of the same quality as the ones generated during training #428

Open lucacoma opened 4 months ago

lucacoma commented 4 months ago

Hello! Thank you for the nice work and repo!

I've been training MusicGen (small) using DAC at 44 kHz to generate 5-second samples.

I then exported the model following the instructions in this issue: https://github.com/facebookresearch/audiocraft/issues/255 (following the README instead makes no difference). The problem is that the samples generated by the exported model are of much lower quality than the ones generated during training with the same prompts. Am I doing something wrong when exporting or loading the model?

Here is the code snippet I am using to export the model and generate the audio:

from audiocraft.utils import export
from audiocraft import train
import audiocraft.models
from IPython.display import Audio
from audiocraft.models import MusicGen
from audiocraft.solvers import MusicGenSolver
SIG_OF_LM = 'dff1057f'  # signature (hash) of the Dora experiment, cf. https://github.com/facebookresearch/audiocraft/issues/219
solver = MusicGenSolver.get_eval_solver_from_sig(SIG_OF_LM)
solver.model.cfg = solver.cfg
musicgen = MusicGen(name='mymusicgen', compression_model=solver.compression_model, lm=solver.model)
musicgen.set_generation_params(duration=5)  # generate 5 seconds.

descriptions = ['A loop of hats drums instruments in the genre of hip hop ']
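audio_values = musicgen.generate(descriptions)  # generates one sample per description (one here).
# Play the first sample (batch index 0, channel 0) at the model's own sample rate.
Audio(audio_values[0, 0].cpu().numpy(), rate=musicgen.sample_rate)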
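
Before comparing the two sets of samples by ear, it may be worth ruling out a plain sample-rate or duration mismatch between the files written during training and the ones produced by the exported model. The following is only a minimal sketch using Python's standard `wave` module; it assumes both samples were saved as PCM WAV files, and the helper names `wav_info` and `compare_wavs` are hypothetical, not part of audiocraft.

```python
import wave


def wav_info(path):
    """Return basic header fields of a PCM WAV file (hypothetical helper)."""
    with wave.open(str(path), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "sample_rate": rate,
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": frames / rate,
        }


def compare_wavs(path_a, path_b, duration_tol=0.05):
    """Return {field: (value_a, value_b)} for every header field that differs.

    Durations are treated as equal when they are within `duration_tol` seconds.
    """
    a, b = wav_info(path_a), wav_info(path_b)
    diffs = {}
    for key in a:
        if key == "duration_s":
            if abs(a[key] - b[key]) > duration_tol:
                diffs[key] = (a[key], b[key])
        elif a[key] != b[key]:
            diffs[key] = (a[key], b[key])
    return diffs
```

Calling `compare_wavs("train_sample.wav", "exported_sample.wav")` (both file names are placeholders) returns an empty dict when the headers agree. A non-empty result, for example a differing `sample_rate`, would point to a playback or resampling issue rather than to the exported weights.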