facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

Generating audio samples from the same seed? #373

Open timtensor opened 6 months ago

timtensor commented 6 months ago

Hello, I am using the following workflow to generate audio samples with Audiocraft. Basically, I take the output of an LLM (in this case Zephyr 7B, which is based on Mistral) and use it as the input prompt. I am wondering how I can:

  1. Set the seed so that the generations are consistent.
  2. Generate longer samples, for example with the same seed:
    Generate an intro for 30 s with xyz
    Generate a verse for 30 s with zlm

The workflow is listed below.

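!pip install -q --upgrade huggingface_hub git+https://github.com/huggingface/transformers.git
from huggingface_hub import InferenceClient

# HF_TOKEN must already hold a valid Hugging Face access token
client = InferenceClient(model="HuggingFaceH4/zephyr-7b-beta",
                         token=HF_TOKEN)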
prompt = "Dark industrial piano set in the 1960s"
additional_prompt = "Imagine you are a pianist."
new_prompt = additional_prompt + " " + prompt
# Named llm_input to avoid shadowing the built-in input()
llm_input = f"Take the next sentence and enrich it feeling, keep it compact. {prompt}"

output = client.text_generation(llm_input, max_new_tokens = 50)

print(output)
import torch
from transformers import pipeline

vibes = pipeline("text-to-audio",
                 "facebook/musicgen-stereo-medium",
                 torch_dtype=torch.float16,
                 device="cuda")
music_pipe = pipeline("text-to-audio", model="facebook/musicgen-small")

# Feed the LLM-enriched prompt to MusicGen
out = vibes(output)

from IPython.display import Audio
Audio(out["audio"][0], rate=32000)

Any tips would be highly appreciated.
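To make the two questions concrete, here is a minimal sketch of what "same seed per section" could look like. It is not a confirmed Audiocraft or transformers recipe. The helper names (`seconds_to_tokens`, `generate_sections`, `fake_generate`) are hypothetical, and the 50 tokens-per-second figure is an assumption taken from the transformers MusicGen documentation, which states that 256 new tokens correspond to about 5 seconds of audio. A stand-in generator built on Python's `random` module replaces the real pipeline so the snippet runs without a GPU or model download.

```python
import random

# Assumption: MusicGen produces about 50 tokens per second of audio
# (the transformers docs quote 256 new tokens for roughly 5 s).
FRAME_RATE_HZ = 50


def seconds_to_tokens(seconds, frame_rate=FRAME_RATE_HZ):
    """Convert a target duration into a max_new_tokens budget."""
    return int(round(seconds * frame_rate))


def generate_sections(generate, sections, seed, seed_fn=random.seed):
    """Re-seed before every section so each call starts from the same RNG state.

    generate: callable(prompt, max_new_tokens) -> audio (hypothetical wrapper)
    sections: list of (name, prompt, seconds) tuples
    seed_fn:  function that seeds whatever RNG the generator draws from
    """
    results = {}
    for name, prompt, seconds in sections:
        seed_fn(seed)
        results[name] = generate(prompt, seconds_to_tokens(seconds))
    return results


def fake_generate(prompt, max_new_tokens):
    # Stand-in for the real pipeline: draws a few values from the seeded RNG
    # and echoes back the token budget it was given.
    return [random.random() for _ in range(3)], max_new_tokens


plan = [("intro", "xyz", 30), ("verse", "zlm", 30)]
first = generate_sections(fake_generate, plan, seed=42)
second = generate_sections(fake_generate, plan, seed=42)
print(first == second)  # identical seeds give identical stand-in output
```

With the real pipeline, the same structure would presumably use `transformers.set_seed` as `seed_fn` and wrap a call such as `vibes(prompt, forward_params={"do_sample": True, "max_new_tokens": n})`. Two caveats apply: float16 inference on a GPU may not be bit-for-bit reproducible even with a fixed seed, and sharing a seed does not by itself make separately generated sections musically continuous.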