facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

Saved files? #6

Open bbecausereasonss opened 1 year ago

bbecausereasonss commented 1 year ago

Awesome work ladies/gents! Where does the gradio interface save generated files?

john-cornell commented 1 year ago

They don't appear to be saved directly. However, you can add the following to download them (this assumes google.colab; elsewhere you'll need another download method, which I'm sure phind.com or GPT can figure out for you).

I largely cobbled this together from https://huggingface.co/spaces/facebook/MusicGen/blob/main/app.py

from tempfile import NamedTemporaryFile
import gradio as gr
from audiocraft.data.audio import audio_write
from google.colab import files

# res is the output of model.generate, as in the code from
# https://colab.research.google.com/drive/1fxGqfg96RBUvGxZ1XXN07s3DthrKUl4-?usp=sharing
output = res.detach().cpu().float()[0]  # first item of the batch
with NamedTemporaryFile("wb", suffix=".wav", delete=False) as file:
    # add_suffix=False writes to exactly file.name instead of appending another ".wav"
    audio_write(file.name, output, model.sample_rate, strategy="loudness", add_suffix=False)
    waveform_video = gr.make_waveform(file.name)  # optional, not needed for the download

print(file.name)  # if you are at all interested
files.download(file.name)
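
Outside Colab, `files.download` is not available. A minimal sketch of one alternative, assuming all you want is the temporary WAV copied somewhere permanent; `keep_copy` and the `outputs` folder are hypothetical names chosen for illustration, not part of audiocraft or Gradio:

```python
import shutil
from pathlib import Path


def keep_copy(temp_path: str, dest_dir: str = "outputs") -> Path:
    """Copy a temporary audio file into dest_dir and return the new path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    target = dest / Path(temp_path).name  # keep the original file name
    shutil.copy2(temp_path, target)  # copy2 also preserves timestamps
    return target
```

With the snippet above, this would be called as `keep_copy(file.name)` after the `with` block has finished writing.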

BigEnding-8347 commented 1 year ago

If you run it locally, the results are in fact saved directly, unchanged, in temporary folders (as of now).

On Windows 11 with miniconda3, look in 'C:\Users\YOURUSERNAMEHERE\AppData\Local\Temp\gradio'. Each subfolder you find there holds a single result.

To find the location on a different OS or in a different environment, right-click the result file in the Gradio page in your browser and choose 'Inspect element'. Hope it helps. 👍
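
If you would rather look the folder up from Python, here is a rough sketch. It assumes the local Gradio app caches results in a `gradio` subfolder of the system temp directory, as in the Windows path above, and that the `GRADIO_TEMP_DIR` environment variable overrides that location when set; both can differ between Gradio versions. `gradio_temp_dir` and `newest_outputs` are hypothetical helper names.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def gradio_temp_dir() -> Path:
    """Best guess at the folder where a local Gradio app caches its files."""
    # Assumption: GRADIO_TEMP_DIR takes precedence when it is set.
    override = os.environ.get("GRADIO_TEMP_DIR")
    if override:
        return Path(override)
    # Otherwise assume a "gradio" folder inside the system temp directory.
    return Path(tempfile.gettempdir()) / "gradio"


def newest_outputs(root: Path, limit: int = 5) -> List[Path]:
    """Return up to `limit` audio/video files under root, newest first."""
    if not root.is_dir():
        return []
    wanted = {".wav", ".mp3", ".mp4"}
    found = [p for p in root.rglob("*") if p.suffix.lower() in wanted]
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)[:limit]
```

Calling `newest_outputs(gradio_temp_dir())` right after a generation should list the most recent results first, provided the assumptions above hold for your install.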

SatyaDewangan05 commented 11 months ago

You can redefine the function display_audio in audiocraft.utils.notebook.

First, install PyDub:

import locale
locale.getpreferredencoding = lambda: "UTF-8"
!pip install PyDub


Now redefine the function

import io

import torch
from pydub import AudioSegment

try:
    import IPython.display as ipd  # type: ignore
except ImportError:
    # Not in a notebook...
    pass


def display_audio(samples: torch.Tensor, sample_rate: int):
    """Renders an audio player for the given audio samples and saves each one as MP3.

    Args:
        samples (torch.Tensor): a Tensor of decoded audio samples
            with shapes [B, C, T] or [C, T]
        sample_rate (int): sample rate audio should be displayed with.
    """
    assert samples.dim() == 2 or samples.dim() == 3

    samples = samples.detach().cpu()
    if samples.dim() == 2:
        samples = samples[None, ...]
    for count, audio in enumerate(samples, start=1):
        file_name_mp3 = f'music_{count:03d}.mp3'  # audio file_name

        player = ipd.Audio(audio, rate=sample_rate)  # display audio in HTML
        ipd.display(player)

        # player.data holds WAV-encoded bytes, so let pydub read the sample rate,
        # sample width and channel count from the WAV header instead of hard-coding them
        segment = AudioSegment.from_file(io.BytesIO(player.data), format="wav")
        segment.export(file_name_mp3, format="mp3", bitrate="64k")  # save audio (needs ffmpeg)


Now every time you call display_audio, it saves the audio as music_001.mp3, music_002.mp3, and so on. The counter restarts on each call, so files from an earlier call are overwritten.

display_audio(res, 32000)


You can make the file names unique by generating file_name_mp3 with uuid.
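
A minimal sketch of the uuid idea, using only the standard library. `unique_mp3_name` is a hypothetical helper, not something audiocraft provides; its return value would replace the `music_00{count}.mp3` name inside display_audio.

```python
import uuid


def unique_mp3_name(prefix: str = "music") -> str:
    """Build an MP3 file name with a random UUID so repeated calls never collide."""
    # uuid4().hex is 32 random hexadecimal characters without dashes
    return f"{prefix}_{uuid.uuid4().hex}.mp3"
```

Random names avoid overwriting earlier results, at the cost of losing the generation order in the file name; adding a timestamp to the prefix is one way to keep both.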

jaminmc commented 7 months ago

On my Mac, it saved them in the folder /private/var/folders/sq/05t2tpgd3nvbzklh3n827y2c0000gn/T