Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
from transformers import pipeline
import scipy
synthesiser = pipeline("text-to-audio", "facebook/musicgen-medium")
music = synthesiser("lo-fi music with a soothing melody", forward_params={"do_sample": True})
scipy.io.wavfile.write("musicgen_out.wav", rate=music["sampling_rate"], data=music["audio"])
During execution, the rocm-smi output always looks like the following; no VRAM appears to be used at all:
root@linux-mint:/dockerx/audiocraft/src# rocm-smi
========================= ROCm System Management Interface =========================
=================================== Concise Info ===================================
GPU Temp (DieEdge) AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU%
0 35.0c 12.0W 0Mhz 96Mhz 0% auto 303.0W 3% 0%
====================================================================================
=============================== End of ROCm SMI Log ================================
Environment:
OS: Linux linux-mint 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
ROCm: 6.0
CPU: Ryzen 5950X
GPU: RX 7900 XTX, running in a container
Container: rocm/pytorch:rocm6.0_ubuntu22.04_py3.9_pytorch_2.0.1
When I run the code above, generation is very slow, and the GPU does not seem to be used at all. It takes more than 30 minutes to generate a song!
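One thing that may be worth checking: the pipeline call does not pass a device, and transformers pipelines run on the CPU unless one is given. The sketch below is only a diagnostic under that assumption, not a confirmed fix. It checks whether PyTorch can see the GPU (ROCm builds expose AMD GPUs through the torch.cuda API) and picks a device string accordingly. The helper name pick_device is hypothetical and only used for illustration.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda:0" if PyTorch can see a GPU, otherwise "cpu".

    pick_device is a hypothetical helper for this post, not part of
    transformers or audiocraft.
    """
    # Fall back to CPU if PyTorch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # ROCm builds of PyTorch report AMD GPUs through torch.cuda.
    return "cuda:0" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("device:", device)

# Passing the device explicitly to the pipeline (commented out here
# because it downloads the model weights):
# from transformers import pipeline
# synthesiser = pipeline("text-to-audio", "facebook/musicgen-medium", device=device)
```

If this prints "cpu" inside the container, the GPU is not visible to PyTorch and the problem is in the container or driver setup rather than in the generation code. If it prints "cuda:0", passing device to the pipeline should make VRAM usage show up in rocm-smi.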