Open turian opened 3 months ago
If you mean a music embedding, you can use the standalone MERT model; there is no need to run the whole MU-LLaMA model. If you want exactly the embedding that our pipeline generates, you can modify the model code yourself so that it returns the embedding.
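For the standalone MERT route, a minimal sketch might look like the following. It assumes the public `m-a-p/MERT-v1-330M` checkpoint on Hugging Face (check the repo for the exact checkpoint the pipeline uses), plus the `torch`, `torchaudio`, and `transformers` packages. The helper names `pool_hidden_states` and `embed_audio` are made up for illustration, and averaging over layers and time is just one simple pooling choice, not necessarily what MU-LLaMA does internally.

```python
import numpy as np

# Assumption: public MERT checkpoint; swap in the one your setup actually uses.
MODEL_ID = "m-a-p/MERT-v1-330M"


def pool_hidden_states(hidden_states: np.ndarray) -> np.ndarray:
    """Average [layers, time, dim] features into a single [dim] vector."""
    if hidden_states.ndim != 3:
        raise ValueError("expected an array of shape [layers, time, dim]")
    return hidden_states.mean(axis=(0, 1))


def embed_audio(path: str) -> np.ndarray:
    """Return one embedding vector for the audio file at `path` (hypothetical helper)."""
    # Heavy dependencies are imported here so the pooling helper stays lightweight.
    import torch
    import torchaudio
    from transformers import AutoModel, Wav2Vec2FeatureExtractor

    processor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True).eval()

    waveform, sample_rate = torchaudio.load(path)  # [channels, samples]
    mono = waveform.mean(dim=0)                    # downmix to mono
    target_rate = processor.sampling_rate
    if sample_rate != target_rate:
        mono = torchaudio.functional.resample(mono, sample_rate, target_rate)

    inputs = processor(mono.numpy(), sampling_rate=target_rate, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)

    # hidden_states is a tuple with one [1, time, dim] tensor per layer.
    stacked = torch.stack(outputs.hidden_states).squeeze(1)  # [layers, time, dim]
    return pool_hidden_states(stacked.numpy())
```

Calling `embed_audio("song.wav")` (the file name is a placeholder) would download the checkpoint on first use and return a single vector. If you need per-layer or per-frame features instead, skip the pooling step and work with the stacked hidden states directly.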
How can I use your model to get an embedding JUST of the audio file?