If I'm right, emotion2vec_plus_seed/base/large works with 5 classes (angry, happy, neutral, sad, unk),
while emotion2vec base/large/base_finetuned works with 9 classes (angry, disgusted, fearful, happy, neutral, other, sad, surprised, unknown).
You can use this pipeline to get 9-class labels:
from funasr import AutoModel

# load the emotion2vec+ large model (downloaded on first use)
model = AutoModel(model="iic/emotion2vec_plus_large")
# example audio shipped with the model
wav_file = f"{model.model_path}/example/test.wav"
# utterance-level emotion recognition, without extracting embeddings
rec_result = model.generate(wav_file, output_dir="./outputs", granularity="utterance", extract_embedding=False)
print(rec_result)
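To pick the single top-scoring emotion from the result, you can do something like this (a minimal sketch, assuming the usual FunASR output format: a list with one dict per utterance whose "labels" and "scores" are parallel lists):

# rec_result[0] holds the labels and their scores for the utterance
labels = rec_result[0]["labels"]
scores = rec_result[0]["scores"]
# take the label with the highest score
top_label, top_score = max(zip(labels, scores), key=lambda pair: pair[1])
print(top_label, top_score)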
It works, thanks a lot!
According to the official description, the emotion2vec_plus model should be able to categorize speech into 9 classes, so why did I actually get only 5 classes (angry, happy, neutral, sad, and unk) when I ran it?
usage:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.emotion_recognition,
    model="iic/emotion2vec_plus_base",
    device="gpu:1")
rec_result = inference_pipeline(
    "/home/data2/tts_data/raw/opensrc/Emotion Speech Dataset/0007/Neutral/0007_000239.wav",
    output_dir="./cosyvoice_embedding",
    granularity="utterance",
    extract_embedding=False)
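A quick way to see how many classes the pipeline actually returns is to print the raw result and count the labels (a sketch, assuming the ModelScope pipeline returns the same list-of-dicts format with "labels"/"scores" as the FunASR call above):

# print the raw result and count the returned classes
print(rec_result)
print(len(rec_result[0]["labels"]))  # the description suggests 9, but I only see 5 here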