Help with the speaker diarization feature

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Apache License 2.0

Help with the speaker diarization feature #342

Closed panxin801 closed 1 week ago

panxin801 commented 1 month ago

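Thank you for open-sourcing such excellent work. While using it, I ran into a problem: I ran inference on a multi-speaker recording for the diarization task. The code is as follows: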

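import wespeaker


def main():
    readPath = "./Orphan.mp3"

    # Load the pretrained Chinese model and run diarization on the file.
    model = wespeaker.load_model("chinese")
    diar_result = model.diarize(readPath)
    print(diar_result)


if __name__ == "__main__":
    main()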

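However, in the inference result every speaker is unk. How should I interpret unk? Is it a single speaker named unk, or are all of the speakers unknown and therefore labeled unk? Thank you for your answer.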

JiJiJiang commented 3 weeks ago

Could you check whether the mp3 file is 16 kHz, 16-bit?
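One way to run this check is sketched below. It uses only Python's standard `wave` module, so it assumes the recording has already been converted to a PCM WAV file (the standard library cannot decode mp3). The helper names `read_wav_format` and `is_16k_16bit` are hypothetical and not part of wespeaker.

```python
import wave


def read_wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, channels) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        bits_per_sample = wav.getsampwidth() * 8  # getsampwidth() is in bytes
        channels = wav.getnchannels()
    return sample_rate, bits_per_sample, channels


def is_16k_16bit(path):
    """True if the WAV file is sampled at 16 kHz with 16-bit samples."""
    sample_rate, bits_per_sample, _ = read_wav_format(path)
    return sample_rate == 16000 and bits_per_sample == 16
```

Calling `is_16k_16bit("./Orphan.wav")` on a converted copy of the recording (the file name here is only an illustration) would then report whether the format matches what is being asked about.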

panxin801 commented 3 weeks ago

Thank you for your reply. I will check that first; if that resolves it, I will close the issue myself. Thanks for your help.

xx205 commented 3 weeks ago

You can download voxceleb_resnet34 and put the extracted files into $HOME/.wespeaker/english (on Windows, %homepath%\.wespeaker\english), then rerun the code with model = wespeaker.load_model("english") and see whether the output improves.
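To confirm that the extracted files ended up in the directory named above, one could list its contents before rerunning. This is a minimal sketch: `model_dir` and `list_model_files` are hypothetical helpers, not wespeaker functions, and it assumes `Path.home()` resolves to the same location as `$HOME` (or `%homepath%` on Windows).

```python
from pathlib import Path
from typing import List, Optional


def model_dir(lang: str, home: Optional[Path] = None) -> Path:
    """Build <home>/.wespeaker/<lang>, defaulting to the current user's home."""
    base = home if home is not None else Path.home()
    return base / ".wespeaker" / lang


def list_model_files(lang: str, home: Optional[Path] = None) -> List[str]:
    """Return the sorted file names in the model directory, or [] if it is missing."""
    directory = model_dir(lang, home)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())
```

An empty list from `list_model_files("english")` would suggest that the files were extracted somewhere else or into a nested subfolder.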

JiJiJiang commented 2 weeks ago

Did you set output_file? If so, unk is the default audio name, not the spkid; the spkid is the last column. wespeaker/cli/speaker.py#L204
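The point about the columns can be illustrated with a small sketch that groups segments by the last field. It assumes each entry of the diarization result is a tuple of the form (audio name, start, end, spkid); that layout is inferred from the explanation above and should be verified against the linked source. The sample data and the helper `segments_by_speaker` are hypothetical.

```python
from collections import defaultdict


def segments_by_speaker(diar_result):
    """Group (start, end) segments by speaker id, taken from the last field."""
    grouped = defaultdict(list)
    for utt, start, end, spkid in diar_result:
        # utt is the audio name (e.g. "unk"); it is not a speaker label.
        grouped[spkid].append((float(start), float(end)))
    return dict(grouped)


# Hypothetical result: the first field is the audio name, the last is the spkid.
sample_result = [
    ("unk", 0.0, 2.5, 0),
    ("unk", 2.5, 6.0, 1),
    ("unk", 6.0, 7.5, 0),
]

for spkid, segments in sorted(segments_by_speaker(sample_result).items()):
    print(spkid, segments)
```

In this made-up sample every row carries the name unk, yet the last field still distinguishes two speakers.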

panxin801 commented 1 week ago

"Could you check whether the mp3 file is 16 kHz, 16-bit?"

I am now using a 16 kHz, 16-bit WAV file, and the speaker is still unk for every segment.

panxin801 commented 1 week ago

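"Did you set output_file? If so, unk is the default audio name, not the spkid; the spkid is the last column. wespeaker/cli/speaker.py#L204"

Oh, I did not set output_file, so it looks like the spkid is the last column. Thank you for the explanation.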