modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Apache License 2.0

Question about splitting into subsegs #93

Closed hao-qiang closed 4 months ago

hao-qiang commented 4 months ago

https://github.com/alibaba-damo-academy/3D-Speaker/blob/9a455b3e429519aae91a63f36ae82f9b41423ad5/egs/3dspeaker/speaker-diarization/local/prepare_subseg_json.py#L47

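When splitting into subsegments, the last piece taken at the end of the audio can be shorter than subseg_dur. Should the final subsegment instead be taken backward from the end so that it still spans subseg_dur, i.e. subseg_st = min(ed-subseg_dur, subseg_st)? With the current code, if that final piece turns out to be extremely short, will the embedding model raise an error?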
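To make the proposal concrete, here is a minimal sketch of a sliding-window split with the backward-aligned tail. It is not the code in prepare_subseg_json.py: the function name split_subsegs, the parameter subseg_shift, and the default values are assumptions for illustration only. It also adds a clamp to st, which the formula above does not include, for the case where the whole segment is shorter than subseg_dur.

```python
def split_subsegs(st, ed, subseg_dur=1.5, subseg_shift=0.75):
    """Split the segment [st, ed] (seconds) into windows of subseg_dur.

    Hypothetical helper; names and defaults are illustrative assumptions.
    """
    subsegs = []
    subseg_st = st
    while subseg_st < ed:
        subseg_ed = subseg_st + subseg_dur
        if subseg_ed >= ed:
            # Tail window: move the start backward so the window keeps its
            # full length, i.e. subseg_st = min(ed - subseg_dur, subseg_st).
            # The max() clamp (an addition here) keeps the start inside the
            # segment when the segment is shorter than subseg_dur.
            subseg_st = max(st, min(ed - subseg_dur, subseg_st))
            subsegs.append((round(subseg_st, 3), round(ed, 3)))
            break
        subsegs.append((round(subseg_st, 3), round(subseg_ed, 3)))
        subseg_st += subseg_shift
    return subsegs


if __name__ == "__main__":
    print(split_subsegs(0.0, 4.0))
    print(split_subsegs(0.0, 1.0))
```

For a 4.0 s segment the last window becomes (2.5, 4.0) rather than the 1.0 s piece (3.0, 4.0). A segment that is itself shorter than subseg_dur is still returned as a single short window, so a minimum-length check before embedding extraction may still be needed.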

yfchenlucky commented 4 months ago

This has been fixed. Thanks for the suggestion! https://github.com/alibaba-damo-academy/3D-Speaker/blob/main/egs/3dspeaker/speaker-diarization/local/prepare_subseg_json.py#L47-L48