Open secslim opened 1 month ago
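I'm hitting the same error: `assert 2 <= window_size <= len(waveform), "choose a window size {} that is [2, {}]".format(` followed by `AssertionError: choose a window size 400 that is [2, 160]`.
Environment: OS: Linux, FunASR version: 1.1.6, PyTorch version: 2.3.1.
The error does not occur with a single process; it only appears with multiple processes. Does anyone know roughly where the problem is, and which part of the source needs to be changed?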
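The numbers in the message can be read off directly. As a minimal sketch, assuming 16 kHz audio and a 25 ms frame length (which is what a window size of 400 corresponds to) and assuming the window size is the sample rate times the frame length, `[2, 160]` means the waveform that reached the frontend had only 160 samples, i.e. 10 ms of audio. The helper names below are hypothetical and only mirror the wording of the assertion:

```python
# Sketch of the window-size check. Assumed defaults: 16 kHz audio, 25 ms frames.
def window_size_samples(sample_rate_hz: int, frame_length_ms: float) -> int:
    """Convert a frame length in milliseconds to a window size in samples."""
    return int(sample_rate_hz * frame_length_ms / 1000)


def check_window(num_samples: int, sample_rate_hz: int = 16000,
                 frame_length_ms: float = 25.0) -> None:
    """Raise AssertionError with the same wording as the reported message."""
    window_size = window_size_samples(sample_rate_hz, frame_length_ms)
    assert 2 <= window_size <= num_samples, (
        "choose a window size {} that is [2, {}]".format(window_size, num_samples)
    )


if __name__ == "__main__":
    # 1 s of audio passes; 10 ms and empty input both trip the assertion.
    for n in (16000, 160, 0):
        try:
            check_window(n)
            print(n, "samples: ok")
        except AssertionError as err:
            print(n, "samples:", err)
```

So the question to chase is why the multi-process setup hands the frontend such a short (or empty) waveform, not the window size itself.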
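Fix, method 1: upgrade funasr to version 1.1.6, then patch the frontend.
Open `funasr/frontends/wav_frontend.py`, find the `forward` method of the `WavFrontend` class, go to line 137, and change the `kaldi.fbank` call to:

```python
mat = kaldi.fbank(
    waveform,
    num_mel_bins=self.n_mels,
    # Cap the frame length (in ms) at the clip duration so the
    # analysis window is never longer than the waveform itself.
    frame_length=min(self.frame_length, float(waveform_length) / self.fs * 1000),
    frame_shift=self.frame_shift,
    dither=self.dither,
    energy_floor=0.0,
    window_type=self.window,
    sample_frequency=self.fs,
    snip_edges=self.snip_edges,
)
```

This worked in my setup.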
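To see what the capped `frame_length` does and does not cover, here is a rough sketch in plain Python. It assumes 16 kHz audio, a 25 ms default frame length, and the same window-size formula as the assertion; the function names are hypothetical. Capping makes short clips pass, but a clip with fewer than 2 samples still fails the `2 <= window_size` half of the check, so empty input needs a separate guard:

```python
def clamped_frame_length_ms(frame_length_ms: float, num_samples: int,
                            sample_rate_hz: int) -> float:
    """Frame length in ms, capped at the duration of the clip."""
    audio_ms = num_samples / sample_rate_hz * 1000
    return min(frame_length_ms, audio_ms)


def window_fits(num_samples: int, sample_rate_hz: int = 16000,
                frame_length_ms: float = 25.0) -> bool:
    """True if the capped window satisfies 2 <= window_size <= num_samples."""
    frame_ms = clamped_frame_length_ms(frame_length_ms, num_samples, sample_rate_hz)
    window_size = int(sample_rate_hz * frame_ms / 1000)
    return 2 <= window_size <= num_samples


if __name__ == "__main__":
    for n in (16000, 160, 1, 0):
        print(n, "samples ->", "ok" if window_fits(n) else "still fails")
```

In other words, the patch handles the `[2, 160]` case, while a `[2, 0]` report would still need the caller to skip or reject the empty clip.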
Method 2: MyWavFrontend
Alternatively, if you don't want to modify the funasr package directly, you can create a new `MYWavFrontend` class:

```python
from typing import Tuple

import torch
import torchaudio.compliance.kaldi as kaldi
from torch.nn.utils.rnn import pad_sequence

# Import paths as laid out in funasr 1.1.x; adjust if your version differs.
from funasr.frontends.wav_frontend import WavFrontend, apply_cmvn, apply_lfr
from funasr.register import tables


@tables.register("frontend_classes", "my_wav_frontend")
@tables.register("frontend_classes", "MyWavFrontend")
class MYWavFrontend(WavFrontend):
    """Conventional frontend structure for ASR."""

    def forward(
        self,
        input: torch.Tensor,
        input_lengths,
        **kwargs,
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        batch_size = input.size(0)
        feats = []
        feats_lens = []
        for i in range(batch_size):
            waveform_length = input_lengths[i]
            waveform = input[i][:waveform_length]
            # `upsacle_samples` is the attribute name as spelled in WavFrontend.
            if self.upsacle_samples:
                waveform = waveform * (1 << 15)
            waveform = waveform.unsqueeze(0)
            mat = kaldi.fbank(
                waveform,
                num_mel_bins=self.n_mels,
                # Was: frame_length=self.frame_length.
                # Cap the frame length (in ms) at the clip duration.
                frame_length=min(self.frame_length, float(waveform_length) / self.fs * 1000),
                frame_shift=self.frame_shift,
                dither=self.dither,
                energy_floor=0.0,
                window_type=self.window,
                sample_frequency=self.fs,
                snip_edges=self.snip_edges,
            )
            if self.lfr_m != 1 or self.lfr_n != 1:
                mat = apply_lfr(mat, self.lfr_m, self.lfr_n)
            if self.cmvn is not None:
                mat = apply_cmvn(mat, self.cmvn)
            feat_length = mat.size(0)
            feats.append(mat)
            feats_lens.append(feat_length)

        feats_lens = torch.as_tensor(feats_lens)
        if batch_size == 1:
            feats_pad = feats[0][None, :, :]
        else:
            feats_pad = pad_sequence(feats, batch_first=True, padding_value=0.0)
        return feats_pad, feats_lens
```
Then, in the model config file speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/config.yaml, change WavFrontend to MyWavFrontend.
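One thing to keep in mind with method 2: a registration decorator only runs when the module that defines the class is imported, so the file containing the subclass presumably has to be imported before the model is built from config.yaml. The sketch below is a generic illustration of that pattern with a hypothetical `REGISTRY` and `register` helper; it is not FunASR's actual `tables` implementation:

```python
from collections import defaultdict

# Hypothetical stand-in for a framework's name-to-class tables.
REGISTRY = defaultdict(dict)


def register(table: str, key: str):
    """Return a class decorator that stores the class under table/key."""
    def decorator(cls):
        REGISTRY[table][key] = cls
        return cls
    return decorator


class WavFrontend:
    """Placeholder for the library's frontend base class."""


# Registration happens here, at import time of this module.
@register("frontend_classes", "my_wav_frontend")
@register("frontend_classes", "MyWavFrontend")
class MYWavFrontend(WavFrontend):
    """Placeholder for the patched subclass."""


def build(table: str, key: str):
    """Look up a class by the name given in a config file and instantiate it."""
    return REGISTRY[table][key]()


if __name__ == "__main__":
    frontend = build("frontend_classes", "MyWavFrontend")
    print(type(frontend).__name__)
```

If the lookup by name fails after editing config.yaml, a missing import of the module that defines the subclass is the first thing to check.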
Thanks, I'll give it a try.
(Sent by email in reply to Nixon's comment of 27 August 2024, 18:05, on modelscope/FunASR issue #2005, "Offline file recognition with Python FastAPI fails with choose a window size 400 that is [2, 0]".)
🐛 Bug
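I'm using FunASR/runtime/python/http/server.py for offline file recognition. The server runs with two worker processes:

```python
uvicorn.run(
    app="fun_test:app",
    host=args.host,
    port=args.port,
    ssl_keyfile=args.keyfile,
    ssl_certfile=args.certfile,
    workers=2,
)
```

When two users send requests at the same time, the following error is raised:

```
choose a window size 400 that is [2, 0]
```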
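The `[2, 0]` in this message is the waveform length, so the frontend was handed a clip with zero samples. Until the cause is found, the server could reject such input before recognition. The following is only a sketch under stated assumptions: it assumes the upload is a PCM WAV file and a 25 ms frame length, uses only the standard-library `wave` module, and all function names are hypothetical rather than part of server.py:

```python
import io
import wave
from typing import Tuple


def wav_length(wav_bytes: bytes) -> Tuple[int, int]:
    """Return (number of sample frames, sample rate) of a PCM WAV payload."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes(), wav.getframerate()


def long_enough(wav_bytes: bytes, frame_length_ms: float = 25.0) -> bool:
    """True if the clip holds at least one full analysis window."""
    num_samples, sample_rate_hz = wav_length(wav_bytes)
    return num_samples >= int(sample_rate_hz * frame_length_ms / 1000)


def make_silence(num_samples: int, sample_rate_hz: int = 16000) -> bytes:
    """Build a mono 16-bit WAV of silence, used here only for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * num_samples)
    return buf.getvalue()


if __name__ == "__main__":
    for n in (16000, 160, 0):
        verdict = "accept" if long_enough(make_silence(n)) else "reject"
        print(n, "samples:", verdict)
```

A check like this would run on the uploaded bytes before they are passed to the model, returning an error response instead of letting the assertion fire inside the frontend.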
Environment