YuanGongND / gopt

Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".
BSD 3-Clause "New" or "Revised" License

Question about the data in step 1 #7

Open YLQY opened 2 years ago

YLQY commented 2 years ago

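Hello, I am using the 762 dataset. I extracted feat.scp with the default conf and then ran python3 local/extract_gop_feats.py, which fails with kaldi_io.kaldi_io.UnknownVectorHeader: The header contained 'CM '. Could you tell me why this happens, and which parameters (mfcc, fbank, etc.) are used for the Kaldi feature extraction? Below is the default parameter configuration for 762.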

--use-energy=false   # use average of log energy, not energy.
--num-mel-bins=40    # similar to Google's setup.
--num-ceps=40        # there is no dimensionality reduction.
--low-freq=20        # low cutoff frequency for mel bins... this is high-bandwidth data, so
                     # there might be some information at the low end.
--high-freq=-400     # high cutoff frequency, relative to Nyquist of 8000 (=7600)


YuanGongND commented 2 years ago

Hi there,

Thanks for reporting this.

kaldi_io.kaldi_io.UnknownVectorHeader: The header contained 'CM '

I am not an expert in kaldi_io, but it seems to be complaining about a compressed matrix. Please check that your kaldi_io version matches ours (see requirements.txt in the repo). I also uploaded a sample scp file at https://www.dropbox.com/s/53b0p8awapt5f22/feat.scp?dl=1 to help you locate the problem: if your script can load the sample scp file, the problem is in your scp file; otherwise it is in your kaldi_io installation. Since I maintain multiple open-source repos, I don't have time to debug every case.
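The two-step check described above can be sketched as a small helper. This is only an illustration, not code from the repo: `diagnose` and its `loader` argument are hypothetical names, and `loader` stands for any callable that reads an scp path and raises on failure.

```python
def diagnose(loader, sample_scp, own_scp):
    """Try the known-good sample scp first, then your own scp.

    `loader` is a hypothetical callable: it takes an scp path and
    raises an exception if the file cannot be read.
    """
    def loads(path):
        try:
            loader(path)
            return True
        except Exception:
            return False

    if not loads(sample_scp):
        # Even the sample fails, so the reader itself is suspect.
        return "kaldi_io problem: the sample scp does not load"
    if not loads(own_scp):
        # The reader works on the sample, so the local file is suspect.
        return "scp problem: the sample loads but your file does not"
    return "both files load"
```

With kaldi_io installed, one plausible `loader` is `lambda p: list(kaldi_io.read_vec_flt_scp(p))`, which forces every entry to be read.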

and which parameters (mfcc, fbank, etc.) are used for the Kaldi feature extraction

I didn't change any hyperparameters in the Kaldi SO762 recipe for the librispeech model. My co-authors did the PAII-A and PAII-B feature extraction.

-Yuan

LyWangPX commented 1 year ago

'CM' can only be read by read_mat. Changing read_vec_flt_scp to read_mat_scp at https://github.com/YuanGongND/gopt/blob/master/src/extract_kaldi_gop/extract_gop_feats.py#L50 fixed this.
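To see which kind of entry an scp line points to before picking a reader, the header token can be inspected directly. The sketch below uses only the standard library and makes two assumptions: the scp line has the form `key path.ark:offset`, and the offset points at Kaldi's binary marker `\0B`, followed by a 3-byte type token. The helper names are hypothetical.

```python
# Type tokens that follow the "\0B" binary marker in a Kaldi archive entry.
MATRIX_TOKENS = {b"FM ", b"DM ", b"CM ", b"CM2", b"CM3"}
VECTOR_TOKENS = {b"FV ", b"DV "}

def parse_scp_line(line):
    """Split 'key path.ark:offset' into (key, path, offset)."""
    key, rxspecifier = line.strip().split(None, 1)
    path, _, offset = rxspecifier.rpartition(":")
    return key, path, int(offset)

def peek_token(ark_path, offset):
    """Return the 3-byte type token of the entry stored at `offset`."""
    with open(ark_path, "rb") as f:
        f.seek(offset)
        if f.read(2) != b"\x00B":
            raise ValueError("no Kaldi binary marker at this offset")
        return f.read(3)

def entry_kind(token):
    """Classify a token as 'matrix', 'vector', or 'unknown'."""
    if token in MATRIX_TOKENS:
        return "matrix"
    if token in VECTOR_TOKENS:
        return "vector"
    return "unknown"
```

A `matrix` result means a vector reader such as read_vec_flt_scp will reject the entry, which matches the UnknownVectorHeader error reported here.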

LyWangPX commented 1 year ago

Although read_mat_scp gets past the 'CM' error, the script should not need read_mat_scp at all. The 'CM' header indicates that the Kaldi recipe did not finish correctly. If the recipe stopped before stage 12, the features read here are MFCC rather than GOP features, and MFCC features are 2D arrays stored with the 'CM' header. To confirm, check whether the feature's shape[1] is 40, which indicates MFCC. So if you hit this error again, the Kaldi recipe did not complete correctly.
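The shape check can be done without decompressing anything. The sketch below assumes the on-disk layout that kaldi_io parses for a 'CM ' entry: 16 bytes after the token holding two little-endian float32 values (minimum and range) and two int32 values (rows and columns). The function names are hypothetical, and the default of 40 comes from `--num-ceps=40` in the config quoted earlier.

```python
import struct

def compressed_matrix_shape(header_bytes):
    """Read (rows, cols) from the 16 bytes that follow a 'CM ' token.

    Assumed layout: float32 min, float32 range, int32 rows, int32 cols.
    """
    _min_value, _value_range, rows, cols = struct.unpack("<ffii", header_bytes[:16])
    return rows, cols

def looks_like_mfcc(shape, num_ceps=40):
    """True for a 2D shape whose second dimension equals num_ceps."""
    return len(shape) == 2 and shape[1] == num_ceps
```

`looks_like_mfcc` also accepts the `.shape` of an array returned by kaldi_io.read_mat_scp, so either route works for the check.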

I will open another issue explaining how to finish the Kaldi recipe.