hyli666 / DNN-SpeechEnhancement


How to get the mvn_store.mat? And what's its meaning? #4

Closed ivyyyyyyyy closed 6 years ago

ivyyyyyyyy commented 6 years ago

Dr. Li: Hi, I've been studying your code recently and still have a few questions. I'd appreciate your advice!

  1. How do I get mvn_store.mat, and what is it for?
  2. How do I convert a *.wav file to a matrix?
  3. In DNN_sgd_lps.py, line 221, where does "_nat" come from? It raises an error when I try to run the code.

Hope for your reply.

ivyyyyyyyy commented 6 years ago

One more question: why must Length(noise) >= Length(speech), and how do I make sure that it holds?

hyli666 commented 6 years ago

@ivyyyyyyyy Hi! Sorry for the late reply. Also, I'm not a Dr., by the way~

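  1. I have uploaded mvn_store.mat to my repository. It contains the mean and variance of each dimension of the input vector, which are used to normalize the input to the neural network.
  2. A wav file, once read, is a time series; applying a short-time Fourier transform to it gives a matrix.
  3. _nat is the noise-estimation frame; it is saved in data when the training features are generated.
  4. Repeat the noise and concatenate the copies so that it is guaranteed to be longer than 10 s; every sentence in the speech data I use is shorter than 10 s.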
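Points 1 and 2 above can be illustrated with a minimal NumPy sketch. It is not the repository's code: the frame length, hop size, Hamming window, and log-power feature are assumptions chosen for illustration, the signal is synthetic rather than read from a wav file, and the mean and variance are computed on the spot instead of being loaded from mvn_store.mat (whose variable names are not given in this thread).

```python
import numpy as np

def log_power_spectrum(signal, frame_len=512, hop=256):
    """Short-time Fourier transform of a 1-D signal.

    Returns a matrix of shape (n_frames, frame_len // 2 + 1) holding
    the log-power spectrum of each windowed frame.
    frame_len and hop are illustrative values, not the repository's.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.fft.rfft(frames, axis=1)
    # Small constant avoids log(0) on silent bins.
    return np.log(np.abs(spectrum) ** 2 + 1e-12)

def mean_variance_normalize(features, mean, var, eps=1e-8):
    """Normalize each feature dimension to zero mean and unit variance."""
    return (features - mean) / np.sqrt(var + eps)

# Synthetic stand-in for 1 s of 16 kHz audio (assumption: no real file is read).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

lps = log_power_spectrum(signal)       # time series -> matrix
mean = lps.mean(axis=0)                # per-dimension mean
var = lps.var(axis=0)                  # per-dimension variance
normalized = mean_variance_normalize(lps, mean, var)
```

In practice the statistics would be computed once over the whole training set and reused at test time, which is presumably why they are stored in a separate file.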

For the details, I suggest referring to the paper listed in the README!
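The noise-repetition idea in point 4 might look roughly like the sketch below. The 16 kHz sample rate, the helper name `tile_noise`, and the choice to take the leading segment of the tiled noise are assumptions for illustration; the thread only states that the noise is repeated until it exceeds 10 s because every utterance is shorter than that.

```python
import numpy as np

def tile_noise(noise, min_len):
    """Repeat a noise clip end to end until it has at least min_len samples."""
    repeats = -(-min_len // len(noise))  # ceiling division
    return np.tile(noise, repeats)

sample_rate = 16000                      # assumed sample rate
rng = np.random.default_rng(0)
noise = rng.standard_normal(3 * sample_rate)    # a 3 s noise clip (synthetic)
speech = rng.standard_normal(7 * sample_rate)   # a 7 s utterance (synthetic)

# Guarantee at least 10 s of noise, longer than any utterance in the data set.
long_noise = tile_noise(noise, 10 * sample_rate)

# With Length(noise) >= Length(speech), a segment of matching length always exists.
noise_segment = long_noise[: len(speech)]
```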

ivyyyyyyyy commented 6 years ago

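@hyli666 Thank you for your reply! I have now trained with your code and generated the mat file that stores w and b. However, I don't know how to use the generated w and b to do the enhancement. Could you give me some pointers? Sorry, I'm new to this area and don't understand it very well.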

hyli666 commented 6 years ago

@ivyyyyyyyy

Enhancement is just the forward pass of the neural network. See the paper for the details.
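As a rough illustration of what a forward pass with stored w and b involves, here is a minimal sketch. The layer sizes, sigmoid hidden units, and linear output layer are assumptions, and the weights are random placeholders; the real network shape and the variable names inside the trained mat file are not given in this thread.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, biases):
    """Feed-forward pass: sigmoid hidden layers, linear output layer (assumed)."""
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = sigmoid(h @ w + b)
    return h @ weights[-1] + biases[-1]

# Hypothetical layer sizes; random values stand in for the trained w and b.
rng = np.random.default_rng(0)
sizes = [257, 64, 64, 257]
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

# One row per frame of normalized noisy features (synthetic here).
noisy_features = rng.standard_normal((61, 257))
enhanced_features = forward(noisy_features, weights, biases)
```

Turning the network output back into a waveform is a separate step that is not shown here.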