jzi040941 / PercepNet

Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
BSD 3-Clause "New" or "Revised" License

about pitch coherence #33

Open zoulingqi opened 2 years ago

zoulingqi commented 2 years ago

Hello. In your code, Exp[i] = Exp[i]/sqrt(1e-15+Ex[i]*Ep[i]), while bandE[i] = sqrt(sum[i]) in the function compute_band_corr.

I think the code should be: if (Ex[i]*Ep[i] == 0) Exp[i] = 0; (or 1?) else Exp[i] = Exp[i]/(1e-15+Ex[i]*Ep[i]). With this change, the pitch coherence Exp[i] is 1 only when the signal X and its periodic component P are exactly the same.

jzi040941 commented 2 years ago

I've checked the code: the function compute_band_corr doesn't apply sqrt. If I changed it to bandE[i] = sqrt(sum[i]), then I think the equation Exp[i] = Exp[i]/sqrt(1e-15+Ex[i]*Ep[i]) should keep the sqrt; otherwise your version, Exp[i] = Exp[i]/(1e-15+Ex[i]*Ep[i]) (without sqrt), is right.

I'm not sure which value Exp[i] should take, 0 or 1, in the exceptional case where Ex[i] or Ep[i] is 0. How about changing 1e-15 to a much smaller epsilon such as 1e-37? I think it wouldn't distort the Exp value much, and it would still prevent a division-by-zero error.
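One observation that may help with the zero case: by the Cauchy-Schwarz inequality, |<x, p>| <= ||x|| * ||p||, so if Exp, Ex, and Ep are computed over the same band samples, the numerator Exp[i] is 0 whenever Ex[i] or Ep[i] is exactly 0, and dividing by a nonzero epsilon then yields 0 without a special branch. A minimal sketch, where `safe_ratio` and `eps_is_normal` are hypothetical helpers (not functions from this repository) and the denominator is written without sqrt purely for illustration:

```c
#include <float.h>

/* Hypothetical helper for illustration; not part of the PercepNet code.
 * num   : band correlation <x, p>
 * ex, ep: band norms ||x|| and ||p|| (assumed non-negative)
 * eps   : small positive constant guarding the division */
static float safe_ratio(float num, float ex, float ep, float eps)
{
    return num / (eps + ex * ep);
}

/* Returns 1 if eps is at least FLT_MIN, i.e. a normal (non-denormal) float. */
static int eps_is_normal(float eps)
{
    return eps >= FLT_MIN;
}
```

With eps = 1e-37f, which is still above FLT_MIN for single precision, safe_ratio(0, 0, ep, eps) evaluates to 0 for any finite ep, so the zero-energy case falls out as 0 rather than 1.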

Waiting for your suggestion. Thanks.

zoulingqi commented 2 years ago

Sorry, I confused the two functions. bandE[i] = sqrt(sum[i]) is in the function compute_band_energy. Ex[i] and Ep[i] are already ||x|| and ||p||, so Exp[i] should equal Exp[i]/(1e-37+Ex[i]*Ep[i]) instead of Exp[i]/sqrt(1e-37+Ex[i]*Ep[i]).

    void compute_band_energy(float *bandE, const kiss_fft_cpx *X) {
      ......
      ......
      for (i=0;i<NB_BANDS;i++)
      {
        bandE[i] = sqrt(sum[i]);
      }
    }