Open ZihaoCui opened 1 year ago
Or just comment out lines 166 and 167.
I came to the same conclusion. The way the SNR is computed results in an effective SNR of snr + 20*log10(rmsnoise/rmsclean) when using
noisescalar = rmsclean / (10**(snr/20)) / (rmsnoise+EPS).
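For reference, here is a short derivation of that offset. It is a sketch under two assumptions that the thread does not spell out: $r_c$ and $r_n$ stand for rmsclean and rmsnoise as measured before normalization, and both signals are then normalized to the same RMS level $T$ (a symbol introduced here for illustration). EPS is ignored.

```latex
\begin{align*}
a &= \frac{r_c}{10^{\mathrm{snr}/20}\, r_n}
  && \text{(noisescalar)} \\
\mathrm{SNR}_{\text{eff}}
  &= 20\log_{10}\frac{T}{a\,T}
   = -20\log_{10} a \\
  &= \mathrm{snr} + 20\log_{10}\frac{r_n}{r_c}
\end{align*}
```

With $a = 1/10^{\mathrm{snr}/20}$ instead, the second term vanishes and the effective SNR equals the requested snr.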
The segmental_snr_mixer function in audiolib.py is wrong. Because normalization has already been applied to clean and noise, the noise scalar should be noisescalar = 1 / (10**(snr/20)), not noisescalar = rmsclean / (10**(snr/20)) / (rmsnoise+EPS).
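To illustrate the difference between the two scalars, here is a minimal, self-contained sketch. It is not the audiolib.py code: `rms`, `normalize_rms`, and `measured_snr` are hypothetical helpers written for this example, plain RMS stands in for `active_rms`, and the -25 dB target level and the Gaussian test signals are assumptions.

```python
import numpy as np

EPS = np.finfo(float).eps


def rms(x):
    """Plain RMS over the whole signal (stand-in for active_rms)."""
    return np.sqrt(np.mean(x ** 2))


def normalize_rms(x, target_level=-25):
    """Scale x so that its RMS equals target_level in dB."""
    return x * (10 ** (target_level / 20) / (rms(x) + EPS))


def measured_snr(clean, noise, snr, use_original_scalar):
    """Normalize both signals, scale the noise, and return the resulting SNR in dB."""
    rmsclean, rmsnoise = rms(clean), rms(noise)  # measured before normalization
    clean_n = normalize_rms(clean)
    noise_n = normalize_rms(noise)
    if use_original_scalar:
        # Scalar as quoted in this thread
        noisescalar = rmsclean / (10 ** (snr / 20)) / (rmsnoise + EPS)
    else:
        # Scalar proposed in this thread
        noisescalar = 1 / (10 ** (snr / 20))
    return 20 * np.log10(rms(clean_n) / rms(noise_n * noisescalar))


rng = np.random.default_rng(0)
clean = rng.normal(0.0, 0.1, 16000)  # arbitrary test signals with different levels
noise = rng.normal(0.0, 0.4, 16000)

snr_original = measured_snr(clean, noise, 5, use_original_scalar=True)
snr_proposed = measured_snr(clean, noise, 5, use_original_scalar=False)
offset = 20 * np.log10(rms(noise) / rms(clean))

print("original scalar:", snr_original, "expected offset:", offset)
print("proposed scalar:", snr_proposed)
```

In this sketch the proposed scalar gives the requested 5 dB, while the original scalar lands at 5 dB plus the level difference between the unnormalized noise and clean signals.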
Based on the test file

    import numpy as np
    import audiolib

    params = {'cfg': 1, 'target_level_lower': -35, 'target_level_upper': -25}
    temp = [0.1, 0.2, 0.3, 0.4, 0.5]
    for abs_a in range(len(temp)):
        for abs_b in range(len(temp)):
            base_a = np.random.binomial(1, temp[abs_a], size=(5000, 1))
            base_b = np.random.binomial(1, temp[abs_b], size=(5000, 1))
            print(temp[abs_a], temp[abs_b], audiolib.segmental_snr_mixer(params, base_a, base_b, snr=5))
and add the following code after line 169 in audiolib.py

    rmsclean1, rmsnoise2 = active_rms(clean=clean, noise=noisenewlevel)
    snr_test = 20*np.log10(rmsclean1/rmsnoise2)
    return snr_test, rmsclean, rmsnoise
we have (each row: temp[abs_a], temp[abs_b], then (snr_test, rmsclean, rmsnoise)):

    0.1 0.1 (4.809261190526833, 0.3199999999999999, 0.31304951684997046)
    0.1 0.2 (8.398437767900433, 0.3085449724108302, 0.45628938186199325)
    0.1 0.3 (9.541254719329668, 0.32557641192199405, 0.5491812087098392)
    0.1 0.4 (11.254207673313344, 0.30822070014844877, 0.6332456079595024)
    0.1 0.5 (11.975890986196333, 0.3174901573277508, 0.7088018058667739)
    0.2 0.1 (1.8278324324274429, 0.4436214602563766, 0.307896086366813)
    0.2 0.2 (5.030355204691368, 0.44676615807377346, 0.44833023542919775)
    0.2 0.3 (6.760912590556815, 0.4498888751680796, 0.5509990925582363)
    0.2 0.4 (8.027724031442382, 0.4460941604639091, 0.6321392251711642)
    0.2 0.5 (8.922584416928597, 0.45409250158970904, 0.7133021800050802)
    0.3 0.1 (0.009673346077310881, 0.5594640292279744, 0.3149603149604724)
    0.3 0.2 (3.2262700519281093, 0.5526300751859239, 0.4505552130427523)
    0.3 0.3 (5.140245659899135, 0.5464430436925699, 0.5553377350765928)
    0.3 0.4 (6.272472841406196, 0.5479051012721089, 0.6343500610861481)
    0.3 0.5 (7.247686760565433, 0.5499090833947007, 0.7123201527403249)
    0.4 0.1 (-1.350831784195107, 0.6315061361538776, 0.3039736830714132)
    0.4 0.2 (2.0700880682248917, 0.6378087487640788, 0.4551922670696416)
    0.4 0.3 (3.875478643051899, 0.6288083968904994, 0.5524490926773252)
    0.4 0.4 (4.854129294663155, 0.6463745044476923, 0.6356099432828279)
    0.4 0.5 (5.987176033927486, 0.634665266104897, 0.711055553385247)
    0.5 0.1 (-1.9690929118423788, 0.7103520254071215, 0.31843366656181304)
    0.5 0.2 (1.0023430870343295, 0.707530918052349, 0.44654227123532203)
    0.5 0.3 (2.6995947405835983, 0.7142828571371427, 0.5480875842417887)
    0.5 0.4 (4.139592830567267, 0.7037044834303671, 0.6373382147651275)
    0.5 0.5 (4.885728516741401, 0.707530918052349, 0.6982836100038435)
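As a quick cross-check, the measured values can be compared against the offset formula from earlier in this thread, snr + 20*log10(rmsnoise/rmsclean), with snr=5. The sketch below uses five rows copied from the output above; the variable names are chosen for this example only.

```python
import math

# (snr_test, rmsclean, rmsnoise) copied from five rows of the output above
rows = [
    (4.809261190526833, 0.3199999999999999, 0.31304951684997046),   # 0.1 0.1
    (8.398437767900433, 0.3085449724108302, 0.45628938186199325),   # 0.1 0.2
    (0.009673346077310881, 0.5594640292279744, 0.3149603149604724), # 0.3 0.1
    (-1.9690929118423788, 0.7103520254071215, 0.31843366656181304), # 0.5 0.1
    (4.885728516741401, 0.707530918052349, 0.6982836100038435),     # 0.5 0.5
]
requested_snr = 5  # the snr argument used in the test file

deviations = []
for snr_test, rmsclean, rmsnoise in rows:
    # SNR predicted by the offset formula for the original noisescalar
    predicted = requested_snr + 20 * math.log10(rmsnoise / rmsclean)
    deviations.append(abs(predicted - snr_test))
    print(f"measured {snr_test:8.4f}  predicted {predicted:8.4f}")

max_deviation = max(deviations)
print("largest deviation:", max_deviation)
```

The measured SNR only comes out close to the requested 5 dB when rmsclean and rmsnoise happen to be nearly equal, which is what the formula predicts.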